base02_Transformer

Encoder

Positional Encoding

[Figure: plot of the position function]

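The word embedding of each token is added, element by element, to the value of the position function at that token's position. The sum is the position-encoded input to the Encoder.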
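The addition described above can be sketched in a few lines. The post does not spell out the position function, so this sketch assumes the sine/cosine form of the standard Transformer; the names `sinusoidal_position` and `add_position` and the toy embedding values are hypothetical, chosen only for illustration.

```python
import math

def sinusoidal_position(pos, d_model):
    """Position vector for index `pos` (assumed sin/cos form, not from the post)."""
    vec = []
    for i in range(d_model):
        # Dimensions 2k and 2k+1 share one frequency: 1 / 10000^(2k / d_model).
        angle = pos / (10000 ** ((i - i % 2) / d_model))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec

def add_position(embeddings):
    """Add the position vector to each token embedding, element-wise."""
    d_model = len(embeddings[0])
    return [
        [e + p for e, p in zip(emb, sinusoidal_position(pos, d_model))]
        for pos, emb in enumerate(embeddings)
    ]

# Toy word embeddings: 3 tokens, d_model = 4 (made-up values).
embeddings = [[0.1, 0.2, 0.3, 0.4],
              [0.5, 0.6, 0.7, 0.8],
              [0.9, 1.0, 1.1, 1.2]]
encoded = add_position(embeddings)
```

The result has the same shape as the embeddings, so the rest of the Encoder does not need to know that position information was mixed in.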


Self-Attention

  • When computing Q, K, and V, the same weight matrices are applied to every token
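A minimal single-head sketch of this weight sharing, assuming the usual scaled dot-product form softmax(QK^T / sqrt(d_k))V. The helper names and the toy matrices are hypothetical; a real model would use a tensor library and learned weights.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    # One W_q, one W_k, one W_v: each is applied to every row (token) of x.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

# Toy input: 3 tokens with 2 features each (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_q = [[1.0, 0.0], [0.0, 1.0]]
w_k = [[1.0, 0.0], [0.0, 1.0]]
w_v = [[0.5, 0.0], [0.0, 2.0]]
out, weights = self_attention(x, w_q, w_k, w_v)
```

Because the projection matrices do not depend on the token, Q, K, and V for the whole sequence come from three matrix products, and two identical input tokens always receive identical outputs.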

Residual Connection

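  • Makes the deep stack quick to train: the skip path gives gradients a direct route through each layer, and the element-wise addition does not interfere with parallel computation across tokens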
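As a rough sketch, a residual connection adds a sub-layer's output back onto its input. The functions `halve` and `zero` below are hypothetical stand-ins for an attention or feed-forward sub-layer, and the layer normalization that the standard Transformer applies after the addition is left out to keep the example short.

```python
def residual(x, sublayer):
    """Return x + sublayer(x), element-wise, for one token vector."""
    return [xi + yi for xi, yi in zip(x, sublayer(x))]

def halve(x):
    # Hypothetical stand-in for a real sub-layer.
    return [0.5 * v for v in x]

def zero(x):
    # A sub-layer that contributes nothing.
    return [0.0 for _ in x]

x = [1.0, -2.0, 4.0]
y = residual(x, halve)         # input plus the sub-layer's contribution
identity = residual(x, zero)   # the input passes through unchanged
```

The second call shows why the skip path helps: even when a sub-layer contributes nothing, the input still reaches the next layer intact.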

Decoder

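  • The self-attention computation has the same form as in the Encoder (in the standard Transformer, with future positions masked);
  • The Q, K, and V projections use their own weights, separate from the Encoder's.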
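The Decoder reuses the Encoder's attention computation with its own weights. One detail from the standard Transformer, not stated in this post, is that the Decoder's self-attention is causally masked so that a position cannot look at later positions. The sketch below shows only that masking step under this assumption; `causal_softmax` and the score values are hypothetical.

```python
import math

def causal_softmax(scores):
    """Row-wise softmax in which position i only sees positions 0..i."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]               # drop scores for future positions
        m = max(visible)
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        # Future positions get weight 0.
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

# Toy attention scores for 3 decoder positions (made-up values).
scores = [[0.0, 1.0, 2.0],
          [0.0, 0.0, 5.0],
          [1.0, 1.0, 1.0]]
weights = causal_softmax(scores)
```

Everything else in the attention step (projecting to Q, K, V and taking the weighted sum of V) would proceed as in the Encoder, using the Decoder's own weight matrices.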

This post is licensed under CC BY 4.0 by the author.