Transformers: A Simple Explanation
By François St-Amant | December 2020

Why and what?

Attention (Self-Attention, really)

Source: http://jalammar.github.io/illustrated-transformer/
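The figure linked above (from Jay Alammar's illustrated guide) shows the key idea of self-attention: every token is projected into a query, a key and a value vector, and all three come from the same input sequence. Here is a minimal NumPy sketch of those projections, with toy dimensions and random weights standing in for the learned matrices (all names and sizes are illustrative, not taken from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 4 tokens, each represented by an 8-dimensional embedding.
seq_len, d_model, d_k = 4, 8, 8
x = rng.normal(size=(seq_len, d_model))   # input embeddings

# Self-attention: queries, keys and values are all linear projections
# of the SAME sequence (hence "self"). The weights are learned in practice.
W_q = rng.normal(size=(d_model, d_k))
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

Q = x @ W_q   # what each token is looking for
K = x @ W_k   # what each token can be matched against
V = x @ W_v   # the information each token actually carries
```

The next section shows how Q, K and V are combined into an output.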

Scaled Dot-Product Attention

Source: https://arxiv.org/pdf/1706.03762.pdf
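In the cited paper (Vaswani et al., 2017), scaled dot-product attention is defined as Attention(Q, K, V) = softmax(Q·Kᵀ / √d_k)·V: each query is compared with every key, the dot products are scaled by √d_k to keep the softmax well-behaved, and the resulting weights form a weighted average of the values. Below is a minimal, unmasked NumPy version of that formula (a sketch, not the article's code), continuing the Q, K, V from the previous snippet:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, without masking."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query/key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of the values

output = scaled_dot_product_attention(Q, K, V)      # shape: (seq_len, d_k)
```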

Multi-Head Attention

Source: https://arxiv.org/pdf/1706.03762.pdf
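Rather than attending once, the paper runs several attention "heads" in parallel: Q, K and V are projected into smaller subspaces, scaled dot-product attention is applied in each head, and the head outputs are concatenated and projected back to the model dimension. A rough NumPy sketch under those assumptions, reusing the scaled_dot_product_attention function from the previous snippet (dimensions and initialisation are illustrative only):

```python
import numpy as np

def multi_head_attention(x, num_heads, rng=None):
    """Toy multi-head self-attention: project per head, attend, concatenate, project back."""
    if rng is None:
        rng = np.random.default_rng(0)
    seq_len, d_model = x.shape
    d_head = d_model // num_heads              # each head works in a smaller subspace

    heads = []
    for _ in range(num_heads):
        # Per-head projections (random here; learned parameters in a real model).
        W_q = rng.normal(size=(d_model, d_head))
        W_k = rng.normal(size=(d_model, d_head))
        W_v = rng.normal(size=(d_model, d_head))
        heads.append(scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v))

    concat = np.concatenate(heads, axis=-1)    # back to (seq_len, d_model)
    W_o = rng.normal(size=(d_model, d_model))  # final output projection
    return concat @ W_o
```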

Architecture of the Transformer

Source: https://arxiv.org/pdf/1706.03762.pdf
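The architecture figure from the paper (linked above) shows an encoder stack and a decoder stack. Each encoder layer combines multi-head self-attention with a position-wise feed-forward network, and each sub-layer is wrapped in a residual connection followed by layer normalisation; the decoder additionally uses masked self-attention and attention over the encoder output. As a rough sketch of a single encoder layer under those assumptions (again NumPy, illustrative dimensions, reusing multi_head_attention from above):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalise each token's features to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def feed_forward(x, d_ff=32, rng=None):
    """Position-wise feed-forward network: two linear maps with a ReLU in between."""
    if rng is None:
        rng = np.random.default_rng(1)
    d_model = x.shape[-1]
    W1 = rng.normal(size=(d_model, d_ff))
    W2 = rng.normal(size=(d_ff, d_model))
    return np.maximum(0, x @ W1) @ W2

def encoder_layer(x, num_heads=2):
    """One encoder layer: each sub-layer is wrapped in a residual connection + layer norm."""
    x = layer_norm(x + multi_head_attention(x, num_heads))  # self-attention sub-layer
    x = layer_norm(x + feed_forward(x))                     # feed-forward sub-layer
    return x
```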

Positional encoding
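Because attention treats its input as an unordered set, the paper adds positional encodings to the token embeddings so the model can use word order. The sinusoidal version from the paper is PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). A small NumPy sketch of that table (dimensions are illustrative; d_model is assumed even):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings: sine on even dimensions, cosine on odd ones."""
    positions = np.arange(seq_len)[:, None]         # (seq_len, 1)
    even_dims = np.arange(0, d_model, 2)[None, :]   # the 2i indices
    angles = positions / np.power(10000.0, even_dims / d_model)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions get sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions get cosine
    return pe

# The encodings are simply added to the token embeddings before the first layer:
# x = embeddings + positional_encoding(seq_len, d_model)
```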
