Positional information
The idea: Inject position (sinusoidal / RoPE intuition).
What you'll be able to do: You can explain why models add positional information so word order matters.
The problem it solves: Attention is order-blind: 'cat drinks milk' = 'milk drinks cat'.
Builds on: Attention scores, softmax & the weighted sum
← Context window & KV cache · Next: Scaling laws →
All lessons