See How AI Works
Lesson 3.1

Why context changes everything

Tokens must look at other tokens.

1. A bigram sees only “drinks”. Which of these could fill the blank?

the cat drinks ___

drinks → ? (all a bigram sees)

water · coffee · fast · milk

The catch: milk is right, but only if you know a cat is drinking, and that word sits two words back, out of a bigram's reach.
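The blind spot can be reproduced in a few lines of Python. This is a minimal sketch, not how any real model is built: the four-sentence corpus is made up for the illustration, and the names `bigram_next` and `context_next` are hypothetical helpers. The second lookup simply widens the window by one word to show that the missing information sits two words back. It is not attention, only a demonstration of why looking further back matters.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, made up for this illustration.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the barista drinks coffee",
    "the runner drinks fast",
]

# previous word -> counts of the word that followed it
bigram_counts = defaultdict(Counter)
# (word two back, previous word) -> counts of the word that followed
context_counts = defaultdict(Counter)

for sentence in CORPUS:
    words = sentence.split()
    for i in range(1, len(words)):
        bigram_counts[words[i - 1]][words[i]] += 1
    for i in range(2, len(words)):
        context_counts[(words[i - 2], words[i - 1])][words[i]] += 1


def bigram_next(prev):
    """Candidates a bigram can propose: it conditions on one word only."""
    return sorted(bigram_counts.get(prev, {}))


def context_next(two_back, prev):
    """Candidates when the word two positions back is also visible."""
    return sorted(context_counts.get((two_back, prev), {}))


print(bigram_next("drinks"))          # ['coffee', 'fast', 'milk', 'water']
print(context_next("cat", "drinks"))  # ['milk']
```

With only “drinks” in view, all four options look equally plausible. Once “cat” is visible as well, only “milk” remains.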


How does a token decide what to look at?

3.2 Query, Key, Value

Common questions

What is "Why context changes everything" about?
Tokens must look at other tokens.
What problem does it solve?
To finish 'The cat drinks ___', a token must look back at 'cat'. Bigrams see only the previous word.
What will I be able to do after this lesson?
You will be able to explain why a model must let words look at other words, which is the idea behind attention.
What comes next?
How does a token decide what to look at? That is the subject of lesson 3.2, Query, Key, Value.