Question 1

What is "Context window & KV cache" about?

Accepted Answer

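A model can attend to only a fixed number of tokens at once (its context window). The KV cache stores the keys and values already computed for earlier tokens, so each new token reuses them instead of recomputing them.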
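
The reuse described above can be sketched in a few lines. This is a minimal single-head illustration, not how any particular library implements it: `KVCache`, `project`, and the toy 2-dimensional "tokens" are hypothetical names and values chosen for the example, and `project` stands in for the learned key/value projections of a real model.

```python
import math

def attend(q, keys, values):
    """Softmax attention of one query vector over a list of keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]

def project(x, factor):
    # Hypothetical stand-in for a learned projection matrix.
    return [factor * xi for xi in x]

class KVCache:
    """Keeps the keys and values of every token seen so far."""
    def __init__(self, max_len):
        self.max_len = max_len  # plays the role of the context window
        self.keys = []
        self.values = []

    def append(self, k, v):
        if len(self.keys) >= self.max_len:
            raise ValueError("context window is full")
        self.keys.append(k)
        self.values.append(v)

def decode_with_cache(tokens, max_len):
    # Each step computes K and V for the new token only.
    cache = KVCache(max_len)
    outputs = []
    for x in tokens:
        cache.append(project(x, 0.5), project(x, 2.0))
        outputs.append(attend(x, cache.keys, cache.values))
    return outputs

def decode_without_cache(tokens):
    # Each step recomputes K and V for the whole prefix.
    outputs = []
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(x, 0.5) for x in prefix]
        values = [project(x, 2.0) for x in prefix]
        outputs.append(attend(tokens[t], keys, values))
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cached = decode_with_cache(tokens, max_len=4)
uncached = decode_without_cache(tokens)
```

Both functions produce the same outputs; the cached version simply avoids redoing the projections for earlier tokens. Calling `decode_with_cache(tokens, max_len=2)` raises `ValueError`, which mimics running out of context window.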

Question 2

What problem does it solve?

Accepted Answer

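Long chats get slow, and the model seems to forget. Without a cache, every generated token would recompute keys and values for the whole conversation so far. And because the context window is fixed, tokens that no longer fit are dropped, so the model loses earlier parts of the chat.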

Question 3

What will I be able to do after this lesson?

Accepted Answer

You will be able to explain what the KV cache stores and why long context is expensive in both memory and compute.
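
To make the cost concrete, here is a rough back-of-the-envelope sketch. The function names and the model shape (32 layers, 32 key/value heads, head dimension 128, 2 bytes per value) are assumptions picked for illustration, not figures from this lesson. The cache holds one key and one value vector per token, per head, per layer, so its size grows linearly with context length, while the total attention work during generation grows roughly quadratically.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch_size=1):
    """Approximate KV cache size: keys + values for every layer, head, token."""
    keys_and_values = 2
    return (keys_and_values * n_layers * n_kv_heads * head_dim
            * seq_len * bytes_per_value * batch_size)

def attention_pairs(seq_len):
    """Query-key pairs scored while generating seq_len tokens with a cache.

    Token t attends to itself and the t - 1 tokens before it.
    """
    return seq_len * (seq_len + 1) // 2

# Hypothetical model shape, 16-bit values.
per_token = kv_cache_bytes(32, 32, 128, seq_len=1)
full_window = kv_cache_bytes(32, 32, 128, seq_len=4096)

print(per_token / 1024**2, "MiB per token")           # 0.5 MiB per token
print(full_window / 1024**3, "GiB at 4096 tokens")    # 2.0 GiB at 4096 tokens
print(attention_pairs(4096), "query-key pairs per head per layer")
```

Under these assumptions, doubling the context doubles the cache memory and roughly quadruples the number of query-key scores computed over the whole generation.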

Question 4

What comes next?

Accepted Answer

Attention by itself is order-blind, so how does the model know word order?