Question 1

What is "A real embedding has 768 numbers" about?

Accepted Answer

A real embedding is a long list of numbers (768 in GPT-2 small), not two; similarity is the same dot product, now hundreds of cells wide.
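To make "the same dot product, just wider" concrete, here is a minimal sketch in plain Python. The vectors are random stand-ins, not real GPT-2 embeddings, and the names `cat`, `kitten`, and `car` are hypothetical labels chosen only for illustration. The only figure taken from the lesson is the width, 768.

```python
import random

DIM = 768  # width of a GPT-2 small embedding, as stated in the answer above


def dot(a, b):
    """Dot product: multiply matching cells, then add up the results."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


# The 2-D case: two cells per word.
print(dot([1.0, 2.0], [3.0, 4.0]))  # 1*3 + 2*4 = 11.0

# The 768-D case: same function, just more cells.
# These vectors are made up (random), NOT taken from any real model.
rng = random.Random(0)
cat = [rng.gauss(0, 1) for _ in range(DIM)]
kitten = [c + rng.gauss(0, 0.1) for c in cat]  # a slightly nudged copy of `cat`
car = [rng.gauss(0, 1) for _ in range(DIM)]    # an unrelated random vector

# The nudged copy scores far higher against `cat` than the unrelated vector.
print(dot(cat, kitten) > dot(cat, car))
```

Nothing in `dot` changes between the two cases; only the length of the lists does.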

Question 2

What problem does it solve?

Accepted Answer

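A flat 2-D map is too cramped. With only two numbers per word, a word can sit close to others in only a couple of ways at once. Hundreds of dimensions give each word room to be near many different neighbours for different reasons.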

Question 3

What will I be able to do after this lesson?

Accepted Answer

You will be able to explain that an embedding is a long list of numbers and that similarity is the same dot product, scaled up to hundreds of dimensions.

Question 4

What comes next?

Accepted Answer

A word is a long list of numbers. But how does raw text become the words we embed in the first place?

A real embedding has 768 numbers

Common questions