Question 1

What is "Quantization" about?

Accepted Answer

Quantization stores a model's weights at lower precision, that is, with fewer bits per weight. The trade is a small accuracy cost for a large memory saving.
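
To make the idea concrete, here is a minimal sketch in Python of one simple scheme: symmetric 8-bit quantization with a single shared scale. The function names (quantize_int8, dequantize) and the sample weights are hypothetical, chosen only for illustration; the lesson does not say which scheme it has in mind, and real libraries typically use more elaborate variants.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (absmax)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


weights = [0.42, -1.27, 0.05, 0.9, -0.33]  # hypothetical example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case rounding error; it is bounded by half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is now stored as a small integer plus one shared scale, and the restored values differ from the originals by at most half a step (scale / 2). That rounding error is the "small accuracy cost" mentioned above.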

Question 2

What problem does it solve?

Accepted Answer

The model is too big to fit in memory or to serve.
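
A rough back-of-the-envelope calculation shows why bit width matters. The sketch below assumes a hypothetical 7-billion-parameter model and counts only the weights; activations and any per-block scale overhead are ignored, so treat the numbers as approximate lower bounds.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Memory for the weights alone, in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


num_params = 7_000_000_000  # hypothetical model size, chosen for illustration

for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(num_params, bits):.1f} GB")
```

Under these assumptions, going from 32-bit to 8-bit weights cuts the footprint from 28 GB to 7 GB, and 4-bit weights bring it to 3.5 GB.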

Question 3

What will I be able to do after this lesson?

Accepted Answer

You will be able to explain quantization: storing each weight in fewer bits, which gives a large memory saving at a small cost in quality.

Question 4

What comes next?

Accepted Answer

How far has this same machine been scaled?