Loss as a scoreboard
The idea: Loss measures the model's surprise at the correct answer. Lower is better, so it serves as the score of the training game.
What you'll be able to do: Explain that training loss measures the model's surprise at the truth.
The problem it solves: Is this prediction good or bad, and by how much?
Builds on: The bigram model
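To make "surprise" concrete, here is a minimal sketch. It assumes surprise is measured as the negative log of the probability the model gave to the true next character, which is the usual choice (cross-entropy) but is not spelled out in this summary. The function name `surprise` and both probability tables are made up for illustration; they are not taken from the bigram lesson.

```python
import math

def surprise(probs, truth):
    """Negative log probability the model assigned to the true next character."""
    return -math.log(probs[truth])

# Hypothetical predictions for the character that follows "q".
# Each table maps a candidate character to a probability, and sums to 1.
confident = {"u": 0.90, "a": 0.05, "e": 0.05}
unsure = {"u": 0.20, "a": 0.40, "e": 0.40}

# Suppose the true next character is "u".
loss_confident = surprise(confident, "u")
loss_unsure = surprise(unsure, "u")

print(f"confident model: {loss_confident:.3f}")  # 0.105
print(f"unsure model:    {loss_unsure:.3f}")     # 1.609
```

The model that put 90% on the right answer is barely surprised and scores close to zero. The model that put only 20% on it is more surprised and scores higher, which is worse. A model that gave the truth probability 1 would score exactly 0.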