Question 1

What is "From predictor to assistant" about?

Accepted Answer

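After pre-training comes post-training: supervised fine-tuning (SFT) on example answers, then reinforcement learning from human feedback (RLHF), where people rank the model's outputs by preference.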
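
As a rough illustration of how "ranking by human preference" can become a training signal, here is a minimal sketch of a pairwise preference loss of the kind commonly used when training a reward model. It assumes each answer has already been given a single numeric score. The function name and the example scores are hypothetical, and this lesson does not prescribe this exact formula.

```python
import math


def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Pairwise loss: -log(sigmoid(score_preferred - score_rejected)).

    The loss is small when the answer humans preferred gets the higher
    score, and large when the rejected answer is scored higher.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores for two answers to "Write a poem about cats".
agrees_with_humans = preference_loss(2.0, 0.0)     # the poem scores higher
disagrees_with_humans = preference_loss(0.0, 2.0)  # the prompt list scores higher
print(agrees_with_humans < disagrees_with_humans)
```

Minimizing this loss over many human-ranked pairs pushes the scores toward agreeing with the human rankings, and those scores can then guide further tuning of the model.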

Question 2

What problem does it solve?

Accepted Answer

Trained only to predict the next word, a raw model answers “Write a poem about cats” by listing more prompts, not a poem. So why does a real chatbot actually help?

Question 3

What will I be able to do after this lesson?

Accepted Answer

You will be able to explain post-training (SFT + RLHF) and how it turns a raw next-word predictor into a helpful, instruction-following assistant.

Question 4

What comes next?

Accepted Answer

Helpful or not, the model still predicts each token by looking across the whole sentence. Let's open up how.

From predictor to assistant

Common questions