Question 1

What is "RLHF / post-training" about?

Accepted Answer

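It covers the three-stage pipeline: pretraining → supervised fine-tuning (SFT) → reinforcement learning from human feedback (RLHF). In the RLHF stage, humans rank model outputs, those rankings train a reward model, and the policy (the model being trained) is shifted toward outputs the reward model scores highly.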
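
To make "rank outputs to train a reward model" concrete, here is a minimal sketch of a pairwise preference loss. It assumes a Bradley-Terry style objective, which is a common choice but is not specified in this lesson. The function name `pairwise_preference_loss` and the scalar reward inputs are hypothetical, for illustration only.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Loss for one human comparison: -log(sigmoid(r_chosen - r_rejected)).

    Assumption: the reward model emits one scalar score per response, and a
    human marked one response as preferred ("chosen") over the other.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the reward model agrees with the human ranking
# and large when it disagrees.
agrees = pairwise_preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees = pairwise_preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agrees < disagrees)
```

Minimizing this loss over many ranked pairs pushes the reward model to score preferred responses higher than rejected ones, which is the signal later used to shift the policy.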

Question 2

What problem does it solve?

Accepted Answer

A raw pretrained model only predicts likely next tokens, so it isn't reliably helpful or aligned with what users actually want.

Question 3

What will I be able to do after this lesson?

Accepted Answer

You will be able to explain how RLHF works: human preference judgments train a reward model, and that reward model shapes the assistant's behavior.
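
As a toy illustration of how a reward model can shape the assistant, the sketch below reweights a reference policy's probabilities over a few candidate responses by their rewards. It assumes the KL-regularized form, where the new probability is proportional to the reference probability times exp(reward / beta). The lesson does not prescribe that objective, and `shift_policy`, `ref_probs`, `rewards`, and `beta` are hypothetical names.

```python
import math


def shift_policy(ref_probs, rewards, beta=1.0):
    """Return new probabilities proportional to ref_prob * exp(reward / beta).

    Assumptions: every reference probability is > 0, and beta > 0 controls
    how far the policy may move from the reference (a larger beta means a
    smaller shift).
    """
    # Work in log space and subtract the max for numerical stability.
    logits = [math.log(p) + r / beta for p, r in zip(ref_probs, rewards)]
    top = max(logits)
    weights = [math.exp(value - top) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Three candidate responses; the reward model prefers the last one.
ref_probs = [0.5, 0.3, 0.2]
rewards = [0.0, 1.0, 2.0]
new_probs = shift_policy(ref_probs, rewards, beta=1.0)
print(new_probs)
```

Probability mass moves toward the responses the reward model scores highly, while the reference probabilities keep the shifted policy anchored to the starting model.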