Question 1

What is "Interpretability" about?

Accepted Answer

Interpretability is about looking inside a model: exploring real attention patterns and feature activations on curated inputs.
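
To make "attention pattern" concrete, here is a minimal sketch in plain Python. It assumes standard scaled dot-product attention with a causal mask, which the lesson itself does not specify. The token list and the query/key vectors are hand-written, hypothetical values, not taken from any real model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pattern(queries, keys):
    """Return causal attention weights.

    Row i holds how much token i attends to tokens 0..i;
    positions after i are masked out and set to 0.0.
    """
    d = len(queries[0])
    pattern = []
    for i, q in enumerate(queries):
        # Scaled dot product between query i and every key up to position i.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Pad the masked (future) positions with zeros.
        pattern.append(weights + [0.0] * (len(keys) - i - 1))
    return pattern


# Hypothetical toy inputs; a real model would supply these vectors.
tokens = ["the", "cat", "sat"]
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]

pattern = attention_pattern(queries, keys)
for tok, row in zip(tokens, pattern):
    print(tok, [round(w, 2) for w in row])
```

Each printed row shows where one token "looks". Reading rows like these on carefully chosen inputs is one simple way to explore what a model is doing internally.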

Question 2

What problem does it solve?

Accepted Answer

It addresses the black-box problem: can we see what is happening inside a model as it processes an input?

Question 3

What will I be able to do after this lesson?

Accepted Answer

You will be able to explain the core idea of interpretability: features inside a model can track human concepts.