Matrix × vector as a neural layer
The idea: A neural layer is a matrix multiply; the matrix transforms the input vector by rotating, scaling, or shearing it, and can change its dimension.
What you'll be able to do: Explain how a neural layer uses a matrix to transform an input vector into a new vector.
The problem it solves: What is the 'feed-forward' block actually doing?
Builds on: Dot-product similarity
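A minimal sketch of the idea in NumPy, not a definitive implementation. The function name `linear_layer`, the weights `W`, and the input `x` are hypothetical values chosen for illustration. It shows a matrix turning a 3-dimensional vector into a 2-dimensional one, and that each output entry is just the dot product of one row of the matrix with the input.

```python
import numpy as np


def linear_layer(W, x, b=None):
    """Apply one layer: y = W @ x (+ b). W has shape (out_dim, in_dim)."""
    y = W @ x
    if b is not None:
        y = y + b
    return y


# Hypothetical 3-dimensional input vector.
x = np.array([1.0, 2.0, 3.0])

# Hypothetical weights mapping 3 input dimensions to 2 output dimensions.
W = np.array([
    [1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5],
])

y = linear_layer(W, x)

# The same result, computed as one dot product per row of W.
row_dots = np.array([np.dot(row, x) for row in W])

print(y)
print(row_dots)
```

With these values both computations give -2 and 3. A feed-forward block typically stacks two such layers with a nonlinearity between them; the sketch covers only the single matrix multiply.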