Boss: the ship gate
Each candidate change runs through a gate: an eval suite and a lethal-trifecta safety check. Then you call it: ship, fix, or block.
1. Read the eval panel and the safety check, then make the call.
Candidate change 1 of 3
Auto-summarize support threads
Condense a ticket into a 2-line recap for the agent's queue.
eval gate: 5/5 pass
- billing dispute thread: pass
- bug report with a stack trace: pass
- angry refund demand: pass
- feature request, emoji-heavy: pass
- multi-language thread: pass
Clears the eval bar (4 of 5 needed).
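The eval bar above is a simple threshold, and a minimal sketch of it may help. The names `EvalCase` and `clears_eval_bar` are hypothetical, chosen for illustration; the five cases and the 4-of-5 bar come from the panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    """One eval scenario and whether the candidate passed it (hypothetical type)."""
    name: str
    passed: bool


def clears_eval_bar(cases: list[EvalCase], required: int) -> bool:
    """Return True when at least `required` cases passed."""
    passed_count = sum(1 for case in cases if case.passed)
    return passed_count >= required


# The five cases shown in the panel, all passing.
cases = [
    EvalCase("billing dispute thread", True),
    EvalCase("bug report with a stack trace", True),
    EvalCase("angry refund demand", True),
    EvalCase("feature request, emoji-heavy", True),
    EvalCase("multi-language thread", True),
]

# The panel states that 4 of 5 passes are needed.
eval_ok = clears_eval_bar(cases, required=4)
```

A threshold below 100% tolerates one flaky or borderline case. Whether that is acceptable depends on which case fails, so a real gate might also mark some cases as must-pass.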
safety check: 2/3 legs live
- Touches private data: live
- Reads untrusted input: live
- Has an exfiltration path: absent
A leg is missing, so no complete exfiltration path.
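As a rough sketch, the safety check and the final call can be written as below. The `Trifecta` type and `ship_call` function are hypothetical names, and the mapping from gate results to ship/fix/block is an assumption made for illustration; the lesson does not spell out an exact rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trifecta:
    """The three legs of the lethal trifecta for one candidate (hypothetical type)."""
    touches_private_data: bool
    reads_untrusted_input: bool
    has_exfiltration_path: bool

    def live_legs(self) -> int:
        """Count how many of the three legs are live."""
        return sum([
            self.touches_private_data,
            self.reads_untrusted_input,
            self.has_exfiltration_path,
        ])

    def is_complete(self) -> bool:
        """All three legs live means a complete exfiltration path exists."""
        return self.live_legs() == 3


def ship_call(eval_ok: bool, trifecta: Trifecta) -> str:
    """Assumed decision rule, not taken from the lesson:
    block on a complete trifecta, fix on failed evals, otherwise ship."""
    if trifecta.is_complete():
        return "block"
    if not eval_ok:
        return "fix"
    return "ship"


# The summarizer candidate: private data and untrusted input, but no exfiltration path.
summarizer = Trifecta(
    touches_private_data=True,
    reads_untrusted_input=True,
    has_exfiltration_path=False,
)
decision = ship_call(eval_ok=True, trifecta=summarizer)
```

Under this assumed rule the safety check takes priority over the eval result: a candidate that passes every eval is still blocked if all three legs are live.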
You can ship responsibly. Now look under the hood, at the hardware all of this runs on.
Next: S.1 Why GPUs beat CPUs
Builds on: 7.3 Evals: proving it works; 7.4 LLM-as-a-judge; 7.7 The lethal trifecta; 7.8 Designing trustworthy AI features
Common questions
What problem does it solve?
You've watched AI features get traced, eval'd, judged, and attacked. Now: would you ship this one?
What will I be able to do after this lesson?
You can run an AI change through an eval gate and a lethal-trifecta safety check, then make a defensible ship/fix/block call.