Mechanistic Interpretability
Mechanistic Interpretability Hub
What does a model actually compute when it predicts the next token? This hub maps the answer through feature decomposition, activation patching, and circuit-level interventions on real models. Theory only counts when it runs.
Writing
Videos
Coming soon — check back after May 2026
Resources
Coming soon — check back after May 2026