Mechanistic Interpretability Hub

What does a model actually compute when it predicts the next token? This hub maps the answer, through feature decomposition, activation patching, and circuit-level interventions on real models. Theory only counts when it runs.

Coming soon — check back after May 2026

Coming soon — check back after May 2026