Differentiable Faithfulness Alignment for Cross-Model Circuit Transfer

Jan 1, 2026 · Shun Shao, Binxu Wang, Shay B Cohen, Yonatan Belinkov

Type: Journal article
Publication: under review at COLM 2026
Last updated on Jan 1, 2026

Interpretability · Mechanistic Interpretability · Science of AI