ICML 2024 Workshop MI Submissions
ReLU MLPs Can Compute Numerical Integration: Mechanistic Interpretation of a Non-linear Activation
Chun Hei Yip, Rajashree Agrawal, Jason Gross
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Localizing Auditory Concepts in CNNs
Pratyaksh Gautam, Makarand Tapaswi, Vinoo Alluri
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Interpreting Attention Layer Outputs with Sparse Autoencoders
Connor Kissane, Robert Krzyzanowski, Joseph Isaac Bloom, Arthur Conmy, Neel Nanda
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
Penzai + Treescope: A Toolkit for Interpreting, Visualizing, and Editing Models As Data
Daniel D. Johnson
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
How Do Transformers Fill in the Blanks? A Case Study on Matrix Completion
Pulkit Gopalani, Ekdeep Singh Lubana, Wei Hu
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Dissecting Query-Key Interaction in Vision Transformers
Xu Pan, Aaron Philip, Ziqian Xie, Odelia Schwartz
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
Compact Proofs of Model Performance via Mechanistic Interpretability
Jason Gross, Rajashree Agrawal, Thomas Kwa, Euan Ong, Chun Hei Yip, Alex Gibson, Soufiane Noubir, Lawrence Chan
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
Modularity in Biologically Inspired Representations Depends on Task Variable Range Independence
Will Dorrell, Kyle Hsu, Luke Hollingsworth, Jin Hwa Lee, Jiajun Wu, Chelsea Finn, Peter E. Latham, Timothy Edward John Behrens, James C. R. Whittington
Published: 24 Jun 2024, Last Modified: 24 Jun 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Does Editing Provide Evidence for Localization?
Zihao Wang, Victor Veitch
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
How Truncating Weights Improves Reasoning in Language Models
Lei Chen, Joan Bruna, Alberto Bietti
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Mechanistic Interpretability of Binary and Ternary Transformer Networks
Jason Li
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Learning Syntax Without Planting Trees: Understanding When and Why Transformers Generalize Hierarchically
Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, Yulia Tsvetkov
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
Controlling Large Language Model Agents with Entropic Activation Steering
Nate Rahn, Pierluca D'Oro, Marc G Bellemare
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
How Do Transformers "Do" Physics? Investigating the Simple Harmonic Oscillator
Subhash Kantamneni, Ziming Liu, Max Tegmark
Published: 24 Jun 2024, Last Modified: 24 Jun 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Crafting Large Language Models for Enhanced Interpretability
Chung-En Sun, Tuomas Oikarinen, Tsui-Wei Weng
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning
Dan Braun, Jordan Taylor, Nicholas Goldowsky-Dill, Lee Sharkey
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
Challenges in Mechanistically Interpreting Model Representations
Satvik Golechha, James Dao
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Comgra: A Tool for Analyzing and Debugging Neural Networks
Florian Dietz, Sophie Fellenz, Dietrich Klakow, Marius Kloft
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone