OpenReview.net

ICML 2024 Workshop MI Submissions

Weight-based Decomposition: A Case for Bilinear MLPs
Michael T Pearce, Thomas Dooms, Alice Rigg
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Survival of the Fittest Representation: A Case Study with Modular Addition
Xiaoman Delores Ding, Zifan Carl Guo, Eric J Michaud, Ziming Liu, Max Tegmark
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition
Kenzo Clauw, Daniele Marinazzo, Sebastiano Stramaglia
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent
Karolis Jucys, George Adamopoulos, Mehrab Hamidi, Stephanie Milani, Mohammad Reza Samsami, Artem Zholus, Sonia Joseph, Blake Aaron Richards, Irina Rish, Özgür Şimşek
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Interpretability analysis on a pathology foundation model reveals biologically relevant embeddings across modalities
Nhat Le, Ciyue Shen, Chintan Shah, Blake Martin, Daniel Shenker, Harshith Padigela, Jennifer A. Hipp, Sean Grullon, John Abel, Harsha Vardhan Pokkalla, Dinkar Juyal
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Understanding Inhibition through Maximally Tense Images
Christopher J Hamblin, Srijani Saha, Talia Konkle, George A. Alvarez
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Tackling Polysemanticity with Neuron Embeddings
Alex Foote
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Representing Rule-based Chatbots with Transformers
Dan Friedman, Abhishek Panigrahi, Danqi Chen
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
An Adversarial Example for Direct Logit Attribution: Memory Management in GELU-4L
Jett Janiak, Can Rager, James Dao, Yeu-Tong Lau
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks
Aaron Mueller
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Oral
- Readers: Everyone
Using Degeneracy in the Loss Landscape for Mechanistic Interpretability
Lucius Bushnaq, Jake Mendel, Stefan Heimersheim, Dan Braun, Nicholas Goldowsky-Dill, Kaarel Hänni, Cindy Wu, Marius Hobbhahn
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Spotlight
- Readers: Everyone
Manipulating Feature Visualizations with Gradient Slingshots
Dilyara Bareeva, Marina MC Höhne, Alexander Warnecke, Lukas Pirch, Klaus Robert Muller, Konrad Rieck, Kirill Bykov
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Investigating the Interpretability of Biometric Face Templates Using Gated Sparse Autoencoders and Differentiable Image Parametrizations
Peter Rot, Klemen Grm
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Relational Composition in Neural Networks: A Survey and Call to Action
Martin Wattenberg, Fernanda Viégas
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Spotlight
- Readers: Everyone
Confidence Regulation Neurons in Language Models
Alessandro Stolfo, Ben Peng Wu, Wes Gurnee, Yonatan Belinkov, Xingyi Song, Mrinmaya Sachan, Neel Nanda
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Yibo Jiang, Goutham Rajendran, Pradeep Kumar Ravikumar, Bryon Aragam
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Analyzing the Generalization and Reliability of Steering Vectors
Daniel Chee Hian Tan, David Chanin, Aengus Lynch, Adrià Garriga-Alonso, Dimitrios Kanoulas, Brooks Paige, Robert Kirk
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Is Transformer a Stochastic Parrot? A Case Study in Simple Arithmetic Task
WANG PEIXU, Chen Yu, Yu Ming
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
InversionView: A General-Purpose Method for Reading Information from Neural Activations
Xinting Huang, Madhur Panwar, Navin Goyal, Michael Hahn
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Oral
- Readers: Everyone
Understanding Counting in Small Transformers: The Interplay between Attention and Feed-Forward Layers
Freya Behrens, Luca Biggio, Lenka Zdeborova
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot, Ievgen Redko, Anton Mallasto, Charlotte Laclau, Oliver Struckmeier, Karol Arndt, Markus Heinonen, Ville Kyrki, Samuel Kaski
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Transformers on Markov data: Constant depth suffices
Nived Rajaraman, Marco Bondaschi, Ashok Vardhan Makkuva, Kannan Ramchandran, Michael Gastpar
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
The Missing Curve Detectors of InceptionV1: Applying Sparse Autoencoders to InceptionV1 Early Vision
Liv Gorton
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Spotlight
- Readers: Everyone
CoSy: Evaluating Textual Explanations of Neurons
Laura Kopf, Philine Lou Bommer, Anna Hedström, Sebastian Lapuschkin, Marina MC Höhne, Kirill Bykov
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
Why do recurrent neural networks suddenly learn? Bifurcation mechanisms in neuro-inspired short-term memory tasks
Udith Haputhanthri, Liam Storan, Yiqi Jiang, Adam Shai, Hakki Orhun Akengin, Mark Schnitzer, Fatih Dinc, Hidenori Tanaka
- Published: 24 Jun 2024, Last Modified: 31 Jul 2024
- ICML 2024 MI Workshop Poster
- Readers: Everyone
