ICML 2024 Workshop MI Submissions
Weight-based Decomposition: A Case for Bilinear MLPs
Michael T Pearce, Thomas Dooms, Alice Rigg
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Survival of the Fittest Representation: A Case Study with Modular Addition
Xiaoman Delores Ding, Zifan Carl Guo, Eric J Michaud, Ziming Liu, Max Tegmark
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition
Kenzo Clauw, Daniele Marinazzo, Sebastiano Stramaglia
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent
Karolis Jucys, George Adamopoulos, Mehrab Hamidi, Stephanie Milani, Mohammad Reza Samsami, Artem Zholus, Sonia Joseph, Blake Aaron Richards, Irina Rish, Özgür Şimşek
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Interpretability analysis on a pathology foundation model reveals biologically relevant embeddings across modalities
Nhat Le, Ciyue Shen, Chintan Shah, Blake Martin, Daniel Shenker, Harshith Padigela, Jennifer A. Hipp, Sean Grullon, John Abel, Harsha Vardhan pokkalla, Dinkar Juyal
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Understanding Inhibition through Maximally Tense Images
Christopher J Hamblin, Srijani Saha, Talia Konkle, George A. Alvarez
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Tackling Polysemanticity with Neuron Embeddings
Alex Foote
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Representing Rule-based Chatbots with Transformers
Dan Friedman, Abhishek Panigrahi, Danqi Chen
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
An Adversarial Example for Direct Logit Attribution: Memory Management in GELU-4L
Jett Janiak, Can Rager, James Dao, Yeu-Tong Lau
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks
Aaron Mueller
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Oral
Readers: Everyone
Using Degeneracy in the Loss Landscape for Mechanistic Interpretability
Lucius Bushnaq, Jake Mendel, Stefan Heimersheim, Dan Braun, Nicholas Goldowsky-Dill, Kaarel Hänni, Cindy Wu, Marius Hobbhahn
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
Manipulating Feature Visualizations with Gradient Slingshots
Dilyara Bareeva, Marina MC Höhne, Alexander Warnecke, Lukas Pirch, Klaus Robert Muller, Konrad Rieck, Kirill Bykov
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Investigating the Interpretability of Biometric Face Templates Using Gated Sparse Autoencoders and Differentiable Image Parametrizations
Peter Rot, Klemen Grm
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Relational Composition in Neural Networks: A Survey and Call to Action
Martin Wattenberg, Fernanda Viégas
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
Confidence Regulation Neurons in Language Models
Alessandro Stolfo, Ben Peng Wu, Wes Gurnee, Yonatan Belinkov, Xingyi Song, Mrinmaya Sachan, Neel Nanda
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Yibo Jiang, Goutham Rajendran, Pradeep Kumar Ravikumar, Bryon Aragam
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Analyzing the Generalization and Reliability of Steering Vectors
Daniel Chee Hian Tan, David Chanin, Aengus Lynch, Adrià Garriga-Alonso, Dimitrios Kanoulas, Brooks Paige, Robert Kirk
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Is Transformer a Stochastic Parrot? A Case Study in Simple Arithmetic Task
WANG PEIXU, Chen Yu, Yu Ming
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
InversionView: A General-Purpose Method for Reading Information from Neural Activations
Xinting Huang, Madhur Panwar, Navin Goyal, Michael Hahn
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Oral
Readers: Everyone
Understanding Counting in Small Transformers: The Interplay between Attention and Feed-Forward Layers
Freya Behrens, Luca Biggio, Lenka Zdeborova
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot, Ievgen Redko, Anton Mallasto, Charlotte Laclau, Oliver Struckmeier, Karol Arndt, Markus Heinonen, Ville Kyrki, Samuel Kaski
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Transformers on Markov data: Constant depth suffices
Nived Rajaraman, Marco Bondaschi, Ashok Vardhan Makkuva, Kannan Ramchandran, Michael Gastpar
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
The Missing Curve Detectors of InceptionV1: Applying Sparse Autoencoders to InceptionV1 Early Vision
Liv Gorton
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Spotlight
Readers: Everyone
CoSy: Evaluating Textual Explanations of Neurons
Laura Kopf, Philine Lou Bommer, Anna Hedström, Sebastian Lapuschkin, Marina MC Höhne, Kirill Bykov
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone
Why do recurrent neural networks suddenly learn? Bifurcation mechanisms in neuro-inspired short-term memory tasks
Udith Haputhanthri, Liam Storan, Yiqi Jiang, Adam Shai, Hakki Orhun Akengin, Mark Schnitzer, Fatih Dinc, Hidenori Tanaka
Published: 24 Jun 2024, Last Modified: 31 Jul 2024
ICML 2024 MI Workshop Poster
Readers: Everyone