NeurIPS 2024 Workshop ATTRIB Submissions
Training on the Test Task Confounds Evaluation and Emergence
NeurIPS 2024 Workshop ATTRIB Submission65 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
GRADE: A Fine-grained Approach to Measure Sample Diversity in Text-to-Image Models
NeurIPS 2024 Workshop ATTRIB Submission62 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Quantifying Positional Biases in Text Embedding Models
NeurIPS 2024 Workshop ATTRIB Submission60 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Feature Responsiveness Scores: Model-Agnostic Explanations for Agency
NeurIPS 2024 Workshop ATTRIB Submission59 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
On Linear Representations and Pretraining Data Frequency in Language Models
NeurIPS 2024 Workshop ATTRIB Submission58 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Pruning-based Data Selection and Network Fusion for Efficient Deep Learning
NeurIPS 2024 Workshop ATTRIB Submission57 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Evaluating Sparse Autoencoders for Controlling Open-Ended Text Generation
NeurIPS 2024 Workshop ATTRIB Submission55 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Loss-to-Loss Prediction: Language model scaling laws across datasets
NeurIPS 2024 Workshop ATTRIB Submission53 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Hessian Sets: Uncovering Feature Interactions in Image Classification
NeurIPS 2024 Workshop ATTRIB Submission52 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Towards a Mechanistic Explanation of Diffusion Model Generalization
NeurIPS 2024 Workshop ATTRIB Submission51 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
GPT-2 Through the Lens of Vector Symbolic Architectures
NeurIPS 2024 Workshop ATTRIB Submission49 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Evaluating Sparse Autoencoders on Targeted Concept Removal Tasks
NeurIPS 2024 Workshop ATTRIB Submission48 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Toward Optimal Search and Retrieval for RAG
NeurIPS 2024 Workshop ATTRIB Submission47 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Secret Seeds in Text-to-Image Diffusion Models
NeurIPS 2024 Workshop ATTRIB Submission45 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Adversarial Attacks on Data Attribution
NeurIPS 2024 Workshop ATTRIB Submission44 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling
NeurIPS 2024 Workshop ATTRIB Submission43 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
$\texttt{dattri}$: A Library for Efficient Data Attribution
NeurIPS 2024 Workshop ATTRIB Submission42 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations
NeurIPS 2024 Workshop ATTRIB Submission41 Authors
Published: 30 Oct 2024, Last Modified: 21 Jan 2025
ATTRIB 2024
Readers:
Everyone
Ablation is Not Enough to Emulate DPO: How Neuron Dynamics Drive Toxicity Reduction
NeurIPS 2024 Workshop ATTRIB Submission40 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Accumulated Local Effects for Link Prediction with Graph Neural Networks
NeurIPS 2024 Workshop ATTRIB Submission39 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Activation Monitoring: Advantages of Using Internal Representations for LLM Oversight
NeurIPS 2024 Workshop ATTRIB Submission36 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in LLMs
NeurIPS 2024 Workshop ATTRIB Submission35 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Efficient Ensembles Improve Training Data Attribution
NeurIPS 2024 Workshop ATTRIB Submission34 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
The Association Between Training Data and Text-to-Image Generation Capabilities
NeurIPS 2024 Workshop ATTRIB Submission33 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone
Evolution of SAE Features Across Layers in LLMs
NeurIPS 2024 Workshop ATTRIB Submission32 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers:
Everyone