NeurIPS 2024 Workshop ATTRIB Submissions
Investigating Language Model Dynamics using Meta-Tokens
NeurIPS 2024 Workshop ATTRIB Submission99 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Weak-to-Strong In-Context Optimization of Language Model Reasoning
NeurIPS 2024 Workshop ATTRIB Submission98 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Between the Bars: Gradient-based Jailbreaks are Bugs that induce Features
NeurIPS 2024 Workshop ATTRIB Submission97 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models
NeurIPS 2024 Workshop ATTRIB Submission96 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Attributing Statistics to Synthesis Quality in Correlation-Based Texture Models
NeurIPS 2024 Workshop ATTRIB Submission94 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
NeurIPS 2024 Workshop ATTRIB Submission93 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Understanding Compute-Parameter Trade-offs in Sparse Mixture-of-Expert Language Models
NeurIPS 2024 Workshop ATTRIB Submission91 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods
NeurIPS 2024 Workshop ATTRIB Submission90 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Inductive Linguistic Reasoning with Large Language Models
NeurIPS 2024 Workshop ATTRIB Submission89 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Bias Analysis for Unconditional Image Generative Models
NeurIPS 2024 Workshop ATTRIB Submission88 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Generalized Group Data Attribution
NeurIPS 2024 Workshop ATTRIB Submission87 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Better Counterfactual Model Reasoning with Submodular Quadratic Component Models
NeurIPS 2024 Workshop ATTRIB Submission86 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
How much can we forget about Data Contamination?
NeurIPS 2024 Workshop ATTRIB Submission83 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
When Attention Sink Emerges in Language Models: An Empirical View
NeurIPS 2024 Workshop ATTRIB Submission82 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
$\Delta$-Influence: Unlearning Poisons via Influence Functions
NeurIPS 2024 Workshop ATTRIB Submission80 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Data Attribution for Multitask Learning
NeurIPS 2024 Workshop ATTRIB Submission79 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond
NeurIPS 2024 Workshop ATTRIB Submission78 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
What do Learning Dynamics Reveal about Generalization in LLM Reasoning?
NeurIPS 2024 Workshop ATTRIB Submission77 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
A Versatile Influence Function for Data Attribution with Non-Decomposable Loss
NeurIPS 2024 Workshop ATTRIB Submission76 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
NeurIPS 2024 Workshop ATTRIB Submission73 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
NeurIPS 2024 Workshop ATTRIB Submission72 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Understanding the Sources of Performance in Deep Drug Response Models
NeurIPS 2024 Workshop ATTRIB Submission71 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
In Search of Forgotten Domain Generalization
NeurIPS 2024 Workshop ATTRIB Submission69 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Peter Parker or Spiderman? Disambiguating Multiple Class Labels
NeurIPS 2024 Workshop ATTRIB Submission68 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone
Just Select Twice: Leveraging Low Quality Data to Improve Data Selection
NeurIPS 2024 Workshop ATTRIB Submission67 Authors
Published: 30 Oct 2024, Last Modified: 14 Jan 2025
ATTRIB 2024
Readers: Everyone