ICLR 2025 Workshop BuildingTrust Submissions
HLogformer: A Hierarchical Transformer for Representing Log Data
ICLR 2025 Workshop BuildingTrust Submission16 Authors
04 Feb 2025 (modified: 06 Mar 2025)
Submitted to BuildingTrust
Readers: Everyone
Learning Automata from Demonstrations, Examples, and Natural Language
Marcell Vazquez-Chanlatte, Karim Elmaaroufi, Stefan Witwicki, Matei Zaharia, Sanjit A. Seshia
Published: 05 Mar 2025, Last Modified: 10 Apr 2025
BuildingTrust
Readers: Everyone
Black-Box Adversarial Attacks on LLM-Based Code Completion
Slobodan Jenko, Niels Mündler, Jingxuan He, Mark Vero, Martin Vechev
Published: 05 Mar 2025, Last Modified: 15 Apr 2025
BuildingTrust
Readers: Everyone
Privacy Auditing for Large Language Models with Natural Identifiers
ICLR 2025 Workshop BuildingTrust Submission13 Authors
03 Feb 2025 (modified: 06 Mar 2025)
Submitted to BuildingTrust
Readers: Everyone
Mind the Gap: A Practical Attack on GGUF Quantization
Kazuki Egashira, Robin Staab, Mark Vero, Jingxuan He, Martin Vechev
Published: 05 Mar 2025, Last Modified: 25 Apr 2025
BuildingTrust
Readers: Everyone
Interpretable Steering of Large Language Models with Feature Guided Activation Additions
Samuel Soo, Wesley Teng, Chandrasekaran Balaganesh, Tan Guoxian, Ming YAN
Published: 05 Mar 2025, Last Modified: 02 Apr 2025
BuildingTrust
Readers: Everyone
Toward Trustworthy Neural Program Synthesis
ICLR 2025 Workshop BuildingTrust Submission10 Authors
02 Feb 2025 (modified: 06 Mar 2025)
Submitted to BuildingTrust
Readers: Everyone
AdvBDGen: A Robust Framework for Generating Adaptive and Stealthy Backdoors in LLM Alignment Attacks
Pankayaraj Pathmanathan, Udari Madhushani Sehwag, Michael-Andrei Panaitescu-Liess, Furong Huang
Published: 05 Mar 2025, Last Modified: 23 Mar 2025
BuildingTrust
Readers: Everyone
Language Models Use Trigonometry to Do Addition
Subhash Kantamneni, Max Tegmark
Published: 05 Mar 2025, Last Modified: 15 Apr 2025
BuildingTrust
Readers: Everyone
Toward Trustworthy Difficulty Assessments: Large Language Models as Judges in Programming and Synthetic Tasks
ICLR 2025 Workshop BuildingTrust Submission7 Authors
31 Jan 2025 (modified: 06 Mar 2025)
Submitted to BuildingTrust
Readers: Everyone
How Does Entropy Influence Modern Text-to-SQL Systems?
Varun Kausika, chris lazar, Satya Saurabh Mishra, Saurabh Jha, Priyanka Pathak
Published: 05 Mar 2025, Last Modified: 11 Apr 2025
BuildingTrust
Readers: Everyone
UniGuard: Towards Universal Safety Guardrails for Jailbreak Attacks on Multimodal Large Language Models
ICLR 2025 Workshop BuildingTrust Submission5 Authors
28 Jan 2025 (modified: 06 Mar 2025)
Submitted to BuildingTrust
Readers: Everyone
Dynaseal: A Backend-Controlled LLM API Key Distribution Scheme with Constrained Invocation Parameters
Jiahao Zhao, Fan Wu, 南佳怡, 魏来, Yang YiChen
Published: 05 Mar 2025, Last Modified: 06 Apr 2025
BuildingTrust
Readers: Everyone
AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors
You-Ming Chang, Chen Yeh, Wei-Chen Chiu, Ning Yu
Published: 05 Mar 2025, Last Modified: 04 Apr 2025
BuildingTrust
Readers: Everyone