COLM 2025 Workshop SoLaR Submissions
A Generative Approach to LLM Harmfulness Mitigation with Red Flag Tokens
Sophie Xhonneux, David Dobre, Mehrnaz Mofakhami, Leo Schwinn, Gauthier Gidel
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Multi-Turn Jailbreaks Are Simpler Than They Seem
Xiaoxue Yang, Jaeha Lee, Anna-Katharina Dick, Jasper Timm, Fei Xie, Diogo Cruz
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
LLMs on Trial: Evaluating Judicial Fairness for Large Language Models
Yiran HU, Zongyue Xue, Haitao Li, Siyuan Zheng, Qingjing Chen, Shaochun Wang, Xihan Zhang, Ning Zheng, Yun Liu, Qingyao Ai, Yiqun LIU, Charles L. A. Clarke, Weixing Shen
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Sarc7: Evaluating Sarcasm Detection and Generation with Seven Types and Emotion-Informed Techniques
Lang Xiong, Raina Gao, Alyssa Jeong, Yicheng Fu, Kevin Zhu, Sean O'Brien, Vasu Sharma
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Investigating Model Editing for Unlearning in Large Language Models
Shariqah Hossain, Lalana Kagal
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
TRUTH: Teaching LLMs to Rerank for Truth in Misinformation Detection
Hao Yu, Shenyang Huang, Zachary Yang, Maximilian Puelma Touzel, Kellin Pelrine, Jean-François Godbout, Reihaneh Rabbany
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Red Teaming Vision Language Models Under Change
Rebecca Tsekanovskiy, James Hendler
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Prompt Attacks Reveal Superficial Knowledge Removal in Unlearning Methods
Yeonwoo Jang, Shariqah Hossain, Ashwin Sreevatsa, Diogo Cruz
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Towards Attuned AI: Integrating Care Ethics in Large Language Model Development and Alignment
Rayane El Masri, Aaron J Snoswell
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Practical Evaluation of Machine Learning Efficiency Requires Model Life Cycle Assessment
Jared Fernandez, Clara Na, Yonatan Bisk, Constantine Samaras, Emma Strubell
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
Khaoula Chehbouni, Mohammed Haddou, Jackie CK Cheung, Golnoosh Farnadi
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
When Do Language Models Endorse Limitations on Universal Human Rights Principles?
Keenan Samway, Rada Mihalcea, Zhijing Jin
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
CourtReasoner: Can LLM Agents Reason Like Judges?
Simeng Han, Yoshiki Takashima, Shannon Zejiang Shen, Chen Liu, Yixin Liu, Roque K. Thuo, Sonia Knowlton, Ruzica Piskac, Scott J Shapiro, Arman Cohan
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
CONECUT: Scalable Removal of Preference Redundancy
Purbid bambroo, Daniel S. Brown, Ana Marasovic
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Privacy-Preserving LLM Interaction with Socratic Chain-of-Thought Reasoning and Homomorphically Encrypted Vector Databases
Yubeen Bae, Minchan Kim, Jaejin Lee, Sangbum Kim, Jaehyung Kim, Yejin Choi, Niloofar Mireshghallah
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Poor Alignment and Steerability of Large Language Models: Evidence Using 30,000 College Admissions Essays
Jinsook Lee, AJ Alvero, Thorsten Joachims, Rene F Kizilcec
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
A Study of Large Language Models for Extraction of Themes from Homeless Shelter Case Notes
Madhumitha Selvaraj, Teale Masrani, Yani Ioannou, Geoffrey Messier
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
The Alignment Game: The Inevitable Conflict of Values in Generative Models
Ali Falahati, Mohammad Mohammadi Amiri, Kate Larson, Lukasz Golab
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
LLMs are Vulnerable to Malicious Prompts Disguised as Scientific Language
Yubin Ge, Neeraja Kirtane, Hao Peng, Dilek Hakkani-Tür
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
MedPAIR: Measuring Physicians and AI Relevance Alignment in Medical Question Answering
Yuexing Hao, Kumail Alhamoud, Hyewon Jeong, Haoran Zhang, Isha Puri, Philip Torr, Mike Schaekermann, Ariel Dora Stern, Marzyeh Ghassemi
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Detecting Biased Language in Icelandic: A Named Entity Recognition Approach for Socially Responsible Text Analysis
Steinunn Rut Friðriksdóttir, Hafsteinn Einarsson
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Large Language Models in the Task of Automatic Validation of Text Classifier Predictions
Aleksandr Tsymbalov
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
IMPersona: Evaluating Individual Level LM Impersonation
Quan Shi, Carlos E Jimenez, Stephen Dong, Brian Seo, Caden Yao, Adam Kelch, Karthik R Narasimhan
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
MCP Safety Training: Learning to Refuse Falsely Benign MCP Exploits using Improved Preference Alignment
John Timothy Halloran
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone
Accidental Vulnerability: Factors in Fine-Tuning that Shift Model Safeguards
Punya Syon Pandey, Samuel Simko, Kellin Pelrine, Zhijing Jin
Published: 25 Jul 2025, Last Modified: 12 Oct 2025
COLM 2025 Workshop SoLaR Poster
Readers: Everyone