Exploring LLMs' Ability to Spontaneously and Conditionally Modify Moral Expressions through Text Manipulation

Candida Maria Greco, Lucio La Cava, Lorenzo Zangari, Andrea Tagarelli

Published: 2025, Last Modified: 31 Jul 2025 · ACL (1) 2025 · CC BY-SA 4.0

Abstract: Morality serves as the foundation of societal structure, guiding legal systems, shaping cultural values, and influencing individual self-perception. With the rise and pervasiveness of generative AI tools, particularly Large Language Models (LLMs), concerns arise regarding how these tools capture and potentially alter moral dimensions through machine-generated text manipulation. Based on Moral Foundations Theory, our work investigates this topic by analyzing the behavior of 12 LLMs, chosen among the most widely used open and uncensored (i.e., "abliterated") models, and by leveraging human-annotated datasets used in moral-related analysis. Results show varying levels of alteration of moral expressions depending on the type of text modification task and the moral-related conditioning prompt.

External IDs: dblp:conf/acl/GrecoCZT25