Abstract: Prior studies have shown that distinguishing text generated by large language models (LLMs) from human-written text is highly challenging, and often no better than random guessing. To verify whether this finding generalizes across languages and domains, we perform an extensive case study to identify the upper bound of human detection accuracy. Across 16 datasets covering 9 languages and 9 domains, 19 annotators achieved an average detection accuracy of 87.6%, challenging previous conclusions. We find that the major gaps between human and machine text lie in concreteness, cultural nuances, and diversity. Explicitly explaining these distinctions in prompts can partially bridge the gaps in over 50% of cases. However, we also find that humans do not always prefer human-written text, particularly when they cannot clearly identify its source.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: human-oriented evaluation, multilingual MGT analysis, human preferences
Contribution Types: Data resources, Data analysis
Languages Studied: Arabic, Chinese, English, Hindi, Italian, Japanese, Kazakh, Russian, Vietnamese
Previous URL: https://openreview.net/forum?id=60EbrOffPP
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: We revised the paper and changed its track from Human-Centered NLP to Resources and Evaluation
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Ethical Statement section on page 9
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: 2, References
B2 Discuss The License For Artifacts: N/A
B3 Artifact Use Consistent With Intended Use: N/A
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: Yes
B5 Elaboration: 2
B6 Statistics For Data: Yes
B6 Elaboration: 2
C Computational Experiments: No
C1 Model Size And Budget: N/A
C1 Elaboration: This is a human-oriented case study of machine-generated text detection; we only call APIs to generate data and run no computational experiments on GPUs.
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: 2, 5
C3 Descriptive Statistics: Yes
C3 Elaboration: 3, 4, 5
C4 Parameters For Packages: N/A
C4 Elaboration: We did not use such packages for processing.
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: Yes
D1 Elaboration: 2, Appendix B
D2 Recruitment And Payment: N/A
D2 Elaboration: The authors performed all annotations; we did not recruit annotators externally.
D3 Data Consent: Yes
D3 Elaboration: 1
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: Yes
D5 Elaboration: 2
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: Yes
Submission Number: 97