Enhancing Text Authenticity: A Novel Hybrid Approach for AI-Generated Text Detection

Published: 02 Jun 2024, Last Modified: 12 Sept 2024 | OpenReview Archive Direct Upload | Everyone | CC BY-ND 4.0
Abstract: The rapid advancement of Large Language Models (LLMs) has ushered in an era in which AI-generated text is increasingly indistinguishable from human-written content. Detecting AI-generated text has become imperative to combat misinformation, ensure content authenticity, and guard against malicious uses of AI. We introduce a hybrid method that combines conventional TF-IDF features with machine learning models: Bayesian classifiers, classifiers trained with Stochastic Gradient Descent (SGD), Categorical Gradient Boosting (CatBoost), and 12 instances of DeBERTa-v3-large. The method addresses the difficulty of identifying AI-produced text by pairing the strengths of conventional feature extraction with those of recent deep learning models. Through extensive experiments on a comprehensive dataset, we demonstrate that the proposed method accurately distinguishes human-written from AI-generated text and outperforms existing methods. This research advances AI-generated text detection techniques and lays a foundation for robust solutions to the challenges posed by AI-generated content.
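To make the hybrid idea concrete, the following is a minimal, standard-library-only sketch of one branch of such a pipeline: TF-IDF features feeding a multinomial naive Bayes classifier, whose probability is then blended with a score from a second model. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify tokenization, smoothing, or how model outputs are combined, so the whitespace tokenizer, the `alpha` smoothing value, the weighted average in `blend`, the blend weights, and the placeholder `p_transformer` score (standing in for a DeBERTa-v3-large prediction) are all hypothetical choices.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens; the paper's tokenizer is not stated.
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(text, idf):
    """L2-normalized TF-IDF vector as a dict; terms unseen in training are dropped."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


class TfidfNaiveBayes:
    """Multinomial naive Bayes over TF-IDF weights (label 0 = human, 1 = AI)."""

    def fit(self, docs, labels, alpha=1.0):
        # Both labels must appear at least once in `labels` (a list of 0/1).
        self.idf = fit_idf(docs)
        vocab = list(self.idf)
        totals = {0: Counter(), 1: Counter()}
        for doc, y in zip(docs, labels):
            totals[y].update(tfidf(doc, self.idf))  # accumulate per-class weights
        self.log_prior = {y: math.log(labels.count(y) / len(labels)) for y in (0, 1)}
        self.log_lik = {}
        for y in (0, 1):
            denom = sum(totals[y].values()) + alpha * len(vocab)
            self.log_lik[y] = {
                t: math.log((totals[y][t] + alpha) / denom) for t in vocab
            }
        return self

    def predict_proba(self, text):
        """Return the estimated probability that `text` is AI-generated."""
        vec = tfidf(text, self.idf)
        score = {
            y: self.log_prior[y] + sum(w * self.log_lik[y][t] for t, w in vec.items())
            for y in (0, 1)
        }
        top = max(score.values())  # shift log scores for numerical stability
        exp = {y: math.exp(s - top) for y, s in score.items()}
        return exp[1] / (exp[0] + exp[1])


def blend(probs, weights):
    """Weighted average of per-model probabilities (a hypothetical combiner)."""
    return sum(p * w for p, w in zip(probs, weights)) / sum(weights)


# Toy data, invented purely for illustration.
train_docs = [
    "i reckon the bus was late again today",
    "honestly my cat knocked the mug over",
    "in conclusion it is important to note that",
    "it is important to note that overall",
]
labels = [0, 0, 1, 1]

nb = TfidfNaiveBayes().fit(train_docs, labels)
p_nb = nb.predict_proba("it is important to note that")
p_transformer = 0.9  # placeholder for a fine-tuned transformer's score
p_final = blend([p_nb, p_transformer], [0.3, 0.7])
```

In a fuller system the SGD-trained and CatBoost classifiers would contribute additional probabilities to the same `blend` call, and the weights would be tuned on held-out data rather than fixed by hand.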