UNMASKING SYNTHETIC LANGUAGE: ADVANCES IN DEEPFAKE TEXT DETECTION AND EVALUATION
DOI: https://doi.org/10.63878/cjssr.v3i3.1322
Abstract
The rapid proliferation of large language models (LLMs) such as ChatGPT has revolutionized text generation, enabling the creation of highly fluent and contextually relevant content that rivals human writing. However, this capability also poses substantial risks, including the spread of disinformation, fake news amplification, social manipulation, and phishing schemes. While detection methods for deepfake images and videos have advanced significantly, identifying synthetic text remains a nascent field, plagued by poor robustness, limited generalization across domains, and vulnerability to adversarial modifications. Even human evaluators often fare little better than chance in distinguishing AI-generated from human-authored text. This study addresses these gaps through a comprehensive benchmarking of four popular transformer-based models (BERT-base-uncased, RoBERTa-base, ALBERT-base-v2, and DistilBERT) on three diverse datasets: TweepFake (short social media posts), TuringBench (a multi-domain, multi-generator benchmark), and the English portion of the Human ChatGPT Comparison Corpus (HC3). By conducting dataset-specific evaluations, we provide insights into how model architecture, size, and design affect detection accuracy, efficiency, and adaptability. Our key contributions include a balanced comparison of lightweight and larger models for deepfake text classification, a cross-dataset analysis that highlights generalization strengths and weaknesses, and practical recommendations for deploying these detectors in real-world scenarios. Results demonstrate that RoBERTa generally outperforms the other models in accuracy, while lighter models such as DistilBERT trade some accuracy for gains in speed and resource use, underscoring the need for hybrid approaches to enhance robustness.
Published
2025-09-30
Issue
Section
Articles
License
Copyright (c) 2025 Contemporary Journal of Social Science Review

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
How to Cite
UNMASKING SYNTHETIC LANGUAGE: ADVANCES IN DEEPFAKE TEXT DETECTION AND EVALUATION. (2025). Contemporary Journal of Social Science Review, 3(3), 2843-2861. https://doi.org/10.63878/cjssr.v3i3.1322