Masking domain-specific information for cross-domain deception detection

Javier Sánchez-Junquera, Luis Villaseñor Pineda, Manuel Montes-y-Gómez, Paolo Rosso, Efstathios Stamatatos

2020 (modified: 29 Oct 2021), Pattern Recognit. Lett. 2020

Highlights:
• Introduction of a domain adaptation approach for deception detection in texts.
• The use of information from both source and target domains to obtain a suitable text representation.
• An effective masking technique that transforms domain-specific terms to a more abstract form.
• Competitive results in cross-domain deception detection are reported using benchmark datasets.

Abstract: Social media and computer-mediated communication make it easy to disseminate deceptive behavior, which can harm the people and entities it reaches. Deception detection by supervised learning has been widely studied; however, the scenario in which the domain of interest differs from the domain of the labeled data has received little attention. This paper presents, to our knowledge, the first domain adaptation approach for cross-domain deception detection in texts. Our proposal consists of modifying the original texts from the source and target domains so that common content and style information is preserved while domain-specific information is masked. To select the domain-specific terms to be masked, the proposed method uses unlabeled instances from both domains. Our experiments show that the masking technique is effective for detecting deception in cross-domain scenarios, and that performance can be further improved when unlabeled information from the target domain is considered.
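The masking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the selection rule (keep a term only if it occurs in at least `min_docs` unlabeled texts of each domain), the asterisk placeholder, and the names `tokenize`, `shared_vocabulary`, and `mask` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Words or single punctuation marks (an assumed, simplistic tokenizer).
TOKEN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return TOKEN.findall(text.lower())


def document_frequency(texts):
    """Count in how many texts each token appears."""
    counts = Counter()
    for text in texts:
        counts.update(set(tokenize(text)))
    return counts


def shared_vocabulary(source_texts, target_texts, min_docs=2):
    """Terms seen in at least `min_docs` unlabeled texts of BOTH domains.

    Assumption: terms frequent in both domains are treated as
    domain-independent; everything else is treated as domain-specific.
    """
    src = document_frequency(source_texts)
    tgt = document_frequency(target_texts)
    return {w for w in src if src[w] >= min_docs and tgt[w] >= min_docs}


def mask(text, keep):
    """Replace every word outside `keep` with asterisks of equal length."""
    out = []
    for tok in tokenize(text):
        is_word = re.match(r"\w", tok) is not None
        out.append("*" * len(tok) if is_word and tok not in keep else tok)
    return " ".join(out)


source = ["the hotel room was great", "the hotel staff was rude"]
target = ["the doctor was great", "the doctor was late"]
keep = shared_vocabulary(source, target)
masked = mask("The hotel was great!", keep)
```

In this toy example only "the" and "was" occur in two texts of each domain, so every other word is masked while punctuation is left intact. A classifier would then be trained on masked source texts and applied to masked target texts.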
