Abstract: Highlights
• A new dataset for text manipulation detection with diverse handcrafted manipulations
• An asymmetric dual-stream baseline framework that exploits different transformed domains
• An aggregation hub and a fusion module for efficiently aggregating and fusing multi-modal information
• A contrastive learning module that makes feature representations more distinct
• Comprehensive experiments and analysis on the dataset, where we achieve state-of-the-art results