Modality-Aware Dual-Stream Biometric Authentication Using ConvNeXt-Tiny and TinyViT With Attention-Based Feature Fusion
Abstract: Biometric authentication is increasingly used as a secure and convenient alternative to traditional identity verification. Yet, systems that rely on a single trait such as face or fingerprint often fail in practical settings due to noise, spoofing, and variability across users and sensors. Existing multimodal approaches address some of these issues but typically assume the input modality in advance, depend on pretrained models, or rely on simple fusion strategies that do not adapt to data quality. In this work, we introduce a modality-aware dual-stream authentication framework that combines a lightweight CNN modality classifier with dedicated feature extractors, ConvNeXt-Tiny for face recognition and TinyViT for fingerprint recognition. The framework adaptively integrates the two modalities through an attention-based fusion layer, allowing the system to emphasize the more reliable input under varying conditions. Unlike most prior studies, all components are trained from scratch, ensuring robustness without relying on transfer learning. Experiments on the LFW face dataset and PolyU HRF fingerprint dataset show strong results, with accuracies of 98.94% for face recognition, 98.15% for fingerprint recognition, and 99.45% for multimodal verification. The attention-based fusion outperforms score-level and concatenation methods in both accuracy and equal error rate. These findings demonstrate that lightweight, modality-aware fusion can deliver secure and flexible biometric authentication suitable for deployment on mobile, edge, and high-security platforms.
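The attention-based fusion described above can be illustrated with a minimal sketch. This is not the authors' implementation; the scoring head (`W`, `v`) and dimensions are hypothetical assumptions. The idea is that each modality embedding receives a learned score, a softmax over the two scores yields per-modality weights, and the fused vector is the weighted sum, so the more reliable input is emphasized.

```python
import numpy as np

def attention_fusion(face_emb, fp_emb, W, v):
    """Fuse two modality embeddings with attention weights.

    face_emb, fp_emb: (d,) feature vectors from the two streams.
    W: (d, h) and v: (h,) are hypothetical learned parameters of a
    small scoring head; they stand in for whatever the actual model
    learns during training.
    """
    feats = np.stack([face_emb, fp_emb])   # (2, d): one row per modality
    scores = np.tanh(feats @ W) @ v        # (2,): one scalar score per modality
    exp = np.exp(scores - scores.max())    # numerically stable softmax
    alpha = exp / exp.sum()                # attention weights, sum to 1
    fused = alpha @ feats                  # (d,): weighted combination
    return fused, alpha
```

Under noisy conditions, a trained scoring head would be expected to assign a larger weight `alpha[i]` to the cleaner modality, which is the adaptivity the abstract attributes to the fusion layer.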
DOI: 10.1109/access.2026.3656107