Abstract: Highlights
• Propose a new neural network architecture for efficient multi-modal AI computing.
• Design two lightweight modules that lower execution latency while maintaining accuracy.
• Evaluate on real-world multi-modal tasks, reducing execution latency by up to 44.5%.
• Apply parallel computing to multi-modal models and compare the performance.