SCANET: Improving multimodal representation and fusion with sparse- and cross-attention for multimodal sentiment analysis

Abstract: We propose a sparse- and cross-attention framework for multimodal sentiment analysis. First, we use sparse attention to improve the efficiency of representation learning. Then, we design an asymmetri...