Multimodal summarization with modality features alignment and features filtering
Binghao Tang, Boda Lin, Zheng Chang, Si Li
Published: 01 Jan 2024, Last Modified: 31 Jul 2025
Neurocomputing 2024
CC BY-SA 4.0
Abstract:
Highlights
• Maximum Mean Discrepancy to align the textual and visual modalities.
• Using CLIP to extract visual features and a filter to enhance utilization.
• Feasibility of Large Language Model for data preprocessing.
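
The first highlight names Maximum Mean Discrepancy (MMD) as the alignment criterion. The listing gives no details of how it is applied, so the following is only a generic sketch of a biased squared-MMD estimate with a Gaussian (RBF) kernel, not the authors' implementation. The function names, the kernel choice, the bandwidth `sigma`, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian kernel between two equal-length vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    def mean_kernel(a, b):
        # Average kernel value over all pairs drawn from a and b.
        return sum(rbf_kernel(u, v, sigma) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical toy features standing in for text and image embeddings.
text_feats = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
image_feats_near = [[0.1, 0.0], [0.0, 0.2], [0.1, 0.2]]
image_feats_far = [[3.0, 3.1], [3.2, 2.9], [3.1, 3.0]]

mmd_near = mmd_squared(text_feats, image_feats_near)
mmd_far = mmd_squared(text_feats, image_feats_far)
```

In this sketch a smaller value means the two feature distributions are closer, so `mmd_near` comes out smaller than `mmd_far`. Used as a training loss, minimizing such a term would pull the two modalities' feature distributions together.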