Abstract: Highlights
• We pioneer a continuous sign language recognition (CSLR) method based on intra- and inter-video correlation learning.
• IAM directs gloss features to visual features through a cross-attention matrix.
• IEM interactively optimises inter-video features.