Abstract: Highlights
• We reformulate RGB-Event pattern recognition as a vision-language fusion task for optimal modality connection.
• We propose SAFE, a Semantic-Aware Frame-Event fusion framework for pattern recognition using a large pre-trained model.
• SAFE is shown to be effective on the PokerEvent and HARDVS datasets.