Abstract: Highlights
• We combine commonsense knowledge with channel and region features.
• We develop a channel attention mechanism that ensures pure semantic features.
• We verify that the soft router mechanism is an effective fusion method for TCCTN.
• We achieve good results on MS-COCO and verify the contextual knowledge of TCCTN.