Abstract: Highlights
• We propose a new model comparison method that avoids a prohibitively expensive large-scale subjective annotation experiment.
• A new similarity function named NGSM is proposed as a semantic distance measure to model the discrepancy between captions. With NGSM, informative images can be selected effectively from an arbitrary large-scale raw image dataset.
• We demonstrate quantitative results on the generalization ability of the competing ICMs and provide a detailed analysis of the key factor in improving the generalization ability of ICMs.