Abstract: Features automatically extracted from images constitute a new and rich source of semantic knowledge that can complement information extracted from text. The convergence between vision- and text-based information can be exploited in scenarios where the two modalities must be combined to solve a target task (e.g., generating verbal descriptions of images, or finding the right images to illustrate a story). However, the potential applications of integrated visual features go beyond mixed-media scenarios: because of their complementary nature with respect to language, visual features can provide perceptually grounded semantic information that is exploitable even in purely linguistic domains.