Abstract: Highlights
• The first 100k tactile dataset with multi-granular descriptions.
• A well-designed process for building multimodal datasets.
• A method that efficiently captures the links between touch, language, and vision.
• Top performance in tactile-related tasks across various benchmarks and settings.