open-vocabulary
2020 – Present
vision and language
2020 – Present
weakly-supervised object detection
2020 – Present
image-language modeling
2020 – Present
image-text representation
2020 – Present
vision language generation
2020 – Present
visual question answering
2020 – Present
benchmarking vision-language models
2020 – Present
video object segmentation
2017 – 2019