BioCLIP: A Vision Foundation Model for the Tree of Life

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: computer vision, evolutionary biology, biology, clip, domain-specific pretraining
TL;DR: We gather a 10M image dataset and pre-train a CLIP model for use in evolutionary biology tasks.
Abstract: Images of the natural world, collected by a variety of cameras from drones to individual phones, are increasingly abundant sources of biological information. There is an explosion of computational methods and tools, particularly in computer vision, for extracting biologically relevant information from images for science and conservation. Yet most of these are currently bespoke approaches designed for a specific task, and they are not easily adaptable or extendable to new questions, contexts, and datasets. We develop the first large-scale multimodal model, BioCLIP, as a foundation for general organismal biology questions on images. We leverage the unique properties of biology as the application domain for computer vision, namely the abundance and variety of images of plants, animals, and fungi, together with the availability of rich structured biological knowledge. We curate and release TreeOfLife-10M (the largest and most diverse available dataset of biology images), train BioCLIP, rigorously benchmark our approach on diverse fine-grained biology classification tasks, and find that BioCLIP consistently and substantially outperforms existing baselines (by 17% to 20% absolute). Intrinsic evaluation further reveals that BioCLIP has learned a hierarchical representation conforming to the tree of life, shedding light on its strong generalizability.
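The zero-shot classification setup underlying CLIP-style models like BioCLIP can be sketched as follows: an image embedding is compared against text embeddings of candidate labels, and the most similar label wins. This is a minimal illustration only; the random vectors stand in for real encoder outputs, and the taxonomic label strings are hypothetical stand-ins for whatever prompt format BioCLIP actually uses.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def zero_shot_classify(image_emb, text_embs, labels, temperature=0.07):
    """Return (best_label, softmax probabilities) for one image embedding.

    CLIP-style inference: cosine similarity between the image embedding and
    each candidate label's text embedding, scaled by a temperature and
    softmax-normalized into a distribution over labels.
    """
    image_emb = l2_normalize(image_emb)
    text_embs = l2_normalize(text_embs)
    logits = text_embs @ image_emb / temperature   # one similarity per label
    probs = np.exp(logits - logits.max())          # numerically stable softmax
    probs /= probs.sum()
    return labels[int(np.argmax(probs))], probs

# Toy demo with random stand-in embeddings (NOT real BioCLIP encoders).
rng = np.random.default_rng(0)
labels = [  # hypothetical taxonomy-flavored prompts
    "Animalia Chordata Aves Passeriformes",
    "Plantae Tracheophyta Magnoliopsida Fagales",
    "Fungi Basidiomycota Agaricomycetes Agaricales",
]
text_embs = rng.normal(size=(3, 512))
# Fake an image whose embedding lies close to the first label's text embedding.
image_emb = text_embs[0] + 0.1 * rng.normal(size=512)
best, probs = zero_shot_classify(image_emb, text_embs, labels)
```

Because the image embedding was constructed near the first label's text embedding, the classifier recovers that label; with real encoders the same mechanism is what lets BioCLIP classify taxa it was never explicitly fine-tuned on.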
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1969