BELKA: The Big Encoded Library for Chemical Assessment

Published: 14 Aug 2024, Last Modified: 14 Aug 2024
Venue: NeurIPS 2024 Competition Track
License: CC BY-NC-SA 4.0
Keywords: drug discovery, chemistry, biology
TL;DR: We made 3.6B physical measurements of small molecules binding to target proteins in order to accelerate prediction in drug discovery.
Abstract: Small molecule drugs are often discovered using a brute-force physical search, wherein scientists test for interactions between candidate drugs and their protein targets in a laboratory setting. As druglike chemical space is large (on the order of 10^60 molecules), more efficient methods to search through this space are desirable. To enable the discovery and application of such methods, we generated the Big Encoded Library for Chemical Assessment (BELKA), a dataset of roughly 3.6B physical binding measurements between 133M small molecules and 3 protein targets, collected using DNA-encoded chemical library technology. We hope this dataset encourages the community to explore methods to represent small molecule chemistry and predict likely binders using chemical and protein target structure.
Submission Number: 12