Abstract: Activity characterization, optimization, and generation of small molecules are increasingly active areas of research at the intersection of molecular chemistry and machine learning. Large datasets of small molecules have made it possible to train deep models that can explore the underlying chemical space and generate valid, novel, and unique molecules. While this is a noteworthy achievement, what impedes operationalizing these models in the wet laboratory is the inability to link the chemical and biological space of small molecules. A central challenge is the lack of activity data on these molecules. In this paper we present a computational pipeline that links the chemical and biological space of an important class of small molecules, quaternary ammonium compounds (QACs). Our experimental collaborators have characterized the activity of many QACs against Staphylococcus aureus. We train various generative models and evaluate their ability to generate valid, novel, and unique QACs. We then use classification models trained on the activity data to evaluate the generated QACs. The resulting pipeline identifies valid, novel, unique, membrane-active QACs. This work opens the way to further research on machine learning models capable of jointly sampling the chemical and biological space of small molecules.
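To make the evaluation stage concrete, the following is a minimal sketch of how validity, uniqueness, and novelty might be computed over generated SMILES strings before an activity classifier screens them. It is an illustration under stated assumptions, not the authors' implementation: the function names `generation_metrics`, `screen`, and `toy_canonicalize` are hypothetical, the metric definitions follow the common convention (uniqueness over valid molecules, novelty over unique valid molecules), and the parenthesis-balance check is only a stand-in for a real cheminformatics parser.

```python
from typing import Callable, Dict, Iterable, List, Optional


def toy_canonicalize(smiles: str) -> Optional[str]:
    """Toy stand-in for a real parser (ASSUMPTION, not a chemistry check).

    Returns the stripped string if parentheses are balanced, else None.
    A real pipeline would parse and canonicalize with a cheminformatics
    toolkit instead.
    """
    s = smiles.strip()
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return None
    return s if s and depth == 0 else None


def generation_metrics(
    generated: Iterable[str],
    training: Iterable[str],
    canonicalize: Callable[[str], Optional[str]] = toy_canonicalize,
) -> Dict[str, object]:
    """Compute validity, uniqueness, and novelty of generated molecules."""
    generated = list(generated)
    # Valid: strings the canonicalizer accepts (duplicates kept here).
    valid = [c for c in map(canonicalize, generated) if c is not None]
    # Unique: distinct canonical forms among the valid molecules.
    unique = set(valid)
    # Novel: unique molecules that do not appear in the training set.
    train = {c for c in map(canonicalize, training) if c is not None}
    novel = unique - train
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
        "candidates": sorted(novel),
    }


def screen(
    candidates: Iterable[str],
    predict_active_proba: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep candidates whose predicted activity probability meets the threshold.

    `predict_active_proba` is a hypothetical wrapper around a trained
    activity classifier; the 0.5 default is an arbitrary illustration.
    """
    return [s for s in candidates if predict_active_proba(s) >= threshold]


if __name__ == "__main__":
    generated = ["C[N+](C)(C)C", "C[N+](C)(C)C", "CC[N+](C)(C)CC", "C[N+](C)(C", "CCO"]
    training = ["CCO"]
    metrics = generation_metrics(generated, training)
    print(metrics)
    # Hypothetical scorer standing in for a trained classifier.
    print(screen(metrics["candidates"], lambda s: 0.9 if "[N+]" in s else 0.1))
```

In this sketch the classifier is applied only to molecules that are already valid, unique, and novel, which mirrors the order of stages described above; swapping in a real canonicalizer and a trained model would not change the structure.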