ESOL (Delaney) is a standard regression dataset containing structures and water solubility data for 1128 compounds. It is widely used to validate machine learning models that estimate solubility directly from molecular structures (encoded as SMILES strings).

The data file contains a CSV table, in which the following columns are used:
     "Compound ID" - Name of the compound
     "smiles" - SMILES representation of the molecular structure
     "measured log solubility in mols per litre" - Log-scale water solubility of the compound, used as label
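
As an illustration of how the three columns above might be read, here is a minimal sketch using only Python's standard csv module. The function name read_esol and the constant names are hypothetical and not part of the dataset. The sketch assumes the CSV has a header row with exactly the column names listed above, and it ignores any other columns.

```python
import csv

# Column names as listed in the description above.
ID_COL = "Compound ID"
SMILES_COL = "smiles"
LABEL_COL = "measured log solubility in mols per litre"


def read_esol(handle):
    """Read the three used columns from an open CSV file handle.

    Hypothetical helper: returns (ids, smiles, labels), where labels are
    the log-scale solubility values converted to float.
    """
    ids, smiles, labels = [], [], []
    # DictReader uses the header row as keys; unused columns are skipped.
    for row in csv.DictReader(handle):
        ids.append(row[ID_COL])
        smiles.append(row[SMILES_COL])
        labels.append(float(row[LABEL_COL]))
    return ids, smiles, labels
```

A typical call would open the data file with open(path, newline="") and pass the handle to read_esol, where path is wherever the CSV is stored locally. The returned smiles list can then be featurized and the labels list used as the regression target.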

Reference:
Delaney, John S. "ESOL: Estimating Aqueous Solubility Directly from Molecular Structure." Journal of Chemical Information and Computer Sciences 44.3 (2004): 1000-1005.
