Abstract: Unexpected toxicity is a significant impediment to the successful entry of drug candidates into the market. For drug toxicity evaluation, deep learning techniques have proven strongly competitive with costly and ethically challenging animal studies. However, recent models rely primarily on deep neural networks, pay limited attention to species with small sample sizes, and lack interpretability. In response to these concerns, we proposed a novel random forest variant, termed the Multi-Task Iterative Error-Correcting Random Forest (MTIEC-RF), to predict multi-species acute toxicity using both single-view and multi-view approaches. In the single-view setting, MTIEC-RF used a multi-task random forest as its backbone and iteratively generated a set of error-correcting decision trees from challenging samples associated with diverse toxicity endpoints. In the multi-view setting, we integrated reliable multi-view data, a consensus framework, and endpoint representations. The final MTIEC-RF model generalized better on medium- and small-sized toxicity endpoints and significantly outperformed state-of-the-art deep learning techniques, achieving a 14% improvement in average R² across 59 datasets. Additionally, MTIEC-RF identified pivotal descriptors contributing to acute toxicity prediction by quantifying feature importance; the number of hydrogen atoms emerged as a significantly influential factor in the prediction process.
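The iterative error-correcting idea described above can be illustrated with a small toy sketch. It is not the MTIEC-RF implementation: the abstract does not specify how challenging samples are selected or how the correcting trees are combined, so this sketch assumes a boosting-style scheme. A per-endpoint mean stands in for the multi-task random forest backbone, one-split regression stumps stand in for the error-correcting decision trees, and "challenging" samples are assumed to be those with the largest absolute residuals. All function names and parameters (`fit_stump`, `fit_error_correcting`, `hard_fraction`, `rate`) are hypothetical.

```python
from statistics import mean


def fit_stump(X, r):
    """Fit a one-split regression stump to residuals r.

    Returns (feature_index, threshold, left_value, right_value).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [ri for x, ri in zip(X, r) if x[j] <= t]
            right = [ri for x, ri in zip(X, r) if x[j] > t]
            if not left or not right:
                continue  # a split must leave samples on both sides
            lv, rv = mean(left), mean(right)
            sse = sum((ri - lv) ** 2 for ri in left)
            sse += sum((ri - rv) ** 2 for ri in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    if best is None:
        # No valid split exists: fall back to a constant correction.
        return (0, float("inf"), mean(r), 0.0)
    return best[1:]


def stump_predict(stump, x):
    j, t, lv, rv = stump
    return lv if x[j] <= t else rv


def fit_error_correcting(X, y, endpoints, rounds=5, hard_fraction=0.5, rate=0.5):
    """Toy stand-in: backbone prediction plus iterative corrections.

    Assumption: the backbone is a per-endpoint mean, not a random forest.
    """
    base = {
        e: mean(yi for yi, ei in zip(y, endpoints) if ei == e)
        for e in set(endpoints)
    }
    preds = [base[e] for e in endpoints]
    stumps = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, preds)]
        # Assumption: "challenging" samples are those with the largest
        # absolute residuals, pooled across all endpoints.
        order = sorted(range(len(y)), key=lambda i: abs(resid[i]), reverse=True)
        hard = order[: max(2, int(hard_fraction * len(y)))]
        stump = fit_stump([X[i] for i in hard], [resid[i] for i in hard])
        stumps.append(stump)
        # Apply the damped correction to every sample, not only the hard ones.
        preds = [p + rate * stump_predict(stump, x) for p, x in zip(preds, X)]
    return base, stumps, rate


def predict(model, x, endpoint):
    base, stumps, rate = model
    return base[endpoint] + rate * sum(stump_predict(s, x) for s in stumps)
```

In this sketch each round fits a small tree only to the currently worst-predicted samples, so later trees concentrate on what earlier ones got wrong. Fitting the corrections on samples pooled across endpoints is one simple way to let data-poor endpoints borrow structure from data-rich ones, which is the motivation the abstract gives for the multi-task design.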