Abstract: Breast cancer has become one of the most common and deadly cancers in the world, and its treatment is a major focus of research. In the search for breast cancer drug candidate compounds, it is important to establish an effective quantitative structure-activity relationship for drug research and development. Neural networks have achieved high accuracy in this field, but they suffer from a large number of parameters, high model complexity, and poor interpretability. Therefore, a Serial Fuzzy System built layer by layer with Subtractive clustering and ANFIS (SFSSA) is proposed as a more interpretable solution. Experiments comparing several models on a bioactivity data set of candidate compounds lead to the following conclusions: 1) SFSSA predicts more accurately than classic linear regression; 2) SFSSA has fewer parameters and rules, and better interpretability and generalization ability, than classic neural network algorithms; 3) SFSSA requires less training time and achieves higher prediction accuracy than MBGD-RDA (Minibatch Gradient Descent with Regularization, DropRule, and AdaBound), an optimized TSK fuzzy system algorithm; 4) the SFSSA subsystem with 15 inputs achieves the best prediction performance. In short, SFSSA provides a new way to apply fuzzy systems to high-dimensional regression problems.