Multi-Objective Symbolic Regression for Data-Driven Scoring System Management

Published: 01 Jan 2022, Last Modified: 18 Aug 2024, ICDM 2022, CC BY-SA 4.0
Abstract: Scores are mathematical combinations of elementary indicators (EIs) and are widely used to measure complex phenomena. Once the theoretical framework has been defined, constructing a score requires a method for aggregating the EIs. The aggregation method is usually chosen from known methodologies, and its functional form is fixed through trial and error. Only then are the predictive power of the index, its distribution, and its ability to stratify the population measured. In this paper, we propose a novel data-driven approach that relies on multi-objective symbolic regression to generate analytic aggregation methods. We translate the properties that the index must exhibit into optimization goals, so that optimal index candidates replicate the target variables, achieve data balancing, and stratify the population. We run experiments on real data sets to address three main score management problems: data-driven score simplification, generation, and combination. The results show the effectiveness and robustness of the proposed approach.
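To make the idea of translating index properties into optimization goals concrete, here is a minimal Python sketch. It is not the paper's implementation: the three objective functions (one minus absolute correlation for predictive power, absolute skewness for balance, deviation from equally populated bins for stratification), the synthetic data, and the hand-written candidate formulas are all assumptions chosen for illustration. A real symbolic regression system would evolve the candidate expressions rather than enumerate them; the sketch only shows how candidates could be compared on several goals at once through Pareto dominance.

```python
import numpy as np

def objectives(score, target, n_bins=4):
    """Return three illustrative quantities to minimise for one candidate score."""
    # 1) Predictive power: 1 - |Pearson correlation| with the target variable.
    fit = 1.0 - abs(np.corrcoef(score, target)[0, 1])
    # 2) Balance: absolute skewness of the score distribution.
    z = (score - score.mean()) / score.std()
    skew = abs(np.mean(z ** 3))
    # 3) Stratification: distance of equal-width bin shares from a uniform split.
    counts, _ = np.histogram(score, bins=n_bins)
    strat = np.abs(counts / counts.sum() - 1.0 / n_bins).sum()
    return np.array([fit, skew, strat])

def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_front(points):
    """Indices of the points that no other point dominates."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]

# Synthetic elementary indicators and target (assumed data, for illustration only).
rng = np.random.default_rng(0)
eis = rng.uniform(0.1, 1.0, size=(200, 3))
target = 0.6 * eis[:, 0] + 0.4 * eis[:, 1] + rng.normal(0, 0.05, 200)

# Hand-written candidate aggregations standing in for evolved expressions.
candidates = {
    "mean": lambda x: x.mean(axis=1),
    "geometric mean": lambda x: np.exp(np.log(x).mean(axis=1)),
    "x0 * x1": lambda x: x[:, 0] * x[:, 1],
    "x0 + x1": lambda x: x[:, 0] + x[:, 1],
}
names = list(candidates)
points = [objectives(candidates[n](eis), target) for n in names]
front = pareto_front(points)

for i in front:
    print(names[i], points[i].round(3))
```

The candidates left on the front are the ones an analyst would still have to choose among, since each trades one goal against another.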