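Uma Metodologia para Tratamento do Viés da Maioria em Modelos de Stacking via Identificação de Documentos Difíceis [A Methodology for Addressing Majority Bias in Stacking Models via Identification of Difficult Documents]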

Welton Santos, Washington Cunha, Celso França, Guilherme Fonseca, Sérgio D. Canuto, Leonardo Rocha, Marcos André Gonçalves

Published: 2023, Last Modified: 15 Feb 2024, SBBD 2023, Readers: Everyone

Abstract: Stacking models are effective for automatic document classification because they exploit the complementarity of their base models. Even so, some documents are still misclassified. We call these difficult documents: cases in which most of the learned models predict a class other than the true one, a bias we refer to as majority bias. This work presents a first, two-step proposal for overcoming failures caused by majority bias. First, we train a difficult-document detector. Next, we use the detector to route difficult documents to a meta-classifier specialized in classifying them. Empirically, our approach shows promise in isolating the majority bias.
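
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`is_difficult`, `label_difficult`, `route`), the strict "more than half of the base models are wrong" threshold, and the representation of the detector and meta-classifiers as plain callables are all assumptions made for illustration.

```python
from typing import Any, Callable, Hashable, List, Sequence


def is_difficult(base_preds: Sequence[Hashable], true_label: Hashable) -> bool:
    """Assumed definition: a document is difficult when more than half
    of the base models predict a class other than the true one."""
    wrong = sum(1 for pred in base_preds if pred != true_label)
    return wrong > len(base_preds) / 2


def label_difficult(
    preds_per_doc: Sequence[Sequence[Hashable]],
    true_labels: Sequence[Hashable],
) -> List[int]:
    """Build binary targets (1 = difficult) on which a detector
    could be trained (step 1)."""
    return [int(is_difficult(p, y)) for p, y in zip(preds_per_doc, true_labels)]


def route(
    doc: Any,
    detector: Callable[[Any], bool],
    general_meta: Callable[[Any], Hashable],
    specialist_meta: Callable[[Any], Hashable],
) -> Hashable:
    """Step 2: send documents flagged by the detector to the specialized
    meta-classifier, and all others to the general one."""
    chosen = specialist_meta if detector(doc) else general_meta
    return chosen(doc)


# Toy usage with three hypothetical base models and true class "a".
targets = label_difficult(
    [["a", "a", "b"], ["b", "b", "a"], ["a", "b", "c"]],
    ["a", "a", "a"],
)
print(targets)
```

In this sketch the difficulty labels need the true class, so they can only be computed on labeled training data; at prediction time the trained detector has to stand in for them, which is why `route` takes the detector as an argument.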
