Selecting text classification model through maximizing posterior evidence over informative sub-space
Abstract: Text classification is a pivotal task in natural language understanding, and its performance has seen remarkable advancements with the rise of Pre-trained Language Models (PLMs). The proliferation of PLMs, however, has made it increasingly challenging to choose the most suitable model for a given dataset. Since fine-tuning every candidate model is impractical, Transferability Estimation (TE) has become a promising solution for efficient model selection. Unlike current TE methods, which rely solely on fixed, hard class assignments to evaluate the quality of model-encoded features, our approach also accounts for the inter-sample and inter-model variations captured by soft class assignments. We achieve this by using class embeddings to predict posterior class assignments, with the logarithm of the maximum posterior evidence serving as the transferability score. Moreover, we find that an informative sub-space of the dataset leads to more accurate soft class assignments, and we annotate informative samples efficiently by eliciting the judging ability of a large language model. The resulting posterior evidence over the informative sub-space, LogIPE, captures subtle differences between models and improves the accuracy of model selection, as validated by extensive experiments on a wide range of text classification datasets and candidate PLMs.
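The abstract scores a candidate PLM by the log of the maximum posterior evidence of its encoded features. As a rough illustration of the evidence idea (not the paper's LogIPE method, whose class embeddings, soft assignments, and LLM-selected sub-space are not reproduced here), the sketch below computes the log evidence of a Bayesian linear model fitted to feature/label pairs, with the prior and noise precisions `alpha` and `beta` held fixed for simplicity rather than maximized; all function names are illustrative.

```python
import numpy as np

def log_evidence(F, y, alpha=1.0, beta=1.0):
    """Log marginal likelihood of a Bayesian linear model
    y ~ N(F w, 1/beta), with prior w ~ N(0, (1/alpha) I).
    alpha and beta are fixed here; evidence-maximization methods
    would optimize them per task instead."""
    n, d = F.shape
    A = alpha * np.eye(d) + beta * F.T @ F        # posterior precision
    m = beta * np.linalg.solve(A, F.T @ y)        # posterior mean of w
    resid = y - F @ m
    _, logdet = np.linalg.slogdet(A)
    return (0.5 * (n * np.log(beta) + d * np.log(alpha) - n * np.log(2 * np.pi))
            - 0.5 * beta * resid @ resid
            - 0.5 * alpha * m @ m
            - 0.5 * logdet)

def transferability_score(F, labels):
    """Sum one-vs-rest log evidence over classes: a simple
    evidence-based transferability score for features F."""
    classes = np.unique(labels)
    return sum(log_evidence(F, (labels == c).astype(float)) for c in classes)
```

A model whose features separate the classes well yields a higher evidence score than one whose features are uninformative for the labels, which is the intuition behind ranking candidate models by this quantity.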
External IDs: dblp:journals/fcsc/SunBCLRX25