Optimizing Sentiment Inference with Multi-Expert Models via Real-Time GPU Resource Monitoring

Junqin Huang, Chuishuo Meng, Sen Wang, Feng Liang

Published: 2024 (SmartIoT 2024), Last Modified: 22 Jul 2025, License: CC BY-SA 4.0

Abstract: This research investigates a multi-expert model system for sentiment analysis that combines four distinct models: DistilBERT, DeBERTa, logistic regression, and SVM. A real-time GPU resource monitoring and evaluation system tracks memory usage and computational load on the GPU, which enables dynamic adjustments in model scheduling to optimize GPU resource allocation. We also develop an evaluation framework to assess the effectiveness and efficiency of various scheduling algorithms, with the goal of identifying the optimal model combination based on inference time and GPU computational resources. Experimental results on the Yelp dataset show that the multi-expert system balances sentiment classification accuracy against GPU resource usage. This study offers insights for deploying efficient NLP applications.
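
The abstract does not give the scheduling algorithm, so the following is only a minimal sketch of how GPU-aware expert selection could look, not the authors' method. It assumes a simple rule: try experts in a fixed preference order and pick the first one whose GPU memory need fits the currently free memory, falling back to a CPU-only model when the GPU is busy or unavailable. The names `GpuSnapshot`, `Expert`, `read_gpu_snapshot`, and `pick_expert`, the preference order, the memory figures, and the 90% utilization threshold are all hypothetical placeholders. Live statistics are read through the `pynvml` NVML bindings when they are installed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GpuSnapshot:
    free_mem_mb: float  # free GPU memory in MiB
    util_pct: float     # GPU compute utilization, 0 to 100


@dataclass(frozen=True)
class Expert:
    name: str
    gpu_mem_mb: float   # hypothetical GPU memory need; 0 means CPU-only
    preference: int     # hypothetical ranking; lower is tried first


# Placeholder figures for illustration only, not measurements from the paper.
EXPERTS: List[Expert] = [
    Expert("DeBERTa", 2000.0, 0),
    Expert("DistilBERT", 800.0, 1),
    Expert("SVM", 0.0, 2),
    Expert("logistic regression", 0.0, 3),
]


def read_gpu_snapshot(index: int = 0) -> Optional[GpuSnapshot]:
    """Read live stats via NVML; return None if NVML or a GPU is unavailable."""
    try:
        import pynvml

        pynvml.nvmlInit()
        try:
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)         # bytes
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # percent
            return GpuSnapshot(mem.free / 2**20, float(util.gpu))
        finally:
            pynvml.nvmlShutdown()
    except Exception:
        return None


def pick_expert(
    snapshot: Optional[GpuSnapshot],
    experts: List[Expert] = EXPERTS,
    max_util_pct: float = 90.0,  # assumed threshold for "GPU is busy"
) -> Expert:
    """Return the most preferred expert that can run right now."""
    gpu_ok = snapshot is not None and snapshot.util_pct < max_util_pct
    for expert in sorted(experts, key=lambda e: e.preference):
        if expert.gpu_mem_mb == 0:
            return expert  # CPU-only experts are always runnable
        if gpu_ok and expert.gpu_mem_mb <= snapshot.free_mem_mb:
            return expert
    raise ValueError("no expert can run under the current constraints")
```

A caller would typically run `pick_expert(read_gpu_snapshot())` before each batch, so the choice follows the current GPU state. A real scheduler would likely also weigh measured inference time and accuracy, which this sketch leaves out.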