Abstract: Many effective techniques for time series forecasting (TSF) have been proposed in recent years. These techniques depend on a sufficient set of training samples, which is the key to training a good predictor. An active learning (AL) framework based on support vector regression (SVR) is therefore designed for TSF, with the goal of selecting the most valuable samples and reducing the complexity of the training set. To evaluate the quality of samples comprehensively, multiple essential criteria, such as informativeness, representativeness, and diversity, are considered in a procedure of two consecutive clustering-based stages. In addition, time series data are often imbalanced: a range of values may be severely under-represented yet extremely important to the user, so it is unreasonable to assign the same prediction cost to every sample. To address this imbalance problem, a multiple-criteria cost-sensitive active learning algorithm built on a weighted SVR architecture, abbreviated as MAW-SVR and designed specifically for imbalanced TSF, is proposed. Under the cost-sensitive scheme, each sample is assigned a penalty weight that is dynamically updated during the AL procedure. Experimental comparisons between MAW-SVR and six other AL algorithms on a total of thirty time series datasets verify the effectiveness of the proposed algorithm.
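The cost-sensitive idea summarized in the abstract, a per-sample penalty weight applied to the SVR loss, can be illustrated with a minimal sketch. The abstract does not specify how MAW-SVR computes or updates its weights, so the inverse-frequency rule below, the function names, the bin count, and the toy data are all assumptions chosen for illustration only, not the authors' method.

```python
from collections import Counter


def epsilon_insensitive(residual, eps):
    """Standard SVR loss: zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(residual) - eps)


def inverse_frequency_weights(targets, n_bins=4):
    """Hypothetical weighting rule (an assumption, not the paper's scheme):
    samples whose target falls in a sparsely populated value range
    receive a larger penalty weight."""
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all targets identical: everything lands in bin 0
    bins = [min(int((y - lo) / width), n_bins - 1) for y in targets]
    counts = Counter(bins)
    raw = [1.0 / counts[b] for b in bins]
    scale = len(targets) / sum(raw)  # normalise so the mean weight is 1
    return [w * scale for w in raw]


def weighted_cost(targets, predictions, weights, eps=0.1):
    """Sum of per-sample epsilon-insensitive losses scaled by penalty weights."""
    return sum(
        w * epsilon_insensitive(y - p, eps)
        for y, p, w in zip(targets, predictions, weights)
    )


# Toy series: five common values near 1.0 and one rare spike at 5.0.
targets = [1.0, 1.1, 0.9, 1.0, 1.2, 5.0]
predictions = [1.0] * len(targets)  # a predictor that ignores the spike

weights = inverse_frequency_weights(targets)
uniform = [1.0] * len(targets)

cost_weighted = weighted_cost(targets, predictions, weights)
cost_uniform = weighted_cost(targets, predictions, uniform)
```

With these toy values the rare spike receives a larger weight than the common samples, so a predictor that misses it is penalized more heavily under the weighted cost than under the uniform one. In an AL loop, the weights would be recomputed each time newly selected samples are added to the training set.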