# WBA-Evaluator

Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes, making fair assessment of classifiers a challenging task. Balanced Accuracy is a popular metric used to evaluate a classifier's prediction performance under such scenarios. However, this metric falls short when classes vary in importance, especially when class importance is skewed differently from class cardinality distributions. In this paper, we propose a simple and general-purpose evaluation framework for imbalanced data classification that is sensitive to arbitrary skews in class cardinalities and importances. Experiments with several state-of-the-art classifiers tested on real-world datasets and benchmarks from two different domains show that our new framework is more effective than Balanced Accuracy, not only in evaluating and ranking model predictions, but also in training the models themselves.

# Running
Run WBA-Evaluator as follows:
```
python wba-evaluator.py <real_data_file / class_dist_file> <predicted_data_file / misclassify_file> <mode> {-u} <user_weights> {-v} {-c}
```
<br/>

real_data_file / class_dist_file : File with the real data labels, or the class distribution file for the dataset (if the -c option is used) <br/>
predicted_data_file / misclassify_file : File with the predicted data labels, or the file of misclassifications made by the model (if the -c option is used) <br/>
mode : Weight calculation method for WBA. One of: 0 for balanced accuracy (BA, even weights for all classes), 1 for user-defined weights, 2 for rarity weights, 3 for a composite of rarity and user-defined weights, or 4 for composite weights (any n criteria). Default = 1. <br/>
-u : Supply user-defined weights for the classes. Used with user_weights : file with the weights of the classes. <br/>
-v : Produce verbose output <br/>
-c : Use this option when computing from a class distribution file and a model misclassification file instead of label files <br/>
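
As a rough illustration of how modes 0 and 1 differ, the sketch below computes a weighted average of per-class recalls. With even weights this reduces to standard Balanced Accuracy (the mean of per-class recalls). The weighted form shown here, with weights normalized to sum to 1, is an assumption made for illustration, as are the function names and toy labels; none of it is taken from wba-evaluator.py.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that appears in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / total[c] for c in total}


def weighted_balanced_accuracy(y_true, y_pred, weights=None):
    """Weighted average of per-class recalls (hypothetical helper).

    weights=None uses even weights, i.e. plain Balanced Accuracy.
    Assumption: weights are normalized so that they sum to 1.
    """
    recall = per_class_recall(y_true, y_pred)
    if weights is None:
        weights = {c: 1.0 for c in recall}
    norm = sum(weights[c] for c in recall)
    return sum(weights[c] * recall[c] for c in recall) / norm


# Toy labels: class "a" is the majority, class "b" the minority.
y_true = ["a", "a", "a", "a", "b", "b"]
y_pred = ["a", "a", "a", "b", "b", "a"]

ba = weighted_balanced_accuracy(y_true, y_pred)                         # even weights
wba = weighted_balanced_accuracy(y_true, y_pred, {"a": 1.0, "b": 3.0})  # user weights
print(ba, wba)  # 0.625 0.5625
```

Here the recalls are 0.75 for "a" and 0.5 for "b", so weighting "b" three times as heavily lowers the score from 0.625 to 0.5625.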

To use composite mode with any n kinds of weights (i.e., mode 4), pass multiple filenames containing the class weights.

See the examples/ folder and use the scripts/ subfolder to run the example datasets.
