# Custom Evaluation Function via Benchmark Class

## How to Use

To create a benchmark for a new dataset, follow these steps:

1. Create a new Python file, e.g., `my_dataset_benchmark.py`
2. Import the base class:
   ```python
   from benchmarks.benchmark import BaseBenchmark
   ```
3. Create a new class that inherits from `BaseBenchmark`:
   ```python
   class MyDatasetBenchmark(BaseBenchmark):
       def __init__(self, name: str, file_path: str, log_path: str):
           super().__init__(name, file_path, log_path)
   ```
4. Implement the required abstract methods:
   - `evaluate_problem`: Evaluate a single problem
   - `calculate_score`: Calculate the score for a prediction
   - `get_result_columns`: Define column names for the results CSV file

5. Override other methods as needed, such as `load_data` or `save_results_to_csv`

## Example

Refer to the `DROPBenchmark` class in the `drop.py` file for an example of how to implement a benchmark for a specific dataset. 

By following these guidelines, you can easily create custom benchmark evaluations for new datasets.
