This is supplemental material to reproduce the empirical results in
the paper "Improving Decision Trees through the Lens of Parameterized
Local Search" by Juha Harviainen, Frank Sommer, and Manuel Sorge, to
appear in NeurIPS 2025.

This archive should contain the following:

- data/ - CSV files of benchmark data taken from the Penn Machine
  Learning Benchmarks, modified as described in the paper.
- results/weka-trees - precomputed pruned and unpruned trees.
- results/consolidated_results.csv - precomputed experimental results
  containing, for each pruned and unpruned tree and each budget tuple,
  the minimum number of errors achievable with exactly the budgeted
  number of local-search or pruning operations.
- Various Python, Bash, and fish shell scripts in the root folder and
  subfolders, chief among them:
  - scriptToComputeWekaTrees.sh
  - run-local-search.fish
  - evaluate-data.fish
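
Before running any of the scripts, it can help to take a quick look at
results/consolidated_results.csv. The column names of that file are not
documented in this README, so the following minimal sketch assumes
nothing about them and only reports whatever header the file has,
together with the number of data rows. The function name summarize_csv
is hypothetical and not part of the supplied scripts.

```python
import csv


def summarize_csv(path):
    """Return (header, number_of_data_rows) for the CSV file at `path`."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        # The first line is treated as the header; an empty file gives [].
        header = next(reader, [])
        # Every remaining line is counted as one data row.
        row_count = sum(1 for _ in reader)
    return header, row_count
```

Called from the root folder as
summarize_csv("results/consolidated_results.csv"), this returns the
header as a list of strings and the row count as an integer.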

To reproduce the results from the precomputed data, run
./evaluate-data.fish (this requires the fish shell to be installed).
In particular, this produces the tables used in the paper:
- results/prune_unpruned_error_rates_table.tex
- results/prunable_datasets.tex
- results/tree_analysis_results_table_compact.tex
- results/dataset_statistics_table.tex
- results/error_reduction_table.tex
and the plots in plots/. Producing the plots requires type1ec.sty and
xelatex (on Ubuntu, provided by the packages cm-super and
texlive-xetex). Running evaluate-data.fish also runs
eval-consolidate-data.py, which throws an error if
run-local-search.fish has not been run beforehand. This error can be
ignored.
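
As a convenience, the executables named above can be checked for before
starting. The following is a minimal sketch, not part of the supplied
scripts: the function name missing_tools is hypothetical, and the check
only covers programs found on PATH (fish and xelatex). It does not
verify that type1ec.sty is installed.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # fish and xelatex are the executables named in this README;
    # type1ec.sty is a LaTeX style file and is not checked here.
    missing = missing_tools(["fish", "xelatex"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("fish and xelatex were found on PATH.")
```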

To recompute the unpruned and pruned trees, run
./scriptToComputeWekaTrees.sh. This will recompute all trees in
results/weka-trees. It requires a current installation of Python
(tested with 3.10.12) and Java (tested with OpenJDK 17.0.12).

To recompute the experimental results in
results/consolidated_results.csv, run ./run-local-search.fish. This
requires a current installation of Python (tested with 3.10.12), with
pandas and numpy. Note that calling evaluate-data.fish afterwards will
overwrite the precomputed experimental results contained in the zip
under results/consolidated_results.csv.
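
Since the precomputed file is overwritten, it may be worth keeping a
copy before recomputing. The following is a minimal sketch under that
assumption. The function name backup_results and the ".precomputed"
suffix are hypothetical choices, not part of the supplied scripts.

```python
import shutil
from pathlib import Path


def backup_results(path="results/consolidated_results.csv",
                   suffix=".precomputed"):
    """Copy the results file next to itself.

    Returns the path of the copy, or None if the source file is absent.
    """
    source = Path(path)
    if not source.is_file():
        return None
    # For example results/consolidated_results.csv.precomputed
    target = source.with_name(source.name + suffix)
    shutil.copy2(source, target)  # copies contents and file metadata
    return target
```

Calling backup_results() from the root folder before running
./run-local-search.fish leaves the shipped results available under the
suffixed name.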

