Our preprocessing pipeline is designed to be as general as possible and
allows for custom implementations, defined as subclasses of the
`Preprocessor` class and passed as a command-line argument. For our
tasks, we have defined a default preprocessing pipeline for both
classification and regression tasks. The snippet below shows the class structure of the
default classification preprocessor. The private methods of this
class apply the feature generation steps. The abstract
`Preprocessor` class has two methods that need to be implemented:
`__init__()` (which initializes the preprocessor and configures the
settings) and `apply(data, vars)` (which returns the preprocessed data:
a dictionary of features and labels for each of the train, validation, and
test splits).

``` {#code:preprocessing-definition .python frame="single" style="pycharm" language="Python" caption="\\textit{Example preprocessing pipeline structure.} See \\url{https://github.com/anonymized-user/YAIB/blob/development/icu_benchmarks/data/preprocessor.py} for the full file." label="code:preprocessing-definition" columns="fullflexible" basicstyle="\\ttfamily\\tiny"}
@gin.configurable("base_classification_preprocessor")
class DefaultClassificationPreprocessor(Preprocessor):
    def __init__(self, generate_features: bool = True, scaling: bool = True, use_static_features: bool = True):
        """
        Args:
            generate_features: Generate features for dynamic data.
            scaling: Scaling of dynamic and static data.
            use_static_features: Use static features.
        """
        ...

    def apply(self, data, vars):
        """
        Args:
            data: Train, validation and test data dictionary. Further divided in static, dynamic, and outcome.
            vars: Variables for static, dynamic, outcome.
        Returns:
            Preprocessed data.
        """
        ...
        return data

    def _process_static(self, data, vars):
        ...
        return data

    def _process_dynamic(self, data, vars):
        ...
        return data

    def _dynamic_feature_generation(self, data, dynamic_vars):
        ...
        return data
```
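
To illustrate how a custom implementation could plug into this interface, the following is a minimal sketch rather than code from the benchmark itself. It assumes a stand-in abstract `Preprocessor` with the two methods described above; the class `MinMaxDynamicPreprocessor`, the split keys (`train`, `val`, `test`), the group keys (`static`, `dynamic`, `outcome`), and the toy list-of-dicts data layout are all hypothetical and chosen only to keep the example self-contained.

```python
from abc import ABC, abstractmethod


class Preprocessor(ABC):
    """Stand-in for the abstract base class (assumption: the real class
    lives in the benchmark package and may differ in detail)."""

    @abstractmethod
    def __init__(self):
        ...

    @abstractmethod
    def apply(self, data, vars):
        ...


class MinMaxDynamicPreprocessor(Preprocessor):
    """Hypothetical custom preprocessor: min-max scales dynamic features."""

    def __init__(self, scaling: bool = True):
        self.scaling = scaling

    def apply(self, data, vars):
        if not self.scaling:
            return data
        for name in vars["dynamic"]:
            # Fit the value range on the training split only, so that no
            # information from validation or test data leaks into scaling.
            train_values = [row[name] for row in data["train"]["dynamic"]]
            low, high = min(train_values), max(train_values)
            span = (high - low) or 1.0  # guard against constant features
            # Apply the same transformation to every split.
            for split in data.values():
                for row in split["dynamic"]:
                    row[name] = (row[name] - low) / span
        return data


# Toy data: one dynamic variable ("hr") per split; key names are assumptions.
toy_data = {
    "train": {"dynamic": [{"hr": 60.0}, {"hr": 100.0}], "static": [], "outcome": [0, 1]},
    "val": {"dynamic": [{"hr": 80.0}], "static": [], "outcome": [1]},
    "test": {"dynamic": [{"hr": 120.0}], "static": [], "outcome": [0]},
}
toy_vars = {"dynamic": ["hr"], "static": [], "outcome": ["label"]}

processed = MinMaxDynamicPreprocessor().apply(toy_data, toy_vars)
print(processed["val"]["dynamic"])  # [{'hr': 0.5}]
```

Fitting the scaling range on the training split and reusing it for the validation and test splits mirrors the usual practice of deriving preprocessing statistics from training data only; test values may therefore fall outside the unit interval.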