\section{Conclusions}\label{sec:conclusion}

In this work we proposed methods to improve and evaluate the
adversarial robustness of tree-based ensemble models for
regression tasks. We introduced a novel method for constructing splits
that are robust to adversarial perturbations within the XGBoost
algorithm. The method is based on an analytical solution to the
upper bound of the Taylor approximation of the loss function typically
used within XGBoost, which can be computed in constant time. This
enables us to construct robust splits while maintaining the
computational efficiency of the original algorithm. Our formulation
generalises to any differentiable loss function and can thus be
extended to various use cases, including regression.
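
For reference, the Taylor approximation referred to above is, in the
standard presentation of XGBoost, the second-order expansion of the
objective at boosting iteration $t$. The notation below follows that
standard presentation and is an assumption here; it may differ from the
symbols used in earlier sections:
\[
  \mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[
    \ell\bigl(y_i, \hat{y}_i^{(t-1)}\bigr)
    + g_i f_t(x_i)
    + \frac{1}{2} h_i f_t(x_i)^2
  \right] + \Omega(f_t),
\]
where $g_i = \partial \ell\bigl(y_i, \hat{y}_i^{(t-1)}\bigr) / \partial \hat{y}_i^{(t-1)}$
and $h_i = \partial^2 \ell\bigl(y_i, \hat{y}_i^{(t-1)}\bigr) / \partial \bigl(\hat{y}_i^{(t-1)}\bigr)^2$
are the first and second derivatives of the loss with respect to the
previous prediction, $f_t$ is the tree added at iteration $t$, and
$\Omega$ is the regularisation term. The robust split criterion
summarised in this section upper-bounds an approximation of this form
under input perturbations; the bound itself is not reproduced here.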

Furthermore, we proposed a series of novel metrics to quantify the
robustness of regression models and used them to evaluate the
robustness of the XGBoost algorithm. Our results show that these
models are highly sensitive to adversarial perturbations in the input
space, which leads to significant performance degradation. Extensive
experiments show that our proposed robust XGBoost algorithm yields
models that are more robust to perturbations in the input space. We
additionally observe a trade-off between robustness and predictive
performance in several experiments.
%We hypothesize that this is likely
%due to the lack of hyper-parameter tuning for the robust models.

{\bf{Limitations and Future Work:}} The proposed method has
limitations that suggest promising directions for future work:

\begin{itemize}
    \item The procedure simplifies the robust loss function by
    considering the worst-case loss per split, thereby constructing
    robust trees in isolation. Perturbations in the previous trees
    in the ensemble are not currently taken into account. A natural
    direction for future work is to consider the robust loss over
    the entire ensemble and to build a procedure that certifiably
    minimises it.

    \item The current work primarily focusses on $L_\infty$-norm
    adversarial attacks on numerical features, which limits its
    applicability to the mixed and categorical data that are common in
    tabular datasets. As the proposed linear relaxation approach is
    highly general (it only requires the creation of an ambiguity
    set), it is extensible to other types of data and adversarial
    attacks. A promising direction for future work is a framework
    for creating ambiguity sets for categorical and mixed data types,
    as well as for other types of adversarial attacks.
\end{itemize}

% Several promising directions for 
% future research emerge from this
% work. One key avenue is the development of provably robust training
% methodologies for general loss functions within the XGBoost framework,
% extending existing classification-focused approaches to optimize
% robust loss across the entire ensemble. Additionally, investigating
% the impact of adversarial perturbations on discrete features, during
% both training and evaluation presents another valuable research
% direction.
