Verifiable Boosted Tree Ensembles

Published: 01 Jan 2025 · Last Modified: 12 Aug 2025 · S&P 2025 · CC BY-SA 4.0
Abstract: Verifiable learning advocates training machine learning models that are amenable to efficient security verification. Prior research demonstrated that a specific class of decision tree ensembles, called large-spread ensembles, allows for robustness verification in polynomial time against any norm-based attacker. This study extends prior work on verifiable learning from basic ensemble methods based on hard majority voting to state-of-the-art boosted tree ensembles, such as those trained with XGBoost or LightGBM. Our formal results show that robustness verification is achievable in polynomial time for large-spread boosted ensembles against attackers based on the $L_\infty$-norm, but remains NP-hard for other norm-based attackers. Nevertheless, we present a pseudo-polynomial time algorithm to verify robustness against attackers based on the $L_p$-norm for any $p \in \mathbb{N} \cup \{0\}$, which performs well in practice and yields verification methods that outperform the state of the art in terms of analysis time. Our experimental evaluation on public datasets shows that large-spread boosted ensembles are accurate enough for practical adoption, while remaining amenable to efficient security verification. Moreover, our techniques scale to challenging security datasets and the associated security properties proposed in prior work.
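To make the polynomial-time $L_\infty$ result concrete, here is a minimal illustrative sketch, not the paper's actual algorithm or API: it assumes the large-spread property lets each tree be attacked independently, so the worst-case raw score of an additive (boosted) ensemble is the sum of per-tree worst cases. Tree layout, function names, and the split convention (`x[f] <= threshold` goes left) are all hypothetical; the check shown is for instances with a positive raw score, with a symmetric check for negative ones.

```python
# Hypothetical sketch of L_inf robustness verification for a
# large-spread boosted ensemble (trees encoded as plain dicts).

def worst_case_leaf(node, x, eps):
    """Smallest leaf value reachable from x under an L_inf perturbation
    of radius eps (the worst case for a positive raw score)."""
    if "leaf" in node:
        return node["leaf"]
    f, t = node["feature"], node["threshold"]
    reachable = []
    if x[f] - eps <= t:   # some perturbed point can go left
        reachable.append(worst_case_leaf(node["left"], x, eps))
    if x[f] + eps > t:    # some perturbed point can go right
        reachable.append(worst_case_leaf(node["right"], x, eps))
    return min(reachable)

def verify_linf(trees, x, eps, base_score=0.0):
    """Assuming large spread, trees can be analyzed independently, so the
    ensemble's worst-case raw score is the sum of per-tree worst cases,
    computable in polynomial time. Returns True iff the positive
    prediction on x is robust within the L_inf ball of radius eps."""
    score = base_score + sum(worst_case_leaf(t, x, eps) for t in trees)
    return score > 0

# Toy usage: two decision stumps over a single feature.
stump = lambda t, lo, hi: {"feature": 0, "threshold": t,
                           "left": {"leaf": lo}, "right": {"leaf": hi}}
trees = [stump(0.5, -1.0, 1.0), stump(5.0, -0.5, 0.5)]
print(verify_linf(trees, x=[2.0], eps=0.1))  # True: worst case is 1.0 - 0.5 > 0
```

Without the large-spread assumption this per-tree decomposition would only give a sound over-approximation, since a single feature perturbation could interact with thresholds in several trees at once; the large-spread condition is what makes the decomposition exact.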