Abstract: Modeling of Physical Unclonable Functions (PUFs) through Machine Learning (ML) algorithms has been widely used to break their security. Currently, many different algorithms are capable of modeling a wide range of delay-based PUFs, preventing these primitives from being effectively applied to security applications such as authentication. To counter this, other studies have proposed PUF designs that resist ML modeling attacks. However, these studies typically focus on defending against well-known Supervised Learning techniques, such as Logistic Regression or Multi-Layer Perceptrons. At the same time, since ML is rapidly evolving, new techniques could potentially be applied to model even the most recently proposed PUFs. For this reason, in this paper we study the applicability of a subfield of ML, namely Deep Reinforcement Learning (DRL), to PUF modeling and compare it to the state of the art. We find that DRL, specifically the Deep Q-Network (DQN) algorithm, can be as effective as state-of-the-art modeling attacks on the XOR Arbiter PUF, making it a new threat to delay-based PUFs. Additionally, when considering PUFs with challenge obfuscation, such as the Interpose PUF, DQN outperforms the state of the art, raising concerns about the long-term security of obfuscation techniques.
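To make the setting concrete, the sketch below illustrates one possible way of framing PUF modeling as a one-step reinforcement learning task and training a small DQN-style network on it. This is not the paper's implementation: the simulated XOR Arbiter PUF (additive delay model with parity features), the network architecture, the reward definition, and all hyperparameters are assumptions chosen purely for demonstration.

```python
# Illustrative sketch (assumed setup, not the paper's method): predicting
# XOR Arbiter PUF responses with a DQN-style network, where the challenge is
# the state, the action is the predicted response bit, and the reward is +1
# for a correct prediction (a one-step episode, so the TD target is the reward).
import numpy as np
import torch
import torch.nn as nn

n_stages, n_xor, n_crps = 64, 4, 50_000
rng = np.random.default_rng(0)
weights = rng.normal(size=(n_xor, n_stages + 1))  # assumed delay parameters per arbiter chain

def to_features(challenges):
    # Standard parity transform for the additive delay model of an Arbiter PUF.
    phi = np.cumprod(1 - 2 * challenges[:, ::-1], axis=1)[:, ::-1]
    return np.hstack([phi, np.ones((len(challenges), 1))])

challenges = rng.integers(0, 2, size=(n_crps, n_stages))
features = to_features(challenges)
responses = (np.prod(np.sign(features @ weights.T), axis=1) > 0).astype(np.int64)

# Q-network: maps a challenge (state) to Q-values for actions {predict 0, predict 1}.
qnet = nn.Sequential(nn.Linear(n_stages + 1, 128), nn.ReLU(),
                     nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 2))
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)

x = torch.tensor(features, dtype=torch.float32)
y = torch.tensor(responses)
for epoch in range(30):
    for i in range(0, n_crps, 256):
        q = qnet(x[i:i + 256])
        # Reward is 1 for the action matching the true response bit, 0 otherwise.
        reward = (torch.arange(2)[None, :] == y[i:i + 256, None]).float()
        loss = ((q - reward) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

accuracy = (qnet(x).argmax(dim=1) == y).float().mean().item()
print(f"modeling accuracy: {accuracy:.3f}")
```

Because each prediction ends the episode, this degenerates to a contextual-bandit view of the attack; the paper's full DQN setup may differ in how states, actions, and rewards are defined.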