Construction of Countably Infinite Programs That Evade Malware/Non-Malware Classification for Any Given Formal System

Vasiliki Liagkou, Panagiotis E. Nastou, Paul G. Spirakis, Yannis C. Stamatiou

Published: 2025, Last Modified: 09 May 2025. Cryptogr. 2025. License: CC BY-SA 4.0

Abstract: The formal study of computer malware was initiated in the mid-1980s by the seminal work of Fred Cohen, who applied elements of Computation Theory to investigate the theoretical limits of detecting viruses within the Turing Machine formal model of computation. Cohen gave a simple but realistic formal definition of the characteristic actions of a computer virus as a Turing Machine that replicates itself, and proved that detecting this behaviour is, in general, an undecidable problem. In this paper, we complement Cohen's approach by providing a simple generalization of his definition of a computer virus that models any type of malware behaviour, and we show that the malware/non-malware classification problem is, again, undecidable. Most importantly, going beyond Cohen's work, we provide a generic theoretical framework for studying anti-malware applications and for identifying, at an early stage before their deployment, several of their inherent vulnerabilities, which may lead to the construction of zero-day exploits and malware strains with stealth properties. To this end, we show that for any given formal system, which can be seen as a formal model of an anti-malware application, there are infinitely many effectively constructible programs for which the formal system can produce no proof that they are malware and no proof that they are non-malware. Moreover, infinitely many of these programs are, indeed, malware programs that evade the detection powers of the given formal system.
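The undecidability claim rests on a diagonal-style argument, which the following minimal Python sketch illustrates. It is not the construction used in the paper: the names `make_contrary` and `detector_is_wrong_on` are hypothetical, "malicious behaviour" is reduced to returning a label, and the candidate detectors are assumed to be total functions that inspect a program without running it. Under those assumptions, any detector is wrong on the program built against it.

```python
from typing import Callable

# Assumption: a "program" is a zero-argument callable whose return value
# stands in for its observable behaviour ("malicious" or "benign").
Program = Callable[[], str]
# Assumption: a "detector" is a total function returning True for malware.
Detector = Callable[[Program], bool]


def make_contrary(detector: Detector) -> Program:
    """Build a program that does the opposite of what the detector predicts."""

    def contrary() -> str:
        # Ask the detector about this very program, then contradict it.
        if detector(contrary):
            return "benign"
        return "malicious"

    return contrary


def detector_is_wrong_on(detector: Detector, program: Program) -> bool:
    """Return True when the detector's verdict disagrees with the behaviour."""
    verdict = detector(program)
    actually_malicious = program() == "malicious"
    return verdict != actually_malicious


# Two hypothetical detectors, chosen only for illustration.
def always_flag(program: Program) -> bool:
    return True


def never_flag(program: Program) -> bool:
    return False


if __name__ == "__main__":
    for detector in (always_flag, never_flag):
        program = make_contrary(detector)
        print(detector.__name__, detector_is_wrong_on(detector, program))
```

A detector that tried to decide by executing `contrary` would recurse without terminating, which is the informal counterpart of the detector failing to be total. The paper's stronger result concerns provability within a formal system rather than a single decision procedure, so this sketch covers only the undecidability intuition.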