A Derivative-Based Membership Algorithm for Enhanced Regular Expressions

Mengxi Wang, Chunmei Dong, Weihao Su, Chengyao Peng, Haiming Chen

Published: 01 Jan 2024, Last Modified: 13 May 2025. Venue: SETTA 2024. License: CC BY-SA 4.0

Abstract: Enhanced regular expressions (EREs), which extend standard regular expressions with shuffle and counting operators, provide exponentially more succinct descriptions of regular languages. The membership problem, determining whether a given word w belongs to the language generated by an ERE E, is fundamental to numerous applications. However, efficient solutions for the membership problem of unconstrained EREs have remained elusive. This paper introduces a derivative for the counting operator and rigorously proves its correctness. We then leverage this derivative to design a membership algorithm for unconstrained EREs and analyze its time complexity based on a lemma establishing the relationship between the size of the derivative and the expression. We further propose algorithms based on the proposed derivatives to generate positive and negative words of specific lengths for EREs. The performance of the membership algorithm is then evaluated on real-world EREs. Finally, we validate the correctness of two existing inference algorithms that previously lacked formal correctness guarantees due to the absence of practical membership algorithms for unconstrained EREs.
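
To make the idea of derivative-based membership concrete, here is a minimal Python sketch. It is not the paper's algorithm or its derivative definition: it uses textbook Brzozowski-style rules, with one common formulation for counting, d_a(E{m,n}) = d_a(E) · E{max(m-1,0), n-1}, and the usual rule for shuffle, d_a(E & F) = (d_a(E) & F) + (E & d_a(F)). The tuple encoding and all names (`sym`, `cat`, `alt`, `shuffle`, `count`, `deriv`, `matches`) are assumptions made for illustration, and the sketch makes no claim about the complexity bound analyzed in the paper.

```python
# Illustrative sketch only: expressions are nested tuples, and every name
# below is hypothetical rather than taken from the paper.
EMPTY = ("empty",)    # the empty language
EPS = ("eps",)        # the language containing only the empty word
INF = float("inf")    # unbounded upper bound for counting

def sym(c):
    return ("sym", c)

def cat(e, f):
    # Concatenation, simplified so that derivatives stay small.
    if e == EMPTY or f == EMPTY:
        return EMPTY
    if e == EPS:
        return f
    if f == EPS:
        return e
    return ("cat", e, f)

def alt(e, f):
    # Union, simplified.
    if e == EMPTY:
        return f
    if f == EMPTY or e == f:
        return e
    return ("alt", e, f)

def shuffle(e, f):
    # Interleaving of the two languages, simplified.
    if e == EMPTY or f == EMPTY:
        return EMPTY
    if e == EPS:
        return f
    if f == EPS:
        return e
    return ("shuffle", e, f)

def count(e, lo, hi):
    # E{lo,hi}; assumes 0 <= lo <= hi, where hi may be INF.
    return EPS if hi == 0 else ("count", e, lo, hi)

def nullable(e):
    # True iff the empty word belongs to the language of e.
    tag = e[0]
    if tag == "eps":
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "alt":
        return nullable(e[1]) or nullable(e[2])
    if tag in ("cat", "shuffle"):
        return nullable(e[1]) and nullable(e[2])
    return e[2] == 0 or nullable(e[1])  # count

def deriv(e, a):
    # Derivative of e with respect to the symbol a.
    tag = e[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if e[1] == a else EMPTY
    if tag == "alt":
        return alt(deriv(e[1], a), deriv(e[2], a))
    if tag == "cat":
        left = cat(deriv(e[1], a), e[2])
        return alt(left, deriv(e[2], a)) if nullable(e[1]) else left
    if tag == "shuffle":
        return alt(shuffle(deriv(e[1], a), e[2]),
                   shuffle(e[1], deriv(e[2], a)))
    _, body, lo, hi = e  # count: consume one iteration of the body
    return cat(deriv(body, a), count(body, max(lo - 1, 0), hi - 1))

def matches(e, word):
    # w is in L(e) iff the derivative of e by all of w is nullable.
    for a in word:
        e = deriv(e, a)
    return nullable(e)

# (a & b){1,2}: one or two blocks, each being "ab" or "ba".
ere = count(shuffle(sym("a"), sym("b")), 1, 2)
print(matches(ere, "abba"), matches(ere, "aabb"))
```

In this sketch, membership reduces to taking one derivative per input symbol and testing nullability at the end, so the cost is governed by how large the intermediate derivatives become. That is why a bound relating the size of the derivative to the size of the expression matters for the time complexity analysis.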