Keywords: Approximate functional dependence; Bayesian network; Probabilistic graphical models
Abstract: Approximate functional dependencies (AFDs) describe relationships between attributes that hold on a given dataset except for a few violations. Most existing methods for AFD discovery struggle to balance efficiency and accuracy because of the massive search space and the tolerance of violations. To address these issues, we propose an efficient method for discovering AFDs guided by probabilistic semantics, based on a Bayesian network (BN). First, we learn a BN structure and conduct conditional independence tests on the learned structure, rather than on the entire search space, to obtain candidate AFDs. Second, we reduce the search space and prune the structure by exploiting the probabilistic semantics of the BN as a graphical model. Building on this, we provide a branch-and-bound algorithm to discover the AFDs with the highest smoothed mutual information scores. Experimental results show that the proposed method is more effective and efficient than the compared methods. Our code is available at [https://github.com/DKE-Code/BNAFD](https://github.com/DKE-Code/BNAFD).
Supplementary Material: zip
Latex Source Code: zip
Code Link: https://github.com/DKE-Code/BNAFD
Signed PMLR Licence Agreement: pdf
Readers: auai.org/UAI/2025/Conference, auai.org/UAI/2025/Conference/Area_Chairs, auai.org/UAI/2025/Conference/Reviewers, auai.org/UAI/2025/Conference/Submission180/Authors, auai.org/UAI/2025/Conference/Submission180/Reproducibility_Reviewers
Submission Number: 180