Abstract: This paper introduces the sectional and conditional functional dependency (SCFD) and its applications. The SCFD is an important extension of the conditional functional dependency (CFD) and the functional dependency (FD). SCFDs describe the relationship between parts of an attribute and other attributes, and they can be used as rules during data cleaning. Two algorithms, DBCFD and DKMP, are designed for SCFD discovery. DBCFD finds general SCFDs using the attributes that appear in CFDs, while DKMP finds SCFDs for the remaining attributes outside CFDs. Combining DBCFD and DKMP ensures the completeness of the discovered SCFDs. We also provide an SQL technique for cleaning data based on SCFDs. In the experiments, we evaluate the effectiveness and efficiency of SCFDs using a dataset generated by TPC-H, and the experimental results illustrate the effect of our algorithms on two kinds of real datasets.
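To make the idea of a dependency on part of an attribute concrete, the following minimal sketch checks one hypothetical SCFD-style rule with SQL: the first three characters of a phone number determine the city. The table `customer`, its columns `phone` and `city`, the prefix length, and the function name `find_scfd_violations` are all assumptions chosen for illustration. This is not the DBCFD or DKMP algorithm, nor the SQL cleaning technique of the paper; it only shows how a violation of such a rule could be detected.

```python
import sqlite3


def find_scfd_violations(rows, prefix_len=3):
    """Return (prefix, n_cities) for phone prefixes that map to more than one city.

    Hypothetical rule: the first `prefix_len` characters of `phone`
    (a section of the attribute) determine `city`.
    """
    conn = sqlite3.connect(":memory:")
    # Illustrative schema; table and column names are assumptions.
    conn.execute("CREATE TABLE customer (phone TEXT, city TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    query = """
        SELECT substr(phone, 1, ?) AS prefix,
               COUNT(DISTINCT city) AS n_cities
        FROM customer
        GROUP BY prefix
        HAVING COUNT(DISTINCT city) > 1
        ORDER BY prefix
    """
    result = conn.execute(query, (prefix_len,)).fetchall()
    conn.close()
    return result


# Made-up sample data: the prefix "021" is associated with two cities.
sample_rows = [
    ("010-5551", "Beijing"),
    ("010-5552", "Beijing"),
    ("021-7771", "Shanghai"),
    ("021-7772", "Beijing"),
]
violations = find_scfd_violations(sample_rows)
```

Each returned prefix marks a group of tuples that disagree on the dependent attribute, which is where a cleaning step would need to repair a value.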