Abstract: Unsupervised dependency parsing is a fundamental task for understanding the syntactic dependency structure of natural language.
Previous parameter-free methods for probing dependency structure recover a non-trivial number of dependencies by assuming a correlation between syntactic dependencies (word-to-word relations) and bi-lexical dependence scores (metrics measuring one word's influence on another).
However, these studies assume this correlation without verifying that it exists.
Furthermore, they fail to exploit the grammatical constraints that benefit grammar-based unsupervised parsing methods.
In this paper, we investigate the correlation between syntactic dependencies and Conditional Mutual Information (CMI) scores, a bi-lexical statistical dependence metric.
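For reference, CMI is the standard information-theoretic measure of the dependence between two variables given a third; reading X and Y as the two words and Z as their context is our interpretation of the bi-lexical setting, not something stated in the abstract:

I(X; Y \mid Z) = \sum_{z} p(z) \sum_{x, y} p(x, y \mid z) \log \frac{p(x, y \mid z)}{p(x \mid z)\, p(y \mid z)}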
We propose delta-energy, an unbiased estimator of the CMI score, and apply it to unsupervised dependency parsing.
We further augment the parsing model with three grammatical constraints.
We find that the delta-energy score effectively separates syntactic dependencies from non-dependencies.
Our unsupervised parsing model outperforms baseline parameter-free probing models, and excels at recovering semantically related dependencies.
An ablation study shows that the three grammatical constraints contribute to the recovery of semantically related dependencies and of dependencies with strong part-of-speech requirements.
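As a minimal sketch of how a pairwise dependence score matrix is commonly turned into a parse in this line of work: the abstract does not specify the decoder, so the maximum spanning arborescence decoding below is an assumed stand-in, and the random scores are a placeholder for the paper's delta-energy values; the three grammatical constraints are not reproduced here.

import numpy as np
import networkx as nx

def decode_tree(scores: np.ndarray) -> dict[int, int]:
    """Decode a dependency tree from an (n+1) x (n+1) score matrix.

    scores[h, d] is a (hypothetical) bi-lexical dependence score for
    head h -> dependent d; index 0 is an artificial ROOT node.
    Returns a map from each dependent to its head.
    """
    g = nx.DiGraph()
    n = scores.shape[0]
    for h in range(n):
        for d in range(1, n):  # ROOT (index 0) can never be a dependent
            if h != d:
                g.add_edge(h, d, weight=scores[h, d])
    # Edmonds' algorithm finds the highest-weight spanning arborescence;
    # since ROOT has no incoming edges, the tree is rooted at node 0.
    tree = nx.maximum_spanning_arborescence(g)
    return {d: h for h, d in tree.edges}

# Toy usage with random scores standing in for delta-energy values.
rng = np.random.default_rng(0)
print(decode_tree(rng.random((5, 5))))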
Paper Type: long
Research Area: Syntax: Tagging, Chunking and Parsing / ML
Languages Studied: English