Combining Conditional Random Fields and first-order logic for modeling hidden content structure in sentiment analysis

Published: 2013, Last Modified: 07 Aug 2024, ICNC 2013, CC BY-SA 4.0
Abstract: This paper develops a connection between first-order logic representations and the content structure model in sentiment analysis applications. We propose a modified semi-supervised approach that learns word-level content structure using well-designed first-order logic features. The word-level content structure is modeled as a Conditional Random Field (CRF) with latent word-level topic nodes. Introducing first-order logic features into the model addresses the long-distance dependency problem. The new approach is applied to two multi-aspect sentiment analysis tasks: multi-aspect sentence labeling and multi-aspect rating prediction. We use data from an Amazon corpus and a movie-review corpus. We compare our method with three other graphical models with hidden nodes: Latent Dirichlet Allocation (LDA), the Hidden-Unit CRF (HUCRF), and Content Structure using CRF (CSCRF, which serves as our sentence-level baseline). Experimental results show that our method outperforms the sentence-level baseline by 2.1% in F1 measure on the multi-aspect sentence labeling task and by 2.1% in accuracy on the rating prediction task. It outperforms the other two methods by up to 16.6% on the multi-aspect sentence labeling task and by up to 10.3% on the rating prediction task. With 3000 unlabeled documents, our method improves the F1 measure on the multi-aspect sentence labeling task by 8.2%; with 400 unlabeled reviews, it improves the accuracy on the rating prediction task by 3.0%.