A Hierarchical Entity-Based Approach to Structuralize User Generated Content in Social Media: A Case of Yahoo! Answers
Abstract: Social media like forumsand microblogshave accumulated a huge amount of user generated content (UGC) containing human knowledge. Currently, most of UGC is listed as a whole or in pre-defined categories. This “list-based” approach is simple, but hinders users from browsing and learning knowledge of certain topics effectively. To address this problem,we propose a hierarchical entity-based approach for structuralizing UGC in social media. By usinga large-scaleentityrepository,wedesign a three-step framework to organize UGC in a novel hierarchical structure called “cluster entity tree (CET)”. With Yahoo! Answers as a test case, we conduct experiments and the results show the effectiveness of our framework in constructingCET. We furtherevaluate the performance of CET on UGC organization in both user and system aspects. From a user aspect, our user study demonstrates that, with CET-based structure, users perform significantlybetter in knowledgelearningthan using traditional list-based approach. From a system aspect, CET substantially boosts the performance of two information retrieval models (i.e., vector space model and query likelihood language model).
0 Replies
Loading