Keywords: Social computing, Social network texts, Diachronic word embedding, Semantic change
Verify Author List: I have double-checked the author list and understand that additions and removals will not be allowed after the submission deadline.
Abstract: Humans use words to convey abstract concepts. How lexical semantics evolves matters not only for Natural Language Processing applications but also for social computing research. However, diachronic word representations remain scarce because they are computationally expensive to build; in particular, large-scale, long-span diachronic word embeddings for social network texts are missing. We introduce RedditEM, a collection of diachronic word representations derived from English Reddit comment texts, with one word embedding per month from January 2010 to December 2021. To assess the diachronic semantic shifts of words, we use cosine distance and compare the embeddings' neighborhoods. Our experiments show that RedditEM is useful for detecting changes in word meaning within social networks and for supporting social computing research. Researchers interested in accessing this resource are welcome to contact us.
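The two measures named in the abstract, cosine distance and neighborhood comparison, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Jaccard overlap used to compare neighborhoods, and the toy two-dimensional vectors are all assumptions made for illustration, and the sketch further assumes that the two monthly embeddings live in a shared (aligned) vector space so that vectors from different months are directly comparable.

```python
import numpy as np


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))


def nearest_neighbors(emb, word, k=2):
    """Return the k words closest to `word` (by cosine distance) in one embedding."""
    target = emb[word]
    scored = [
        (cosine_distance(target, vec), other)
        for other, vec in emb.items()
        if other != word
    ]
    scored.sort()
    return [other for _, other in scored[:k]]


def neighborhood_overlap(emb_a, emb_b, word, k=2):
    """Jaccard overlap of a word's k nearest neighbors in two embeddings.

    The Jaccard measure is an assumption; the abstract only says that
    neighborhoods are compared.
    """
    a = set(nearest_neighbors(emb_a, word, k))
    b = set(nearest_neighbors(emb_b, word, k))
    return len(a & b) / len(a | b)


# Toy vectors, invented for illustration only (not RedditEM data).
month_a = {
    "zoom": [1.0, 0.0],
    "camera": [0.9, 0.1],
    "lens": [0.8, 0.2],
    "meeting": [0.0, 1.0],
    "call": [0.1, 0.9],
}
# Same vocabulary, but "zoom" has moved toward "meeting" and "call".
month_b = dict(month_a, zoom=[0.0, 1.0])

shift = cosine_distance(month_a["zoom"], month_b["zoom"])
overlap = neighborhood_overlap(month_a, month_b, "zoom", k=2)
print(f"cosine distance: {shift:.2f}, neighborhood overlap: {overlap:.2f}")
```

A large cosine distance together with a low neighborhood overlap would suggest that a word's usage changed between the two months. With independently trained monthly embeddings, the direct cosine comparison only makes sense after the spaces have been aligned, whereas the neighborhood comparison does not need alignment.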
A Signed Permission To Publish Form In Pdf: pdf
Primary Area: Applications (bioinformatics, biomedical informatics, climate science, collaborative filtering, computer vision, healthcare, human activity recognition, information retrieval, natural language processing, social networks, etc.)
Paper Checklist Guidelines: I certify that all co-authors of this work have read and commit to adhering to the guidelines in Call for Papers.
Student Author: Yes
Submission Number: 285