Abstract: Representation learning is the field of modern machine learning focused on automatically producing useful representations from data. This thesis studies the impact of data volume, annotation perspective, and structural properties on learnable representations for textual emotion detection and structural graph encodings. For emotion detection, we investigate how different text representations and model choices affect performance on two large emotion datasets. We assess the impact of different annotation approaches by studying differences between how writers express and readers perceive emotions. For graph representations, we propose two encodings of ego-network subgraphs and analyze their theoretical properties. Our encodings can serve as input features or be leveraged during learning, boosting the theoretical expressivity of message-passing and subgraph neural network architectures. On several large experimental benchmarks, we find that they also improve the predictive performance and efficiency of popular graph models. Our work deepens the practical understanding of learnable representations in both domains.