Encoding Ontologies with Holographic Reduced Representations for Transformers

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Transformers, Ontology, Holographic Reduced Representations, Deep Learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Holographic Reduced Representations can be used to encode medical ontologies into transformer models, yielding better representations of medical concepts and improving pretraining and fine-tuning performance.
Abstract: The ability to encode meaningful structure into deep learning models opens up the potential for incorporating prior knowledge, particularly in fields where domain-specific information is of great importance. However, transformer models trained on NLP tasks with medical data often use randomly initialized embeddings that are then adjusted based on training data. For terms appearing infrequently in the dataset, there is less opportunity to improve these representations and learn semantic similarity with other concepts. Medical ontologies already represent many biomedical concepts and define a relationship structure between them, making ontologies a valuable source of domain-specific information. One of the ongoing challenges of deep learning is finding methods to incorporate this domain knowledge into models. Holographic Reduced Representations (HRRs) are capable of encoding ontological structure by composing atomic vectors to create structured higher-level concept vectors. Deep learning models can further process these structured vectors without needing to learn the ontology from training data. We developed an embedding layer that generates concept vectors for clinical diagnostic codes by applying HRR operations that compose atomic vectors based on the SNOMED CT ontology. This approach still allows learning to update the atomic vectors while maintaining structure in the concept vectors. We trained a Bidirectional Encoder Representations from Transformers (BERT) model to process sequences of clinical diagnostic codes, using the resulting HRR concept vectors as the embedding matrix for the model. The model was first pre-trained on a masked language modeling (MLM) task before being fine-tuned for mortality and disease prediction tasks. The HRR-based approach improved performance on both the pre-training and fine-tuning tasks compared to standard transformer embeddings. This is the first time HRRs have been used to produce structured embeddings for transformer models. We find that this approach maintains semantic similarity between medically related concept vectors and allows better representations to be learned for rare codes, since rare codes are composed of elements shared with more common codes.
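To make the composition idea concrete, below is a minimal sketch of the core HRR operations the abstract describes: atomic vectors are bound with circular convolution and superposed by addition to form a structured concept vector that could serve as a row of the embedding matrix. The role/filler names (`finding_site`, `morphology`, etc.) are illustrative placeholders, not the paper's actual SNOMED CT attributes or implementation.

```python
import numpy as np

def hrr_bind(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """HRR binding via circular convolution, computed in the Fourier domain."""
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

def random_atomic(dim: int, rng: np.random.Generator) -> np.ndarray:
    """Atomic vector with i.i.d. N(0, 1/dim) entries, the standard HRR initialization."""
    return rng.normal(0.0, 1.0 / np.sqrt(dim), dim)

# Hypothetical example: a diagnostic-code concept composed from ontology attributes.
dim = 256
rng = np.random.default_rng(0)
roles   = {name: random_atomic(dim, rng) for name in ("finding_site", "morphology")}
fillers = {name: random_atomic(dim, rng) for name in ("lung", "inflammation")}

# Concept vector = superposition of role-filler bindings; in the paper's setting,
# such structured vectors replace randomly initialized embedding rows, and the
# underlying atomic vectors remain trainable.
concept = (hrr_bind(roles["finding_site"], fillers["lung"])
           + hrr_bind(roles["morphology"], fillers["inflammation"]))
print(concept.shape)  # (256,)
```

Because codes that share ontology attributes reuse the same atomic vectors, gradient updates from common codes also refine the representations of rare codes that share those attributes, which is the mechanism the abstract credits for improved rare-code representations.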
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6263