Abstract: Next-Generation Sequencing has revolutionized the study of genetic mutations, enabling large-scale investigations into their roles in disease development. However, extracting meaningful insights from the vast biomedical literature remains a complex challenge that cannot be addressed manually. In this paper, we evaluate pre-trained models (PTMs) for the automatic extraction of relations from biomedical text, specifically targeting the variant-phenotype domain. Our evaluation on the SNPPhenA corpus demonstrates that fine-tuning small BERT-based models, particularly DeBERTa, yields strong performance, approaching the current state of the art (SOTA). Additionally, our results indicate that a carefully fine-tuned Google Gemini Pro 1.0 model outperforms the existing SOTA on both sentence-level tasks (where the model processes only the target sentence) and abstract-level tasks (where the model processes the entire abstract).
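As a minimal sketch of the fine-tuning setup summarized above, the snippet below fine-tunes a small DeBERTa checkpoint for relation classification with Hugging Face Transformers. The checkpoint name, label count, and hyperparameters are illustrative assumptions, not the paper's exact configuration, and the SNPPhenA preprocessing is only indicated in comments.

```python
# Illustrative sketch: fine-tuning a small DeBERTa model for
# sentence-level variant-phenotype relation classification.
# Checkpoint, num_labels, and hyperparameters are assumptions,
# not the configuration reported in the paper.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "microsoft/deberta-v3-small"  # a small DeBERTa variant

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=3,  # e.g. positive / negative / neutral relation (assumed)
)

def tokenize(batch):
    # Truncate to the model's maximum input length. For the
    # abstract-level task, `text` would hold the whole abstract
    # rather than a single target sentence.
    return tokenizer(batch["text"], truncation=True)

# `train_ds` / `eval_ds` stand in for tokenized SNPPhenA splits:
# train_ds = raw_train.map(tokenize, batched=True)
# eval_ds  = raw_eval.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="deberta-snpphena",
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```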