A Benchmark of Discovering Drug-Target Interaction from Biomedical Literature

Yutai Hou; Yingce Xia; Lijun Wu; Shufang Xie; Yang Fan; Jinhua Zhu; Wanxiang Che; Tao Qin; Tie-Yan Liu

08 Jun 2021 (modified: 24 May 2023). Submitted to NeurIPS 2021 Datasets and Benchmarks Track (Round 1). Readers: Everyone

Abstract: With millions of biomedical papers published every year, automatic knowledge discovery (KD) from biomedical literature has become an urgent need in industry. Although KD in the biomedical domain has attracted considerable research attention in recent years, the lack of benchmark datasets significantly hinders its progress. In this work, we create a dataset, KD-DTI, for discovering <drug, target, interaction> triplets from literature, one of the most important KD tasks in the biomedical domain. KD-DTI contains 14k unique biomedical papers, each associated with at least one <drug, target, interaction> triplet. We also provide a semi-supervised dataset with 139k unique papers. We present and analyze multiple solutions, including several extractive and generative models and two data enhancement methods. The results show that the performance of these models falls far short of industry demand, indicating that the dataset poses a challenging research problem for the community. The dataset will be freely accessible after the review process.

Supplementary Material: zip
