Development Of A Search Engine For The Medline Database With Search Results Ranking From The Perspective Of Evidence-Based Medicine

12 May 2023, OpenReview Archive Direct Upload, Readers: Everyone
Abstract: Assessing the quality and reliability of medical research is a serious problem. The object of this study is the set of documents containing article abstracts from the MEDLINE database. The aim is to develop a new algorithm for ranking medical research based on evidence levels. Methods: A method was developed for automatically labeling a training set of abstracts by evidence level and by subtype of medical intervention. The class imbalance of the training set was addressed using the Synthetic Minority Over-sampling Technique (SMOTE) and Latent Dirichlet Allocation (LDA). For classification by evidence level, Multinomial Logistic Regression and Linear SVM were used. In addition, ensembles of Random Forest, Gradient Boosting Machine, and nonlinear SVM algorithms were trained so that the methods could be evaluated and the more efficient one selected. At the final stage, the search index was built and a search engine prototype was developed. Results: Training was performed on 2,000,000 abstracts from the MEDLINE database for 2006-2013. A subset of the papers was labeled by level of evidence and by subtype of medical intervention to train the classifier. High classification accuracy was achieved on the available data. For instance, for the “randomized double blind” class, precision was 0.93, recall 0.75, and F-measure 0.82. The developed approach also yielded high classification results for a hard-to-detect class such as “nonrandomized single blind studies”: precision was 0.92, recall 0.75, and F-measure 0.83. It was shown that decomposing the evidence levels improves results, because the training set can be balanced and the best classification algorithm chosen for each subtask separately. Conclusions: The developed search engine is based on a combination of classifiers that determine the evidence level and the subtype of medical intervention for each abstract. Results are sorted in descending order of relevance of an abstract to the query. The implemented search engine is being tested by medical experts.
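The abstract names the Synthetic Minority Over-sampling Technique as part of the balancing step but does not say how it was configured or how it was combined with Latent Dirichlet Allocation. As a rough illustration of the basic SMOTE idea only, the sketch below creates synthetic minority-class points by interpolating between an existing sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the toy 2-D feature vectors are hypothetical and are not taken from the paper.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Return n_synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest neighbours among the other minority samples.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the segment
        # between the sample and the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D feature vectors for a rare evidence-level class.
minority_class = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_class, n_synthetic=6)
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region already covered by its data, which is the property that distinguishes this approach from simply duplicating rare examples.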