RRAIF: Clinical Trial Language Model with LLM Based on Ranking Responses

Sunstella 2023 Summer Research Camp Submission 13

15 Jun 2023 (modified: 22 Jun 2023)
Keywords: clinical trial, large language model, ranking loss, rationale
TL;DR: Through a ranking loss and self-correction, the outputs of a large language model can be fed back directly to fine-tune a local model.
Abstract: As AI systems become more capable, we would like to enlist their help in supervising other AIs. In this study, we propose a methodology for fine-tuning a local language model using feedback from a large language model. Our approach consists of three main modules: Supervised Fine-tuning, RRAIF, and Instruction Tuning with Rationale. These modules form a pipeline that cycles iteratively to fine-tune the local language model. We evaluate our model on the Alpaca dataset to validate its performance.
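The abstract does not specify the exact form of the ranking loss. A minimal sketch of a pairwise ranking loss, as commonly used when learning from responses ranked by a stronger model, is shown below; the function name and the scores are illustrative assumptions, not the authors' implementation:

```python
import math

def pairwise_ranking_loss(score_preferred: float, score_rejected: float) -> float:
    """Negative log-sigmoid of the score margin: the loss is small when
    the preferred response is scored well above the rejected one."""
    margin = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical scores the local model assigns to two responses that the
# large model ranked: a correct ordering yields a smaller loss.
print(pairwise_ranking_loss(2.0, 0.5) < pairwise_ranking_loss(0.5, 2.0))  # True
```

In practice such a loss would be averaged over many ranked response pairs and minimized by gradient descent, pushing the local model's scores to agree with the large model's rankings.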
Submission Number: 13