Abstract: Highlights
• Assess accuracy of ChatGPT to autonomously screen articles in systematic reviews.
• General prompt template that can be parameterized for different systematic reviews.
• Curated datasets of real systematic reviews in software engineering.
• Empirical evaluation of different prompt strategies on the datasets.
• Recommendations for next-generation systematic review tools relying on large language models.