%Motivations and simple description
We also present our PDF annotation tool, built with Streamlit, a Python library for creating web apps. We used it to verify our detection process, and we designed it to meet two needs: letting multiple users annotate the same project easily, and handling a large number of PDF files. Although we used it to annotate the presence of datasets in scientific papers, it can be extended to any PDF annotation task.

%Labeling features
Once the software is installed on a local server, a user can create an annotation project by uploading the PDFs and choosing up to two initial sets of labels. In our study, the first set corresponds to the list of datasets to detect, and the second is the list of locations into which a mention can be classified (e.g., Abstract, Introduction, Method). The second set of labels is fixed, whereas new values can be added to the first at any point during the labelling.
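
To illustrate the distinction between the two label sets, the following is a minimal sketch rather than the tool's actual code. The class `LabelSets`, its method `add_dataset`, and the dataset names are hypothetical and chosen only for the example; the location names are those given above.

```python
from dataclasses import dataclass, field


@dataclass
class LabelSets:
    """Hypothetical container for the two label sets of a project."""

    # First set: can grow while annotating (assumed to be a list of names).
    datasets: list = field(default_factory=list)
    # Second set: stored as a tuple, so it cannot be appended to.
    locations: tuple = ()

    def add_dataset(self, name: str) -> bool:
        """Add a dataset label; return False if it is empty or a duplicate."""
        name = name.strip()
        if not name or name in self.datasets:
            return False
        self.datasets.append(name)
        return True


# Illustrative values only; the dataset names are not taken from our study.
labels = LabelSets(
    datasets=["MNIST", "ImageNet"],
    locations=("Abstract", "Introduction", "Method"),
)
added = labels.add_dataset("CIFAR-10")    # new label, accepted
duplicate = labels.add_dataset("MNIST")   # already present, rejected
```

Storing the fixed set as a tuple is one simple way to make accidental modification fail early; any immutable structure would serve the same purpose.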

%Group features
We also wanted to make annotation by multiple users easier. When creating the project, the owner can upload a file that divides the papers into groups. Users can then find the papers assigned to them by selecting the corresponding group on the annotation page.
Finally, when the annotations are downloaded from the server, one file per annotator is obtained, which allows further data processing afterwards.
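
The group assignment and the per-annotator output can be sketched as follows. This is only an illustration under stated assumptions: the group file is assumed to be a CSV with `paper` and `group` columns, and each annotator's file is assumed to map a paper to a list of labels. The function names `load_groups` and `merge_annotations`, the file names, and the annotator names are hypothetical and do not describe the tool's actual formats.

```python
import csv
import io
from collections import defaultdict


def load_groups(csv_text: str) -> dict:
    """Map each group name to its papers (assumed columns: paper, group)."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["group"]].append(row["paper"])
    return dict(groups)


def merge_annotations(per_annotator: dict) -> dict:
    """Turn {annotator: {paper: labels}} into {paper: {annotator: labels}}."""
    merged = defaultdict(dict)
    for annotator, annotations in per_annotator.items():
        for paper, labels in annotations.items():
            merged[paper][annotator] = labels
    return dict(merged)


# Hypothetical group file uploaded by the project owner.
group_file = "paper,group\na.pdf,group1\nb.pdf,group1\nc.pdf,group2\n"
groups = load_groups(group_file)

# Hypothetical contents of two per-annotator files.
downloads = {
    "alice": {"a.pdf": ["MNIST"], "b.pdf": []},
    "bob": {"a.pdf": ["MNIST", "ImageNet"]},
}
merged = merge_annotations(downloads)
```

Regrouping the per-annotator files by paper, as `merge_annotations` does, is one example of the further processing that this output format allows, for instance when comparing the labels given by different annotators to the same paper.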