## Dataset information

The questions for the text similarity experiment in Section 4.2 were taken from the [LMSYS-Chat-1M dataset](https://huggingface.co/datasets/lmsys/lmsys-chat-1m).
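
For readers who want to pull comparable questions themselves, the following is a minimal sketch rather than the exact procedure used for the experiment. It assumes that a "question" is the first user turn of each conversation and that each record stores its turns under a `conversation` field as a list of `{"role", "content"}` dictionaries, as described on the dataset card. The helper names (`first_user_question`, `iter_questions`, `load_lmsys_questions`) and the `limit` parameter are hypothetical. The dataset is gated, so loading it requires accepting its terms on Hugging Face and authenticating first.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Optional


def first_user_question(conversation: List[Dict]) -> Optional[str]:
    """Return the text of the first user turn, or None if there is none."""
    for turn in conversation:
        if turn.get("role") == "user":
            return turn.get("content")
    return None


def iter_questions(records: Iterable[Dict]) -> Iterator[str]:
    """Yield the first user turn of every record that has a non-empty one."""
    for record in records:
        question = first_user_question(record.get("conversation", []))
        if question:
            yield question


def load_lmsys_questions(limit: int = 1000) -> List[str]:
    """Stream the dataset and collect up to `limit` questions.

    Assumption: the `datasets` package is installed and the caller is
    logged in to Hugging Face with access to the gated dataset.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    stream = load_dataset("lmsys/lmsys-chat-1m", split="train", streaming=True)
    return list(islice(iter_questions(stream), limit))
```

Streaming avoids downloading the full dataset when only a sample is needed. The extraction helpers work on any iterable of records, so they can be tried on a handful of hand-written examples before requesting access.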