LLM data curation, corpus development, fine-tuning datasets, instruction data, data augmentation
2023 – Present
LLM evaluation, benchmarking, multilingual assessment
2023 – Present
natural language processing, computational linguistics
2015 – Present
anonymization, personal health information, sensitive data detection
2018 – 2023
medical text processing, clinical NLP
2016 – 2023
hate speech, sentiment analysis, topic classification
2015 – 2018