DistillSeq: A Framework for Safety Alignment Testing in Large Language Models using Knowledge Distillation

Mingke Yang, Yuqi Chen, Yi Liu, Ling Shi

Published: 11 Sept 2024, Last Modified: 06 Jan 2026, Crossref, License: CC BY-SA 4.0