Note that the data is a subsample of the full HealthBench benchmark.
Even so, you are not allowed to use any of the other HealthBench questions (or answers) for training!