This work was completed during my internship at the company. Due to intellectual property (IP) restrictions, we can currently provide only a random sample of 25K data points from the 2.1 million we constructed. We will release the full dataset and models after the review process.

Each data point is stored in JSON format with the following fields:
- `"KC1"`: key concept 1  
- `"KC2"`: key concept 2  
- `"question"`: a question related to KC1 and KC2  
- `"solution"`: a detailed solution to the question
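
As a rough illustration of how the fields above might be read, here is a minimal Python sketch. It assumes the sample is stored either as a single JSON array or as JSON Lines (one object per line); the README does not say which, so the parser tries both. The helper name `parse_records` and the record contents are hypothetical and are not taken from the dataset.

```python
import json

# Field names as listed in this README.
REQUIRED_FIELDS = ("KC1", "KC2", "question", "solution")


def parse_records(text):
    """Parse JSON-array or JSON-Lines text into a list of dicts.

    Hypothetical helper: the on-disk layout of the released sample is an
    assumption, so both common layouts are handled here.
    """
    text = text.strip()
    if text.startswith("["):
        # Whole file is one JSON array of records.
        records = json.loads(text)
    else:
        # One JSON object per non-empty line.
        records = [json.loads(line) for line in text.splitlines() if line.strip()]

    # Check that every record carries the four documented fields.
    for i, rec in enumerate(records):
        missing = [f for f in REQUIRED_FIELDS if f not in rec]
        if missing:
            raise ValueError(f"record {i} is missing fields: {missing}")
    return records


# Made-up record for illustration only; not an actual data point.
sample = json.dumps([{
    "KC1": "linear equations",
    "KC2": "ratios",
    "question": "...",
    "solution": "...",
}])

records = parse_records(sample)
print(records[0]["KC1"], "|", records[0]["KC2"])
```

To use this with the released sample, read the file contents into a string (for example with `open(path, encoding="utf-8").read()`, where `path` is wherever you saved the file) and pass it to `parse_records`. A single pretty-printed object spanning several lines is not handled by this sketch.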

