**Article 10**

**Data Governance and Management Practices**

The Adaptive Learning Outcome Analyzer (ALOA) was developed under a data governance framework tailored to its intended purpose, educational assessment. Design choices prioritized multimodal data integration, combining structured numerical assessment results with unstructured textual responses from students to enable richer outcome analysis (Article 10(2)(a)). Data were collected from anonymized educational records supplied by partner academic institutions, with explicit informed consent obtained for their use in system development. All personal data processed were originally collected solely for educational purposes, and their use here is consistent with that original purpose and with the GDPR (Article 10(2)(b)).

Data preparation entailed rigorous annotation and cleaning. Textual data were annotated by expert human annotators working alongside semi-automated natural language processing pipelines to label relevant learning concepts, misconceptions, and engagement indicators. Numerical data were verified for consistency and completeness, and missing values were addressed using imputation techniques validated against domain standards. Data sets were enriched by augmenting assessment results with metadata such as subject, grade level, and contextual learning environment details (Article 10(2)(c)). The assumptions underlying the data sets were documented explicitly, chiefly that assessment scores and text responses reliably reflect learner knowledge and cognitive skills, an assumption supported by pedagogical research (Article 10(2)(d)).
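
The imputation step mentioned above is not specified further in this documentation. As a minimal sketch only, assuming a simple group-median strategy, it could look like the following. The function name `impute_group_median`, the field names, and the `imputed` flag are hypothetical and are not taken from the ALOA pipeline.

```python
from collections import defaultdict
from statistics import median


def impute_group_median(records, value_field, group_fields):
    """Return copies of `records` with missing values filled by their group's median.

    Hypothetical helper: None marks a missing entry, and every output record
    carries an `imputed` flag so the change stays auditable.
    """
    observed = defaultdict(list)
    for rec in records:
        if rec[value_field] is not None:
            group = tuple(rec[f] for f in group_fields)
            observed[group].append(rec[value_field])

    result = []
    for rec in records:
        filled = dict(rec)  # copy, so the input records are left untouched
        group = tuple(rec[f] for f in group_fields)
        can_impute = filled[value_field] is None and bool(observed.get(group))
        if can_impute:
            filled[value_field] = median(observed[group])
        filled["imputed"] = can_impute
        result.append(filled)
    return result


# Illustrative records only; not taken from the ALOA data sets.
sample = [
    {"subject": "math", "grade": 5, "score": 70},
    {"subject": "math", "grade": 5, "score": 80},
    {"subject": "math", "grade": 5, "score": None},
    {"subject": "history", "grade": 5, "score": None},
]
cleaned = impute_group_median(sample, "score", ("subject", "grade"))
```

In this sketch a record whose group has no observed values is left missing rather than filled from unrelated data, which keeps the residual gap visible for documentation.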

Availability and quantity were assessed quantitatively: training data incorporated records from over 150,000 anonymized assessments spanning a diverse demographic across multiple EU regions. Validation and testing sets contained 30,000 and 25,000 assessment records respectively, stratified by learner age, socio-economic background, and subject area to ensure statistical sufficiency and representativeness (Article 10(2)(e)).
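
For illustration, a stratified partition of the kind described above can be sketched as follows. This is a simplified example under stated assumptions, not the procedure actually used: the function `stratified_split`, the record fields, the split fractions, and the synthetic records are all introduced here for the example.

```python
import random
from collections import defaultdict


def stratified_split(records, stratum_of, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split records into train/validation/test, preserving proportions per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)

    train, validation, test = [], [], []
    for members in strata.values():
        rng.shuffle(members)
        n_train = round(len(members) * fractions[0])
        n_val = round(len(members) * fractions[1])
        train.extend(members[:n_train])
        validation.extend(members[n_train:n_train + n_val])
        test.extend(members[n_train + n_val:])  # remainder goes to the test set
    return train, validation, test


# Synthetic records: 4 strata (2 age bands x 2 subjects) of 100 records each.
records = [
    {"id": i, "age_band": age, "subject": subject}
    for i, (age, subject) in enumerate(
        (a, s) for a in ("6-11", "12-17") for s in ("math", "language") for _ in range(100)
    )
]
train, validation, test = stratified_split(
    records, stratum_of=lambda r: (r["age_band"], r["subject"])
)
```

Splitting inside each stratum, rather than across the pooled data, is what keeps small groups from being left out of the validation or testing sets by chance.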

An extensive bias evaluation was conducted to identify potential sources of discriminatory effects or negative impacts on fundamental rights. This included algorithmic fairness audits focusing on sensitive attributes such as socio-economic status, regional language variations, and learning disabilities indirectly inferred from assessment patterns. Tools such as disparate impact analysis and adversarial testing were employed to detect imbalances affecting specific groups, particularly to avoid disadvantaging learners from minority backgrounds or with special educational needs (Article 10(2)(f)). Mitigation actions involved retraining and fine-tuning models with reweighted samples, as well as synthetic data augmentation targeting underrepresented cohorts. Furthermore, a feedback loop with educational stakeholders refined the approach to ensure continuous bias detection and correction (Article 10(2)(g)).
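
Two of the ideas above, a disparate impact ratio and sample reweighting, can be sketched in a few lines. The example is illustrative only and rests on assumptions: the function names, the group labels, and the choice of inverse-frequency weights are hypothetical, and the documentation does not state which reweighting scheme was actually applied. An outcome of 1 stands for a favourable system output.

```python
from collections import Counter


def selection_rates(groups, outcomes):
    """Share of favourable outcomes (1) per group."""
    totals, positives = Counter(), Counter()
    for g, y in zip(groups, outcomes):
        totals[g] += 1
        positives[g] += y
    return {g: positives[g] / totals[g] for g in totals}


def disparate_impact_ratio(groups, outcomes, protected, reference):
    """Favourable-outcome rate of the protected group divided by the reference group's.

    Assumes the reference group has at least one favourable outcome.
    """
    rates = selection_rates(groups, outcomes)
    return rates[protected] / rates[reference]


def inverse_frequency_weights(groups):
    """Per-sample weights such that every group carries the same total weight."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]


# Synthetic example: 8 majority-group learners, 2 minority-group learners.
groups = ["majority"] * 8 + ["minority"] * 2
outcomes = [1, 1, 1, 1, 1, 1, 0, 0, 1, 0]
ratio = disparate_impact_ratio(groups, outcomes, protected="minority", reference="majority")
weights = inverse_frequency_weights(groups)
```

A ratio well below 1 indicates that the protected group receives favourable outcomes less often than the reference group and merits review. The weights can then be passed to a training procedure that accepts per-sample weights.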

Where data gaps or shortcomings were identified—such as limited data availability in certain regional dialects or underrepresented subject areas—data augmentation strategies and targeted data collection campaigns were implemented. This included collaboration with additional educational institutions and linguistic experts to fill these gaps, accompanied by documentation of residual limitations and planned future enhancements (Article 10(2)(h)).

**Relevance, Representativeness, and Statistical Properties of Data Sets**

Training, validation, and testing data sets were curated to be relevant and sufficiently representative of the learner populations and educational settings intended for system deployment. Measures ensured coverage across a wide range of curricula, age groups (6–22 years), and socio-cultural backgrounds, representing more than 20 distinct EU languages and dialects. Data quality assurance included cross-checking by independent domain experts to minimize errors and incompleteness, with automated consistency checks and spot audits detecting an error rate below 1.5% (Article 10(3)).
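
As an illustration of how an automated consistency check can yield an error rate comparable to the figure above, consider the following sketch. The rules (required fields, a 0 to 100 score range), the field names, and both function names are assumptions made for this example and do not describe the checks actually run.

```python
def find_inconsistencies(
    record,
    score_range=(0, 100),
    required=("learner_id", "subject", "grade", "score"),
):
    """Return a list of rule violations for one assessment record."""
    problems = []
    for field in required:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")
    score = record.get("score")
    if isinstance(score, (int, float)) and not score_range[0] <= score <= score_range[1]:
        problems.append("score out of range")
    return problems


def error_rate(records):
    """Fraction of records with at least one violation."""
    if not records:
        return 0.0
    flagged = sum(1 for r in records if find_inconsistencies(r))
    return flagged / len(records)


# Synthetic batch: 198 clean records plus two deliberately faulty ones.
batch = [{"learner_id": f"p{i}", "subject": "math", "grade": 5, "score": 60} for i in range(198)]
batch.append({"learner_id": "p198", "subject": "math", "grade": 5, "score": 140})  # out of range
batch.append({"learner_id": "p199", "subject": "", "grade": 5, "score": 55})  # missing subject
rate = error_rate(batch)
within_threshold = rate < 0.015  # compare against the 1.5% figure cited above
```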

Datasets reflected appropriate statistical properties, such as balanced age and gender distributions, and aligned with pedagogical norms regarding expected performance distributions per grade. These properties were verified using standard statistical tests (e.g., Chi-square tests for independence and Kolmogorov-Smirnov tests for distributional similarity) to confirm that the data distributions matched the intended populations. Where a single source was insufficient, multiple sources were combined into composite datasets so that completeness and representativeness were achieved at the level of the combination (Article 10(3)).
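
The two test statistics named above can be computed as in the following sketch, written in plain Python so that the arithmetic is visible. It returns the statistics only, and the function names and sample values are introduced here for illustration. In practice a statistics library would normally be used to obtain p-values as well, for example `scipy.stats.chi2_contingency` and `scipy.stats.ks_2samp`.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, e.g. rows = data source,
    columns = demographic category. Assumes no row or column sums to zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def ks_two_sample_statistic(a, b):
    """Largest gap between the empirical CDFs of two non-empty samples."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Synthetic inputs for illustration.
chi2_stat, dof = chi_square_independence([[10, 20], [20, 10]])
ks_stat = ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

A chi-square statistic near zero suggests that category membership does not depend on the data source, and a small Kolmogorov-Smirnov statistic suggests that two score distributions are similar.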

**Contextual and Geographical Specificity**

Data sets incorporate relevant geographical and contextual characteristics essential to accurate learning outcome analysis. Given the diversity of EU educational contexts, curricula variability and regional linguistic features are embedded within the data representation through standardized taxonomies and metadata tagging. Behavioral and functional settings such as in-person versus remote learning environments during data collection periods were also captured and modeled to reflect realistic instructional scenarios and inform system outputs accordingly (Article 10(4)).

**Processing of Special Categories of Personal Data for Bias Mitigation**

Although the system architecture and data management procedures were designed to avoid reliance on special categories of personal data (such as health status or racial origin), certain limited processing was conducted under strictly controlled conditions to detect and mitigate biases. For instance, anonymized and pseudonymized records indicating disabilities were processed solely to ensure equitable model performance. This processing adhered to strict safeguards: data access was restricted to authorized personnel under confidentiality agreements; data minimization principles were enforced; and processing incorporated state-of-the-art pseudonymisation and encryption technologies. All such processing was documented in detail and limited to the minimum scope necessary for bias correction. Upon completion of bias mitigation cycles, relevant records were promptly deleted according to retention policies (Article 10(5)(a–e)).

No special categories of personal data were transmitted or transferred beyond internal development environments, eliminating exposure risks. Access logs and audit trails were maintained to enforce accountability and traceability.
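
The pseudonymisation technology used is not specified in this documentation. One common approach is keyed hashing, shown below as a minimal sketch under that assumption. The function name `pseudonymise`, the identifier format, and the placeholder key are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(learner_id: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(secret_key, learner_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder key for illustration; a real key would be held in a secrets store
# and be accessible only to authorized personnel.
key = b"example-key-not-for-production"
token = pseudonymise("learner-0001", key)
```

The same identifier always maps to the same pseudonym under a given key, so records can still be linked for bias analysis, while the pseudonym cannot be recomputed from the identifier without the key. Destroying the key once bias mitigation is complete removes that linkage.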

**Applicability of Data Governance to Non-Training Testing Sets**

Because ALOA relies exclusively on trained transformer models, the limitation in Article 10(6), under which paragraphs 2 to 5 apply only to testing data sets for systems developed without model training, is not applicable. The data governance practices described above therefore apply to the training, validation, and testing data sets alike, ensuring quality and compliance across the entire model development lifecycle (Article 10(6)).