**Article 15**

**Accuracy and Performance Characteristics Throughout the Lifecycle**  
Insight Proctor Analytics employs transformer-based Vision Language Models (VLMs) trained on approximately 1.2 million annotated video frames from simulated and real exam sessions, collected under varying lighting conditions and camera angles. Initial validation benchmarks show a gesture detection accuracy of 92% under controlled conditions for short exam sessions (up to 30 minutes). The evaluation used standard metrics, including precision, recall, and F1-score, averaged over multiple gesture categories relevant to suspicious behavior detection. Established public action-recognition datasets were also used to calibrate baseline performance. However, longitudinal internal studies over extended exam sessions (up to 3 hours) show that detection accuracy can decline gradually by roughly 10 percentage points, a degradation attributed primarily to environmental factors such as lighting changes and camera repositioning. This decline is documented in the performance reports included in the technical dossier. No automated recalibration or alerting mechanisms are currently integrated into the system to address this drift during deployment.
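
As an illustration of how precision, recall, and F1-score can be averaged over gesture categories, the following is a minimal sketch that assumes unweighted (macro) averaging. The source does not state which averaging scheme was used, and the function name and gesture labels here are hypothetical.

```python
from collections import Counter


def macro_precision_recall_f1(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed labels.

    Assumption: every category counts equally (macro averaging).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one sample is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category received a false positive
            fn[truth] += 1  # true category was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical gesture labels, for illustration only.
truth = ["look_away", "look_away", "phone", "none"]
preds = ["look_away", "phone", "phone", "none"]
precision, recall, f1 = macro_precision_recall_f1(truth, preds)
```

Macro averaging gives rare gesture categories the same weight as common ones. A frequency-weighted average would be an equally plausible reading of the reported figures.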

**Measurement Methodologies and Benchmarking Practices**  
To quantify accuracy and robustness, Meridian participated in collaborative benchmarking initiatives facilitated by European metrology and AI benchmarking bodies during the system’s development phase. These initiatives promoted scenario-based testing with synthetic datasets that emulate real-world proctoring conditions, evaluating both the robustness of gesture recognition and its resilience to environmental variability. The system’s assessment methodology incorporates frame-level confidence analysis and temporal smoothing to mitigate transient misclassifications. Nonetheless, lifecycle monitoring for model drift relies on periodic offline performance reviews rather than continuous automated assessment. The Commission’s guidelines encourage evolving benchmarks; the current version of Insight Proctor Analytics aligns with established industrial standards as of 2024 but does not incorporate lifecycle drift detection or correction capabilities beyond scheduled retraining cycles.
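
The temporal smoothing of frame-level confidences mentioned above can be sketched as a trailing moving average followed by a threshold. This is one common approach and not necessarily the one the system uses. The window size, threshold, and function names are illustrative assumptions.

```python
from collections import deque


def smooth_confidences(confidences, window=5):
    """Trailing moving average over per-frame confidence scores."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds at most `window` latest scores
    smoothed = []
    for score in confidences:
        recent.append(score)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


def flag_frames(confidences, window=5, threshold=0.6):
    """Flag a frame only when the smoothed confidence reaches the threshold."""
    return [s >= threshold for s in smooth_confidences(confidences, window)]


# A single-frame spike is suppressed; a sustained detection is kept.
spike = flag_frames([0.1, 0.1, 0.9, 0.1, 0.1], window=3)
sustained = flag_frames([0.1, 0.9, 0.9, 0.9, 0.1], window=3)
```

A larger window suppresses more transient misclassifications, at the cost of reacting more slowly to genuine gestures.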

**Declared Accuracy Metrics and Instructions for Use**  
The instructions for use explicitly state that Insight Proctor Analytics achieves an initial average detection accuracy of 92% for suspicious gestures under standard test conditions, and that performance may vary in environments with fluctuating lighting and camera positioning. Users are informed that detection robustness may decline over longer exam durations due to environmental variability and model drift. The documentation advises system operators to implement manual session monitoring and recommends reinitialization or recalibration between exam sessions to maintain optimal accuracy. These declarations inform deploying entities about the system’s expected performance boundaries without guaranteeing consistent performance over extended use in dynamic operational conditions.
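
As a hedged sketch of how an operator might quantify the duration-related decline described above during an offline review, the following groups manually verified detections by elapsed session time and lists the time buckets whose accuracy falls below the opening bucket. This is not a feature of the system. The record format, the 30-minute bucket size, and the tolerance are all assumptions.

```python
def accuracy_by_bucket(records, bucket_minutes=30):
    """Group (minute_offset, is_correct) records into time buckets.

    Returns {bucket_start_minute: accuracy}, ordered by bucket start.
    """
    totals, correct = {}, {}
    for minute, is_correct in records:
        start = int(minute // bucket_minutes) * bucket_minutes
        totals[start] = totals.get(start, 0) + 1
        correct[start] = correct.get(start, 0) + (1 if is_correct else 0)
    return {start: correct[start] / totals[start] for start in sorted(totals)}


def buckets_below_baseline(accuracies, tolerance=0.05):
    """Return bucket starts whose accuracy is more than `tolerance`
    below the accuracy of the earliest bucket."""
    if not accuracies:
        return []
    starts = sorted(accuracies)
    baseline = accuracies[starts[0]]
    return [s for s in starts[1:] if baseline - accuracies[s] > tolerance]


# Hypothetical reviewed detections: (minutes into session, was it correct?)
reviewed = [(5, True), (10, True), (20, True), (25, True),
            (95, True), (100, False), (110, True), (115, True)]
per_bucket = accuracy_by_bucket(reviewed)
degraded = buckets_below_baseline(per_bucket)
```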

**Resilience and Robustness Measures**  
The system architecture incorporates multiple technical measures intended to enhance robustness against internal faults and environmental variability. Preprocessing pipelines include adaptive histogram equalization and geometric normalization to reduce the impact of lighting and camera angle changes on input video frames. The VLM leverages attention mechanisms across visual and textual modalities to support context-aware recognition, improving resistance to false positives from semantically irrelevant motions. Despite these mitigations, the system currently lacks built-in mechanisms for automatic recalibration or real-time robustness monitoring during deployment, which limits its resilience to gradual accuracy decay. No redundancy solutions, such as fail-safe fallback models or backup monitoring modules, are embedded. Additionally, Insight Proctor Analytics does not perform continuous learning once deployed. This minimizes the risk of bias amplification or feedback loops, but it also precludes automatic adaptation to evolving environmental factors or changes in user behavior during active operation.
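
To illustrate the idea behind adaptive histogram equalization, the sketch below equalizes each tile of a grayscale frame independently, so that contrast is stretched according to local rather than global brightness. It is a deliberately simplified, dependency-free variant: production pipelines typically add contrast limiting and interpolation between tiles, and the source does not specify which variant is used. The tile size and function names are assumptions.

```python
def equalize_tile(pixels, levels=256):
    """Histogram-equalize a flat list of integer gray values in [0, levels - 1]."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    # Cumulative distribution of gray values within this tile.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:  # flat tile: a single gray value, nothing to stretch
        return list(pixels)
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[value] - cdf_min) * scale) for value in pixels]


def equalize_adaptive(image, tile=8):
    """Equalize each tile x tile block of a 2D list-of-lists image on its own."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            coords = [(r, c)
                      for r in range(top, min(top + tile, height))
                      for c in range(left, min(left + tile, width))]
            values = equalize_tile([image[r][c] for r, c in coords])
            for (r, c), value in zip(coords, values):
                out[r][c] = value
    return out


# A dim, low-contrast 2x2 patch is stretched across the full gray range.
stretched = equalize_adaptive([[100, 101], [102, 103]], tile=2)
```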

**Cybersecurity and Protection Against Manipulation**  
Insight Proctor Analytics is engineered with a cybersecurity framework that addresses the threats relevant to high-risk AI systems deployed in networked proctoring environments. It includes secure data encryption protocols for video inputs and communications, role-based access control, and tamper-evident logging to monitor access and system changes. Protections against adversarial inputs rely on preprocessing filters designed to detect and reject frames with anomalous noise patterns or image perturbations that could trigger misclassification. The system’s training pipeline incorporates adversarial training with a curated set of perturbation techniques to improve model resilience against evasion attacks. Nonetheless, post-deployment mechanisms to detect sophisticated data poisoning or model poisoning attacks are not implemented. This reflects current industrial practice for such AI systems but is acknowledged as a potential area for future enhancement. No automated mechanisms are present to identify or respond to attempts to degrade model performance through manipulated inputs during live usage.
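
Tamper-evident logging is commonly built as a hash chain, in which each entry stores a hash covering its own content and the previous entry's hash, so that altering any earlier entry invalidates every later one. The following is a minimal sketch of that general technique. The source does not describe the system's actual log format, and the entry fields and function names here are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(event, prev_hash):
    """SHA-256 over a canonical JSON encoding of the event and previous hash."""
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event to the log, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev": prev_hash,
                "hash": _entry_hash(event, prev_hash)})


def verify_chain(log):
    """Return True only if every entry still matches its recorded hash chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical access events, for illustration only.
audit_log = []
append_entry(audit_log, {"user": "operator_1", "action": "login"})
append_entry(audit_log, {"user": "operator_1", "action": "update_config"})
intact = verify_chain(audit_log)
```

A hash chain only makes tampering detectable. Preventing an attacker from rewriting the whole chain would additionally require anchoring the latest hash somewhere the attacker cannot modify, for example by signing it or storing it externally.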