**Article 14**

**Design and Development of Human-Machine Interface for Oversight**

Insight Proctor Analytics is architected with a streamlined human-machine interface that primarily presents detection outcomes—namely, flags of suspected exam misconduct—to proctors through a centralized dashboard. The interface is optimized for real-time alerts and a concise overview of identified incidents, employing a binary flagging mechanism (flagged/not flagged) without ancillary indicators of confidence levels or uncertainty metrics. No explicit visual markers or probabilistic scores accompany the alerts. The design rationale prioritized operational simplicity and minimizing additional cognitive load on proctors during live exam monitoring sessions, based on internal user feedback collected from pilot deployments involving approximately 65 academic institutions between 2022 and 2024. The system does not incorporate embedded prompts or warnings to inform proctors about potential AI errors such as false positives (incorrectly flagged behavior) or false negatives (missed misconduct). As a result, proctors receive the AI-generated flags as definitive indicators of irregular behavior without visual cues on the underlying model’s reliability or limitations.

**Human Oversight Objectives and Risk Minimization**

The system is intended to assist proctors in identifying potentially dishonest behaviors that may undermine exam integrity. To this end, Insight Proctor Analytics continuously scans video streams and test metadata using transformer-based Vision Language Models (VLMs) trained on a combined annotated dataset of over 120,000 labeled exam video segments and exam-context pairs. On internal benchmark datasets mimicking realistic exam scenarios, these models achieved a precision of 85% and a recall of 78%. While the provider developed the model and conducted thorough performance assessments, the system does not explicitly integrate mechanisms to mitigate the risks associated with over-reliance on the AI output or to contextualize outputs with uncertainty quantification. Therefore, the deployment presumes that proctors will exercise their own judgment unassisted by AI-generated confidence indicators or interactive explanations.
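To make the reported validation figures concrete, the following minimal sketch converts a precision of 85% and a recall of 78% into expected error counts. It is illustrative arithmetic only: the function names and the round counts of 100 are hypothetical, and the sketch assumes the internal benchmark figures carry over to live deployments, which this documentation does not establish.

```python
# Illustrative arithmetic only. The 85% / 78% figures are the reported
# internal validation results; the counts of 100 are hypothetical.
PRECISION = 0.85  # share of flags that correspond to actual misconduct
RECALL = 0.78     # share of actual misconduct incidents that get flagged


def expected_false_positives(num_flags: int, precision: float) -> float:
    """Expected number of flags that do not correspond to misconduct."""
    return num_flags * (1.0 - precision)


def expected_missed_incidents(num_incidents: int, recall: float) -> float:
    """Expected number of actual incidents that receive no flag."""
    return num_incidents * (1.0 - recall)


if __name__ == "__main__":
    # Out of 100 flags shown to a proctor, about 15 would be false positives.
    print(round(expected_false_positives(100, PRECISION), 1))
    # Out of 100 actual incidents, about 22 would go unflagged.
    print(round(expected_missed_incidents(100, RECALL), 1))
```

Under these assumptions, roughly one flag in seven would point at legitimate behavior, which is the kind of context a binary flag without a confidence indicator does not convey to the proctor.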

**Proportionate Oversight Measures Integrated in the AI System**

As a high-risk AI system, Insight Proctor Analytics includes core components (transformer-based VLMs, real-time video ingestion pipelines, and semantic contextual analyzers) structured for robustness in handling various lighting conditions, camera angles, and test formats. Pre-market evaluations included adversarial robustness testing against synthetic image perturbations and simulated occlusions, in which anomaly detection rates degraded by no more than 5%. However, compliance-driven oversight measures such as embedded uncertainty indicators, graduated alert levels, or explanatory model outputs that could facilitate nuanced human interpretation were not incorporated prior to market release. Technical feasibility assessments did not find insurmountable obstacles to integrating such features, but design decisions favored minimizing interface complexity over enhanced interpretability. Documentation provided to deployers includes instructions for basic system status monitoring but does not specify or mandate additional human-in-the-loop validation processes or alert management tactics.
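For readers unfamiliar with the term, the sketch below illustrates what a graduated alert level could look like in principle. It is not part of Insight Proctor Analytics, which exposes only a binary flag. The confidence score input, the `AlertLevel` names, and the threshold values are all hypothetical and chosen purely for illustration.

```python
from enum import Enum


class AlertLevel(Enum):
    """Hypothetical alert tiers; the actual system has only flagged/not flagged."""
    NONE = "none"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical thresholds, chosen for illustration only.
LOW_THRESHOLD = 0.50
MEDIUM_THRESHOLD = 0.70
HIGH_THRESHOLD = 0.90


def graduated_alert(score: float) -> AlertLevel:
    """Map an assumed model confidence score in [0, 1] to an alert tier."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in the interval [0, 1]")
    if score >= HIGH_THRESHOLD:
        return AlertLevel.HIGH
    if score >= MEDIUM_THRESHOLD:
        return AlertLevel.MEDIUM
    if score >= LOW_THRESHOLD:
        return AlertLevel.LOW
    return AlertLevel.NONE


if __name__ == "__main__":
    for example_score in (0.95, 0.75, 0.55, 0.20):
        print(example_score, graduated_alert(example_score).value)
```

A tiered output of this kind would let a proctor prioritize review of high-tier alerts and treat low-tier alerts as prompts for closer observation, rather than receiving every flag as equally definitive.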

**Provider-Recommended Human Oversight Practices and Capabilities for Deployer Implementation**

The system is delivered with user manuals and limited onboarding materials describing the AI's general function and intended behaviors to deployers. These materials delineate how to operate the dashboard, initiate system start/stop commands, and perform manual overrides by disregarding AI flags at proctor discretion. While a physical stop button to suspend system analysis is implemented, it is primarily intended for technical interruption rather than user-initiated oversight intervention based on AI output assessment. The documentation advises proctors to rely on their own vigilance but does not elaborate on strategies for guarding against automation bias or unintended overreliance. No training modules or in-dashboard prompts are provided to raise awareness of AI limitations, error modes, or the probabilistic character of outputs. The system does not generate post-event interpretive reports that could aid proctors in reviewing flagged incidents with uncertainty context.

Furthermore, explanations supporting correct interpretation of flagged behaviors are limited to static textual descriptions of detection rules, lacking dynamic interpretability tools that might illuminate the rationale behind specific AI decisions. Records of processing activities comply with applicable EU regulations and document the necessity of data use for misconduct detection purposes; however, these logs do not include annotations relevant to oversight facilitation or error analysis. Consequently, the human oversight assigned to proctors depends principally on their unaided capacity to monitor AI alerts and integrate observational judgment without engineered safeguards or explicit risk communication embedded within the system.