\section*{Responsible AI Statement}

\textbf{Use case and scope.} This work studies active evaluation and acquisition policies on \emph{synthetic} relational data. The system is a research artifact and is not deployed in safety-critical settings.

\textbf{Data, privacy, and consent.} We use procedurally generated data (see \texttt{data/metadata.json}); no personal or sensitive information is included. There are no subject-level consent considerations.

\textbf{Fairness and bias.} Because the dataset is synthetic, group fairness constructs are not applicable. Nonetheless, we report aggregate metrics (AUROC, AUPRC, Hit@10) and recommend reporting confidence intervals across seeds to characterize performance variability.
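
As a sketch of one common way to form such an interval, assume $S$ independent seeds and approximately normally distributed per-seed metric values $m_1, \dots, m_S$ (both are assumptions made here for illustration, not reported settings). A two-sided 95\% interval for the mean is then
\[
  \bar{m} \;\pm\; t_{0.975,\,S-1}\,\frac{s}{\sqrt{S}},
  \qquad
  \bar{m} = \frac{1}{S}\sum_{i=1}^{S} m_i,
  \qquad
  s^2 = \frac{1}{S-1}\sum_{i=1}^{S} \bigl(m_i - \bar{m}\bigr)^2,
\]
where $t_{0.975,\,S-1}$ is the 0.975 quantile of Student's $t$ distribution with $S-1$ degrees of freedom. With few seeds or clearly non-normal metrics, a bootstrap interval over seeds may be a more suitable choice.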

\textbf{Safety, misuse, and dual use.} The methods could be repurposed to prioritize data collection. Potential risks include over-confidence when uncertainty is miscalibrated. We partially mitigate this by (i) calibrating acquisition via the standard-deviation scale and a sweep over $\lambda$, (ii) including reliability diagrams and expected calibration error (ECE) to assess calibration, and (iii) documenting limitations. We do \emph{not} release any models or code intended for biomedical or high-risk domains.
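
For reference, ECE is commonly computed in the following binned form. The number of bins $M$ and the equal-width binning of predicted probabilities are assumptions made here for illustration, not settings taken from the experiments:
\[
  \mathrm{ECE} \;=\; \sum_{m=1}^{M} \frac{|B_m|}{n}\,
  \bigl|\,\mathrm{acc}(B_m) - \mathrm{conf}(B_m)\,\bigr|,
\]
where $n$ is the number of evaluated predictions, $B_m$ is the set of predictions whose confidence falls in the $m$-th bin, $\mathrm{acc}(B_m)$ is the empirical fraction of correct (or positive) outcomes in that bin, and $\mathrm{conf}(B_m)$ is the mean predicted confidence in that bin. A perfectly calibrated model has $\mathrm{ECE} = 0$, and a reliability diagram plots $\mathrm{acc}(B_m)$ against $\mathrm{conf}(B_m)$ for each bin.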

\textbf{Reproducibility and transparency.} We provide pinned dependencies, a one-command \texttt{make reproduce} workflow, and deterministic hashing to ensure identical results across runs with the same seed. We also document the exact commands in the README.

\textbf{Environmental impact.} Experiments run on CPU in minutes; estimated energy usage is negligible relative to typical ML training. For larger studies, we recommend tracking energy via external tooling and preferring seed sweeps over repeated large retrains.

\textbf{Alignment with the NeurIPS Code of Ethics.} Our study respects privacy (no real user data), prioritizes safety (non-deployment, risk analysis), encourages openness (released code and data), and acknowledges limitations and failure modes.
