"decision": "Rewrite paper",
"reasoning": "A comprehensive review of all project files, including the full suite of experimental results (H1-H6), the paper draft, and detailed mentor feedback, confirms that the research and data collection phase is complete. The current draft is empirically sound but has been identified as needing a 'MAJOR REVISION' because of critical weaknesses in its literature review, its statistical rigor, and the scope of its claims. Subsequent external research has supplied the resources needed to close these gaps: a curated list of 40+ relevant papers for the literature review and a clear methodology for the required statistical procedures (the DeLong test for AUROC comparisons and Wilson score intervals for FNR). The most impactful next step is therefore to carry out this major rewrite, turning the current draft and its experimental results into a publishable manuscript that meets the standards of a top-tier venue. Other options, such as further experimentation or starting a new idea, are inappropriate because the current research arc is complete and sound.",
"plan": "### Research Plan: Paper Rewrite and Finalization (v3)\n\nGoal: Perform a major revision of the paper consistency_confound_paper.tex to address mentor feedback, with a focus on strengthening the literature review, implementing statistical rigor, and refining the narrative.\n\n--- \n\nPriority 1: Foundational Strengthening - Literature & Statistics\n\n* Task 1: Expand Literature Review\n * Description: Rewrite the Introduction and Related Work sections to incorporate 15-20+ relevant citations, properly situating the paper within the broader field of jailbreak detection and uncertainty quantification.\n * Subtasks:\n 1. Integrate Mentor's Reading List: Systematically review and cite the key papers from mentor_docs/mentor_recommended_reads.json.\n 2. Map the Landscape: Use the o3_search results on detection methods to structure the Related Work section, citing examples for each major category (White-box, Multi-model, etc.).\n 3. Situate SE's Lineage: Explicitly connect SE to its origins in hallucination detection (SelfCheckGPT, Nature paper) and the broader field of uncertainty estimation, using the provided citations.\n * Reference Files: mentor_docs/mentor_recommended_reads.json, o3_search results, papers/latex_paper_templates/Agents4Science_Template 2/consistency_confound_paper.tex.\n\n* Task 2: Implement and Integrate Statistical Rigor\n * Description: Add new code to the analysis pipeline to compute confidence intervals and significance tests for all key results, and integrate these statistics into the paper's text, tables, and figures.\n * Subtasks:\n 1. Code Implementation: Create a new script, src/core/statistical_tests.py, to house the new statistical functions.\n 2. AUROC Analysis: Implement the paired DeLong test to compare the AUROC of SE vs. the best baseline for each experiment. This will be applied to the data in outputs/h*/evaluation/*.json.\n 3. 
FNR Analysis: Implement the Wilson score interval to compute 95% CIs for all reported FNR values.\n 4. Paper Integration: Update all tables (e.g., tables/table_2_fnr_comparison.md) to report 95% CIs and all figures (visualisation/figures/*.png) to include error bars representing those CIs. In the text, replace statements like "SE underperforms" with statistically grounded claims of the form "SE's AUROC was significantly lower than BERTScore's (paired DeLong test, p < 0.01)", using the p-values the tests actually produce.\n * Reference Files: o3_search output on "statistical tests and confidence intervals", src/core/evaluation.py (to be updated), papers/latex_paper_templates/Agents4Science_Template 2/consistency_confound_paper.tex (to be updated).\n\n--- \n\nPriority 2: Methodological Improvements & Narrative Refinement\n\n* Task 3: Refine Scope and Claims\n * Description: Revise the Abstract, Introduction, and Conclusion to narrow the paper's claims so that they focus specifically on the failure of semantic inconsistency rather than on all behavioral methods.\n * Implementation: Edit the text to replace phrases like "fundamental limitations of behavioral methods" with "fundamental limitations of output diversity as a safety signal for well-aligned models." Use the new thesis from the mentor note as the guide.\n * Reference Files: papers/latex_paper_templates/Agents4Science_Template 2/consistency_confound_paper.tex, mentor_docs/mentor_feedback_post_checkpoint_2.md.\n\n* Task 4: Acknowledge and Address Limitations\n * Description: Expand the "Limitations" section of the paper to transparently address the key weaknesses identified by the mentor.\n * Subtasks:\n 1. Explicitly state that the FNR results are optimistic due to the lack of a separate calibration set, as noted in mentor_docs/paper_draft_0_mentor_review_claude.md.\n 2. 
Acknowledge the limited scope (2 primary models) and suggest that future work should test the "Consistency Confound" across a wider range of model architectures and sizes (referencing the unexecuted H7).\n * Reference Files: papers/latex_paper_templates/Agents4Science_Template 2/consistency_confound_paper.tex.\n\n--- \n\nPriority 3: Finalization and Submission Prep\n\n* Task 5: Complete Paper Checklists\n * Description: Fill out the "Agents4Science AI Involvement Checklist" and the "Agents4Science Paper Checklist" at the end of the LaTeX document.\n * Implementation: Replace all \involvmentTODO{} and \answerTODO{} macros with the appropriate answers and justifications based on the project's execution.\n * Reference Files: papers/latex_paper_templates/Agents4Science_Template 2/consistency_confound_paper.tex, claude_code_session_logs/.\n\n* Task 6: Final Polish and Submission\n * Description: Perform a full read-through of the revised manuscript to check for clarity, consistency, and grammatical errors, then prepare for submission.\n * Artifacts: Finalized consistency_confound_paper_v2.tex, ready for compilation and submission."