### Code Blocks
None.

### Filenames
None.

### Data Tables
None.

### CSV/JSON/JSONL Snippets
None.

### Metric Tables
None.

### Downloadable Artifacts
None.

### Extracted Specifics
**Entities and Systems**
*   **KataGo:** External engine used for move evaluation and providing reward signals.
*   **LoGo:** A model/method cited for its limitation regarding individual reasoning tokens.

**Human-Interpretable Strategic Concepts**
*   "thickness"
*   "influence"
*   "weak groups"

**Technical Mechanisms and Terminology**
*   **RL (Reinforcement Learning):** Applied with a reasoning LM to improve explanation quality.
*   **Gradient-based optimization:** Used to update internal representations to maximize rewards.
*   **Explanation-conditioned policy:** The policy, conditioned on the generated explanation, whose rewards are maximized.
*   **Fine-grained reward models:** Provide feedback at sentence or sub-sentence levels.
*   **Concept grounding:** Alignment of internal features with meaningful strategic principles.
*   **Concept-based interpretability:** Research area concerning neuron specialization for atomic concepts.
*   **KL divergence penalty:** Used in RL training to maintain pretrained conceptual vocabulary.
*   **Explanation-to-move pipeline:** The process being evaluated for feedback.
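
The KL divergence penalty listed above can be illustrated with a minimal sketch. The source does not state how the penalty is computed, so this assumes the common formulation in which the task reward is reduced by a coefficient times the KL divergence between the trained policy and the pretrained reference model. The function names, the `beta` coefficient, and the toy distributions are all hypothetical, and `task_reward` merely stands in for whatever signal the external engine provides.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists.

    Terms with p_i == 0 contribute nothing and are skipped.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def penalized_reward(task_reward, policy_probs, reference_probs, beta=0.1):
    """Task reward minus beta * KL(policy || reference).

    Assumption: `beta` is a hypothetical penalty coefficient; a larger value
    keeps the policy closer to the pretrained reference distribution.
    """
    return task_reward - beta * kl_divergence(policy_probs, reference_probs)


# Toy next-token distributions over a two-word vocabulary (illustrative only).
reference = [0.5, 0.5]   # pretrained model
unchanged = [0.5, 0.5]   # policy that has not drifted
drifted = [0.9, 0.1]     # policy that has drifted from the reference

reward_unchanged = penalized_reward(1.0, unchanged, reference)
reward_drifted = penalized_reward(1.0, drifted, reference)
```

With the same task reward, the drifted policy receives a lower total than the unchanged one, which is the pressure that discourages the model from abandoning its pretrained vocabulary.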

**Cited Reference Sources**
*   arxiv
*   ibm
*   montrealethics
*   magazine.sebastianraschka
*   cameronrwolfe.substack