{"url":"https://www.perplexity.ai/search/49934aa4-b969-4956-895f-78e502ee8e35","title":"Perplexity","description":"Perplexity is a free AI-powered answer engine that provides accurate, trusted, and real-time answers to any question.","authors":[],"published_date":null,"domain":"perplexity.ai","is_paywall":false,"is_cached":false,"content":"### Code Blocks\nNone.\n\n### Filenames\nNone.\n\n### Data Tables\nNone.\n\n### CSV/JSON/JSONL Snippets\nNone.\n\n### Metric Tables\nNone.\n\n### Downloadable Artifacts\nNone.\n\n### Extracted Specifics\n**Entities and Systems**\n*   **KataGo:** External engine used to evaluate moves and provide reward signals.\n*   **LoGo:** A model/method cited for its limitation regarding individual reasoning tokens.\n\n**Human-Interpretable Strategic Concepts**\n*   \"thickness\"\n*   \"influence\"\n*   \"weak groups\"\n\n**Technical Mechanisms and Terminology**\n*   **RL (Reinforcement Learning):** Applied with a reasoning LM to improve explanation quality.\n*   **Gradient-based optimization:** Used to update internal representations to maximize rewards.\n*   **Explanation-conditioned policy:** The policy from which rewards are maximized.\n*   **Fine-grained reward models:** Provide feedback at sentence or sub-sentence levels.\n*   **Concept grounding:** Alignment of internal features with meaningful strategic principles.\n*   **Concept-based interpretability:** Research area concerning neuron specialization for atomic concepts.\n*   **KL divergence penalty:** Used in RL training to maintain pretrained conceptual vocabulary.\n*   **Explanation-to-move pipeline:** The process being evaluated for feedback.\n\n**Cited Reference Sources**\n*   arxiv\n*   ibm+2\n*   montrealethics\n*   magazine.sebastianraschka+4\n*   arxiv+3\n*   cameronrwolfe.substack+1","error":null,"saved_to":"/home/user/.pplx/search/fetch/c1126142.json"}
