GDC Project (A-GDC-SAC minimal reproduction)

Overview
- This repository provides a modular, minimal-yet-practical reproduction of the A-GDC-SAC algorithm described in the paper draft. It includes:
  - Clean SAC baseline (continuous control)
  - Geometry module for Hessian–Vector Products (HVP), simple Lanczos routine, and geometric risk kappa
  - A-GDC-SAC agent that reweights the Bellman target via a sigma computed from kappa
  - Simple custom environments (OptimalTrap, DynamicOptimalTrap) for mechanism validation
  - Config system (YAML), training script, logging utilities
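
To illustrate the geometry module listed above, here is a minimal sketch of a Hessian-vector product combined with power iteration. It is not the repository's implementation: it uses finite differences in pure Python instead of autograd, and the names `grad`, `hvp`, and `top_curvature` are hypothetical. How the repository turns curvature into kappa is not specified here, so the sketch stops at the dominant Hessian eigenvalue.

```python
import math
import random


def grad(f, x, eps=1e-5):
    """Central-difference gradient of a scalar function f at x (list of floats)."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2 * eps))
    return g


def hvp(f, x, v, eps=1e-4):
    """Hessian-vector product H(x) v, as a central difference of gradients along v."""
    xp = [xi + eps * vi for xi, vi in zip(x, v)]
    xm = [xi - eps * vi for xi, vi in zip(x, v)]
    gp, gm = grad(f, xp), grad(f, xm)
    return [(a - b) / (2 * eps) for a, b in zip(gp, gm)]


def top_curvature(f, x, iters=50, seed=0):
    """Estimate the largest-magnitude Hessian eigenvalue of f at x by power iteration."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    lam = 0.0
    for _ in range(iters):
        hv = hvp(f, x, v)
        lam = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient v^T H v
        norm = math.sqrt(sum(c * c for c in hv))
        if norm < 1e-12:  # v lies (numerically) in the null space of H
            break
        v = [c / norm for c in hv]
    return lam


# Toy critic, quadratic in the action, with Hessian diag(-6, -1).
def q_toy(a):
    return -(3.0 * a[0] ** 2 + 0.5 * a[1] ** 2)


lam_max = top_curvature(q_toy, [0.3, -0.2])
```

Each iteration needs only one Hessian-vector product, so the full Hessian is never formed. With an autograd framework the finite differences would typically be replaced by a double-backward pass.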

Status
- End-to-end training and evaluation are implemented.
- The geometry module supports action, state, and joint curvature, estimated by power iteration (default) or Lanczos, with configurable weights.
- Baselines included: SAC, Lagrangian SAC (PCPO-like), Lagrangian SAC + KL (FOCOPS-like), and A-GDC-SAC.

Quick Start
1) Create a virtual environment and install requirements:
   - python -m venv .venv && source .venv/bin/activate
   - pip install -r requirements.txt

2) Train A-GDC-SAC on OptimalTrap:
   - python -m gdc_project.scripts.train \
       --config gdc_project/configs/algos/a_gdc_sac.yaml

3) Train SAC baseline:
   - python -m gdc_project.scripts.train \
       --config gdc_project/configs/algos/sac.yaml

4) Train PCPO/FOCOPS approximations:
   - python -m gdc_project.scripts.train --config gdc_project/configs/algos/pcpo.yaml
   - python -m gdc_project.scripts.train --config gdc_project/configs/algos/focops.yaml

5) Batch experiments and Pareto plot:
   - python -m gdc_project.scripts.run_experiments --agents sac pcpo focops a_gdc_sac --seeds 0 1 2
   - python -m gdc_project.scripts.plot_results --runs runs/gdc --out results.png

Notes
- The default environment `OptimalTrap-v0` has no third-party dependencies; if Safety-Gymnasium is installed, you can set `env_id` to one of its tasks.
- The geometric risk is evaluated at (s', a'_π) using the target critic; σ is excluded from backpropagation.
- For large batches or high-dimensional action spaces, prefer `curvature_method: power` and a small `lanczos_steps`.
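
The reweighting of the Bellman target by σ can be sketched as follows. Everything specific here is an assumption for illustration: the mapping `sigma_from_kappa`, its parameter `beta`, and the placement of σ on the bootstrapped term are hypothetical, since this README does not give the exact formula. Only the underlying soft target, r + γ(1 − d)(min Q − α log π), is the standard SAC form.

```python
def sigma_from_kappa(kappa, beta=1.0):
    """Hypothetical mapping from a risk kappa >= 0 to a weight in (0, 1]."""
    return 1.0 / (1.0 + beta * max(kappa, 0.0))


def reweighted_target(reward, done, q1_next, q2_next, logp_next, kappa,
                      gamma=0.99, alpha=0.2, beta=1.0):
    """SAC-style soft target whose bootstrapped term is scaled by sigma.

    q1_next, q2_next: target-critic values at (s', a'_pi)
    logp_next:        log pi(a'_pi | s')
    kappa:            geometric risk evaluated at (s', a'_pi)
    """
    # sigma is treated as a constant: in an autograd framework it would be
    # detached so that no gradient flows through it.
    sigma = sigma_from_kappa(kappa, beta)
    soft_value = min(q1_next, q2_next) - alpha * logp_next
    return reward + gamma * (1.0 - done) * sigma * soft_value


y_flat = reweighted_target(1.0, 0.0, 2.0, 3.0, -1.0, kappa=0.0)   # sigma = 1
y_risky = reweighted_target(1.0, 0.0, 2.0, 3.0, -1.0, kappa=1.0)  # sigma = 0.5
```

Under this assumed mapping, `kappa = 0` recovers the plain SAC target, and larger risk shrinks the bootstrapped value toward the immediate reward.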
