# Topological Active Inference (TAI) for Task Disambiguation

> Resolve ambiguous instructions by asking the right clarifying questions, guided by topological structure in the solution space.

![TAI Overview](main.jpg)

---

## Table of Contents

- [Introduction](#introduction)
- [Key Features](#key-features)
- [Method Overview](#method-overview)
- [Supported Tasks](#supported-tasks)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Configuration](#configuration)
- [Project Structure](#project-structure)
- [Output Files](#output-files)

---

## Introduction

When users provide vague or underspecified instructions, Large Language Models (LLMs) often guess rather than ask for clarification. This leads to outputs that may be technically correct but functionally misaligned with user intent.

**Topological Active Inference (TAI)** addresses this by:
1. Sampling multiple candidate solutions from the LLM
2. Discovering semantically distinct *intent clusters* using persistent homology
3. Selecting clarifying questions that efficiently bisect the probability mass across clusters

This approach reduces the number of interaction turns needed to converge on the user's true intent from **O(N)** to **O(log K)**, where N is the number of sampled candidate solutions and K is the number of latent intents.
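
The logarithmic bound can be illustrated with a small sketch. It assumes an idealized setting that the README does not guarantee: every question splits the remaining intents into two near-equal halves and every answer is noiseless. The function name `turns_to_resolve` is hypothetical and not part of this repository.

```python
import math


def turns_to_resolve(k: int) -> int:
    """Count questions needed when each answer discards about half of k intents."""
    turns = 0
    while k > 1:
        k = math.ceil(k / 2)  # worst case: the true intent lies in the larger half
        turns += 1
    return turns


for k in (2, 5, 8, 16):
    # The two printed counts agree: halving k intents takes ceil(log2(k)) turns.
    print(k, turns_to_resolve(k), math.ceil(math.log2(k)))
```

Real questions rarely bisect the probability mass exactly, so this is a best-case picture of the scaling rather than a guarantee.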

---

## Key Features

| Feature | Description |
|---------|-------------|
| **Topological Clustering** | Uses 0D persistent homology to separate semantic signal from syntactic noise |
| **TEIG Question Selection** | Maximizes Topological Expected Information Gain for efficient disambiguation |
| **Multiple Strategies** | Supports baseline, active-reasoning, and TAI variants |
| **Flexible Embeddings** | Works with OpenAI embeddings or lightweight TF-IDF |
| **Multiple LLMs** | Compatible with GPT-4o-mini, GPT-3.5-turbo, and Llama-3 (8B/70B) |

---

## Method Overview

TAI operates in two phases:

### Phase I: Perception (Manifold Skeletonization)

```
Vague Instruction → LLM Sampler → Multiple Hypotheses → Semantic Embedding → Topological Filtering → Robust Intent Clusters
```

- **Sample** N candidate programs from the LLM's posterior
- **Embed** each hypothesis into a semantic space (OpenAI embeddings or TF-IDF)
- **Filter** using persistent homology: short-lived topological features are treated as syntactic noise, and long-lived features as true semantic intents
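
The filtering step can be sketched in a few lines of standard-library Python. For a distance-based (Vietoris-Rips) filtration, 0D persistent homology is equivalent to single-linkage clustering: each point is a component born at scale 0, and a component dies when an edge merges it into another. The sketch below is an illustration under that assumption, not the code in `reasoners.py`; `single_linkage`, the toy 2-D points, and the hand-picked threshold are all hypothetical.

```python
import math
from itertools import combinations


def single_linkage(points, max_scale=math.inf):
    """Merge points along edges of length <= max_scale (Kruskal's algorithm).

    Returns (deaths, labels): the scales at which components merged, and a
    cluster label for each point.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    # Visit all pairwise distances from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for dist, i, j in edges:
        if dist > max_scale:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(dist)  # one 0D feature dies at this scale

    roots = [find(i) for i in range(len(points))]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return deaths, [relabel[r] for r in roots]


# Toy 2-D "embeddings": two tight groups that are far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
deaths, _ = single_linkage(points)                  # full list of death times
_, labels = single_linkage(points, max_scale=1.0)   # 1.0 is a hand-picked threshold
print(deaths, labels)
```

The small death times correspond to merges inside a group (noise), while the single large one marks the gap between two intents. Real embeddings are high-dimensional and would typically use cosine distance, which this sketch does not cover.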

### Phase II: Action (Active Disambiguation Loop)

```
Intent Clusters → Generate Questions → Compute TEIG → Select Best Question → Get User Answer → Update Beliefs → Repeat
```

- **Synthesize** candidate clarifying questions
- **Evaluate** each question's Topological Expected Information Gain (TEIG)
- **Select** the question that best separates cluster probabilities
- **Update** belief distribution based on user/oracle answer
- **Stop** when one cluster dominates, i.e. its probability exceeds 1 - δ for a chosen tolerance δ
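
The loop above can be sketched as follows. This README does not give the exact TEIG formula, so the sketch substitutes the standard expected information gain over cluster beliefs and assumes each cluster answers a question deterministically. The names `information_gain` and `update_belief`, the example beliefs, and δ = 0.1 are illustrative assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(belief, answers):
    """Expected entropy reduction from asking one question.

    belief[k]  : current probability of intent cluster k
    answers[k] : answer that cluster k would give (assumed deterministic)
    """
    answer_mass = {}
    for p, a in zip(belief, answers):
        answer_mass[a] = answer_mass.get(a, 0.0) + p
    # With deterministic answers, the gain equals the entropy of the answer distribution.
    return entropy(answer_mass.values())


def update_belief(belief, answers, observed):
    """Bayes update: zero out clusters inconsistent with the answer, then renormalize."""
    posterior = [p if a == observed else 0.0 for p, a in zip(belief, answers)]
    total = sum(posterior)
    if total == 0:
        raise ValueError("observed answer is inconsistent with every cluster")
    return [p / total for p in posterior]


belief = [0.4, 0.3, 0.2, 0.1]
questions = {
    "q1": ["yes", "no", "no", "no"],    # splits mass 0.4 / 0.6
    "q2": ["yes", "yes", "no", "no"],   # splits mass 0.7 / 0.3
    "q3": ["yes", "no", "no", "yes"],   # splits mass 0.5 / 0.5
}
best = max(questions, key=lambda q: information_gain(belief, questions[q]))
belief = update_belief(belief, questions[best], "yes")
delta = 0.1
done = max(belief) > 1 - delta  # stopping rule: one cluster dominates
print(best, belief, done)
```

The question that splits the probability mass most evenly wins, which is the bisection behavior described in the Introduction.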

---

## Supported Tasks

This repository implements TAI for two task domains from the Ambi-Bench suite:

### 1. Code Generation (Ambi-Code)

Disambiguate underspecified coding instructions by querying for expected I/O behavior.

| Dataset | Description | Tasks |
|---------|-------------|-------|
| HumanEval | Function completion with ambiguous specs | 49 tasks |
| APPS-Codewars | Competitive programming problems | 60 tasks |

**Example ambiguity:** "Write a sort function" → Ascending or descending? Stable sort? Handle duplicates?
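
To make the I/O-based querying concrete, here is a minimal sketch in which three hypothetical interpretations of "write a sort function" are grouped by the output they produce on a test input. The hypothesis names and the helper `partition_by_output` are illustrative and not taken from the repository.

```python
# Three candidate readings of the same vague instruction.
hypotheses = {
    "ascending": lambda xs: sorted(xs),
    "descending": lambda xs: sorted(xs, reverse=True),
    "ascending_unique": lambda xs: sorted(set(xs)),
}


def partition_by_output(hypotheses, test_input):
    """Group hypothesis names by the output each produces on test_input."""
    groups = {}
    for name, fn in hypotheses.items():
        key = tuple(fn(list(test_input)))
        groups.setdefault(key, []).append(name)
    return groups


# Without duplicates, two readings are indistinguishable.
print(partition_by_output(hypotheses, [3, 1, 2]))
# An input with a duplicate separates all three readings.
print(partition_by_output(hypotheses, [2, 1, 2]))
```

Asking the user for the expected output on the second input resolves the ambiguity in a single turn, whereas the first input leaves two candidates tied.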

### 2. Data Visualization (Ambi-Plot)

Disambiguate vague visualization requests by querying for chart type, library, and style preferences.

| Dataset | Description | Tasks |
|---------|-------------|-------|
| Ambi-Plot | Visualization preference disambiguation | 15 tasks |

**Example ambiguity:** "Visualize the sales data" → Line chart or bar chart? Use matplotlib or plotly? Color by region?

**Ground truth preferences include:**
- Chart type (line, bar, scatter, heatmap, pie, etc.)
- Library (matplotlib, seaborn, plotly)
- Color scheme (viridis, coolwarm, default)
- Style preferences (interactive, annotations, subplots)

---

## Installation

### 1. Clone the Repository

```bash
git clone https://github.com/your-username/active-task-disambiguation.git
cd active-task-disambiguation
```

### 2. Create Conda Environment

```bash
conda env create -f environment.yaml
conda activate active-reasoning
```

### 3. Configure API Credentials

Edit `src/utils.py` to add your API credentials:

```python
# For GPT models (Azure)
OPENAI_API_BASE = "https://your-endpoint.openai.azure.com/"
OPENAI_API_KEY = "your-api-key"
OPENAI_API_ENGINE = "your-deployment-name"

# For embeddings (required for TAI strategy)
OPENAI_EMBEDDING_API_BASE = "https://your-endpoint.openai.azure.com/"
OPENAI_EMBEDDING_API_KEY = "your-api-key"
OPENAI_EMBEDDING_API_ENGINE = "text-embedding-3-large"
```

**Alternative:** Set `OPENAI_API_KEY` as an environment variable for direct OpenAI API access.
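
How `src/utils.py` consumes the environment variable is not shown in this README. As a generic sketch, a key can be read from the environment and validated early so that a missing credential fails fast. The helper name `load_openai_key` is hypothetical.

```python
import os


def load_openai_key(env=os.environ):
    """Return OPENAI_API_KEY from the given mapping, or fail with a clear error."""
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key


# Pass a plain dict to exercise the helper without touching the real environment.
print(load_openai_key({"OPENAI_API_KEY": "your-api-key"}))
```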

---

## Quick Start

### Code Generation Tasks

**HumanEval example:**

```bash
python src/code-generation/active_code_generation.py \
    seed=0 \
    strategy=tai \
    task_id=HumanEval/1 \
    llm=gpt-4o-mini
```

**APPS example:**

```bash
python src/code-generation/active_code_generation.py \
    seed=0 \
    strategy=tai \
    task_id=1614 \
    llm=llama-3-8B \
    dataset_path=./data/code-generation/APPS_codewars.jsonl \
    save_dir=code-generation/APPS-codewars
```

### Data Visualization Tasks

**Ambi-Plot example:**

```bash
python src/data-visualization/active_visualization.py \
    seed=0 \
    strategy=tai \
    task_id=viz_001 \
    llm=gpt-4o-mini
```

### Run Batch Experiments

```bash
# Code Generation: HumanEval benchmark
bash run_human_eval.sh

# Code Generation: APPS-Codewars benchmark
bash run_apps.sh

# Data Visualization: Ambi-Plot benchmark
bash run_visualization.sh
```

---

## Configuration

All options can be set in `config/main_code_generation.yaml` (or `config/main_visualization.yaml` for visualization tasks) or overridden on the command line.

### Core Settings

| Parameter | Options | Description |
|-----------|---------|-------------|
| `strategy` | `baseline`, `baseline-binary`, `active-reasoning`, `active-reasoning-binary`, `tai`, `tai-binary` | Disambiguation strategy |
| `llm` | `gpt-4o-mini`, `gpt-3.5-turbo`, `llama-3-8B`, `llama-3-70B` | LLM for hypothesis generation |
| `total_hypothesis` | Integer (default: 10) | Number of candidate programs per iteration |
| `total_questions` | Integer (default: 5) | Number of candidate questions per iteration |
| `max_iter` | Integer (default: 4) | Maximum interaction rounds |

### TAI-Specific Settings

| Parameter | Options | Description |
|-----------|---------|-------------|
| `tai_embedding` | `openai`, `tfidf` | Embedding method for clustering |
| `tai_embedding_model` | String (default: `text-embedding-3-large`) | OpenAI embedding model name |
| `tai_tau` | Float or `null` | Persistence threshold (`null` = auto-select via max-gap heuristic) |
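
The README does not spell out the max-gap heuristic used when `tai_tau` is `null`. One common reading, sketched below as an assumption, is to sort the 0D death times and place the threshold in the middle of the largest gap between consecutive values. The function name `max_gap_threshold` and the example death times are hypothetical.

```python
def max_gap_threshold(deaths):
    """Pick a persistence threshold at the midpoint of the largest gap in death times."""
    ordered = sorted(deaths)
    if len(ordered) < 2:
        raise ValueError("need at least two death times to find a gap")
    gaps = [(high - low, low, high) for low, high in zip(ordered, ordered[1:])]
    _, low, high = max(gaps)
    return (low + high) / 2


deaths = [0.10, 0.12, 0.15, 2.0, 2.3]
tau = max_gap_threshold(deaths)
# Features that die above tau are kept as separate intents; add one for the
# component that never dies.
n_clusters = sum(d > tau for d in deaths) + 1
print(tau, n_clusters)
```

A fixed `tai_tau` may be preferable when the death times have no clear gap, since the heuristic then picks an arbitrary split.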

### Example: Switch to the Active-Reasoning Strategy

```bash
python src/code-generation/active_code_generation.py \
    strategy=active-reasoning \
    llm=gpt-4o-mini \
    task_id=HumanEval/1
```
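
To compare several strategies on the same task, the command line can be assembled programmatically. The sketch below only builds and prints the commands, using the `key=value` override style shown above. The helper `build_command` is hypothetical, and the provided batch scripts may already cover this use case.

```python
import shlex

SCRIPT = "src/code-generation/active_code_generation.py"


def build_command(**overrides):
    """Return an argv list with one key=value override per keyword argument."""
    return ["python", SCRIPT, *(f"{key}={value}" for key, value in overrides.items())]


for strategy in ("baseline", "active-reasoning", "tai"):
    cmd = build_command(seed=0, strategy=strategy, task_id="HumanEval/1", llm="gpt-4o-mini")
    print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```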

---

## Project Structure

```
.
├── config/
│   ├── main_code_generation.yaml    # Code generation config
│   └── main_visualization.yaml      # Data visualization config
├── data/
│   ├── code-generation/
│   │   ├── HumanEval_for_code_generation.jsonl
│   │   └── APPS_codewars.jsonl
│   └── data-visualization/
│       └── ambi_plot.jsonl          # Visualization tasks
├── src/
│   ├── utils.py                     # API clients, utilities
│   ├── code-generation/
│   │   ├── active_code_generation.py    # Main entrypoint
│   │   ├── reasoners.py                 # Baseline & TAI reasoners
│   │   ├── code_utils.py                # Code parsing helpers
│   │   └── _execution.py                # Program execution sandbox
│   └── data-visualization/
│       ├── active_visualization.py      # Main entrypoint
│       ├── reasoners.py                 # Baseline & TAI reasoners
│       └── viz_utils.py                 # Visualization helpers
├── run_human_eval.sh                # Batch script for HumanEval
├── run_apps.sh                      # Batch script for APPS
├── run_visualization.sh             # Batch script for Ambi-Plot
├── environment.yaml                 # Conda environment
└── README.md
```

---

## Output Files

Results are saved to `./results/{save_dir}/{task_id}/{strategy}/{llm}/iter_{seed}/`:

| File | Description |
|------|-------------|
| `config.yaml` | Run configuration snapshot |
| `requirements.json` | Accumulated (question, answer) pairs per iteration |
| `questions.json` | All candidate questions generated per iteration |
| `questions_selected.json` | The selected question per iteration |
| `listed_hypothesis.json` | Sampled program hypotheses per iteration |
| `eval_program_samples.json` | Generated programs during evaluation |
| `eval_program_correctness.json` | Pass/fail results on ground-truth tests |
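
For post-hoc analysis, the run directory can be reconstructed from the path template above and its JSON files loaded in one pass. This is a sketch: `run_dir` and `load_outputs` are hypothetical helpers, the internal structure of each JSON file is not documented here, and `config.yaml` is skipped because only `*.json` files are read. How task IDs containing a slash (such as `HumanEval/1`) map to directories is not specified, so the example uses an APPS ID.

```python
import json
from pathlib import Path


def run_dir(save_dir, task_id, strategy, llm, seed, root="./results"):
    """Build the results directory for one run from the documented path template."""
    return Path(root) / save_dir / str(task_id) / strategy / llm / f"iter_{seed}"


def load_outputs(directory):
    """Load every *.json file in a run directory, keyed by file stem."""
    return {
        path.stem: json.loads(path.read_text())
        for path in sorted(Path(directory).glob("*.json"))
    }


directory = run_dir("code-generation/APPS-codewars", 1614, "tai", "llama-3-8B", seed=0)
print(directory)
outputs = load_outputs(directory)  # empty dict if the run has not been executed yet
print(sorted(outputs))
```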

**For visualization tasks:**

| File | Description |
|------|-------------|
| `eval_hypothesis.json` | Generated visualizations during evaluation |
| `eval_results.json` | Match rates against ground truth preferences |
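
The exact definition of the match rate is not given in this README. One plausible reading, sketched here as an assumption, is the fraction of ground-truth preference fields that a generated visualization satisfies. The function `match_rate` and the field names are illustrative and may not match the keys used in `ambi_plot.jsonl`.

```python
def match_rate(predicted, ground_truth):
    """Fraction of ground-truth preference fields matched by the prediction."""
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one preference")
    hits = sum(predicted.get(key) == value for key, value in ground_truth.items())
    return hits / len(ground_truth)


ground_truth = {
    "chart_type": "line",
    "library": "plotly",
    "color_scheme": "viridis",
    "interactive": True,
}
predicted = {
    "chart_type": "line",
    "library": "matplotlib",  # the only mismatched field
    "color_scheme": "viridis",
    "interactive": True,
}
print(match_rate(predicted, ground_truth))
```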

---



