# LLM_Assisted_Compilation

**LLM_Assisted_Compilation** is an LLM-driven framework that automatically compiles and validates C/C++ repositories at scale, either locally or on a Kubernetes cluster.

## 🚀 Features
- **Automated Build Instruction Retrieval**: Heuristic and agentic methods to extract explicit build steps.
- **Parallelized Compilation**: Supports local multithreading and Kubernetes jobs for large-scale repository processing.
- **Flexible LLM Integration**: Compatible with OpenAI, Claude, and open-source models via vLLM.
- **End-to-End Validation Pipeline**: Extracts and verifies binary artifacts, ensuring reproducible builds.
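
To make the "heuristic" retrieval idea concrete, here is a minimal sketch of one possible heuristic: guessing build commands from well-known marker files in the repository root. The marker table and the function name `guess_build_commands` are illustrative assumptions, not the logic in `src/build_info_retrieval.py`, which may work differently.

```python
from pathlib import Path

# Hypothetical marker-file table (an assumption for illustration);
# the project's real heuristics may differ. Order sets the priority.
BUILD_MARKERS = [
    ("CMakeLists.txt", ["cmake -S . -B build", "cmake --build build"]),
    ("meson.build", ["meson setup build", "meson compile -C build"]),
    ("configure", ["./configure", "make"]),
    ("Makefile", ["make"]),
]


def guess_build_commands(repo_dir):
    """Return candidate build commands for the first marker file found, else []."""
    root = Path(repo_dir)
    for marker, commands in BUILD_MARKERS:
        if (root / marker).is_file():
            return list(commands)
    return []
```

An empty result would be the natural point to fall back to an agentic method, for example asking the LLM to read the repository's documentation.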

## 📁 Repository Structure
```
Agents
├── src/build_info_retrieval.py         # Heuristic retrieval of build instructions
├── src/bash_executor.py                # Executes bash commands
├── src/prompts.py                      # Prompt templates for AI
└── src/agents.py                       # Orchestrates agent
Validation
├── src/validation_pipeline.py          # Validation workflow
└── src/validation_helper_functions.py  # Supporting validation
Auxiliary
├── src/tools.py                        # Helper functions
├── src/env_example.env                 # Example environment variables
├── src/clone_repos.py                  # Repository cloning script
└── src/default_values.py               # Default argument values and
Experiment_launcher
├── job-template.yaml                   # Kubernetes Job template
├── orchestrator.py                     # Launches Kubernetes jobs
├── src/main.py                         # Local experiment entry point
└── src/worker_main.py                  # Kubernetes experiment entry point
```
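
The validation modules listed above are described as extracting and verifying binary artifacts. As a rough sketch of what such a check could look like, the code below walks a build directory and keeps files that start with the ELF magic number. It assumes Linux builds that produce ELF binaries; the names `is_elf` and `find_elf_artifacts` are hypothetical and are not taken from `src/validation_pipeline.py`.

```python
import os
from pathlib import Path

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def is_elf(path):
    """Return True if the file starts with the 4-byte ELF magic number."""
    try:
        with open(path, "rb") as fh:
            return fh.read(4) == ELF_MAGIC
    except OSError:
        # Missing or unreadable files are treated as "not an artifact".
        return False


def find_elf_artifacts(build_dir):
    """Recursively list regular, non-symlink files under build_dir that look like ELF."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(build_dir):
        for name in filenames:
            candidate = Path(dirpath) / name
            if candidate.is_symlink() or not candidate.is_file():
                continue
            if is_elf(candidate):
                found.append(candidate)
    return sorted(found)
```

A magic-number check only shows that a file looks like a binary. Confirming that it runs or links correctly would need further steps.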

## ⚙️ Setup

1. **Prerequisites**  
   - Docker installed  
   - Python 3.10+  
   - GitHub token for higher clone rate limits  
   - API keys for the services you use (OpenAI or Claude for the LLM, Tavily for web search)

2. **Install dependencies**  
   ```bash
   conda create -n LLM_Compilation python=3.10
   conda activate LLM_Compilation
   pip install -r requirements.txt
   ```

3. **Configure environment**  
   Copy and edit:
   ```bash
   cp src/env_example.env .env
   ```
   Fill in your API keys and tokens.

4. **Build or pull Docker images**  
   - Local runs use `sz904/compilation_base_image:latest`  
   - Kubernetes runs use `sz904/compilation_stage2_image:latest`
   ```bash
   docker pull sz904/compilation_base_image:latest
   docker pull sz904/compilation_stage2_image:latest
   ```
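
Before the first run it can help to confirm that the setup steps above took effect. The following is a small preflight sketch, not part of the repository: it checks that `docker` is on the `PATH` and that `.env` contains non-empty values. The variable names in `REQUIRED_KEYS` are placeholders; use the names that actually appear in `src/env_example.env`.

```python
import shutil
from pathlib import Path

# Hypothetical variable names (assumption); replace them with the keys
# listed in src/env_example.env.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GITHUB_TOKEN")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(env_path=".env", required=REQUIRED_KEYS):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if shutil.which("docker") is None:
        problems.append("docker not found on PATH")
    if not Path(env_path).is_file():
        problems.append(f"{env_path} is missing")
        return problems
    values = parse_env_file(env_path)
    for key in required:
        if not values.get(key):
            problems.append(f"{key} is empty or missing in {env_path}")
    return problems
```

Calling `preflight()` from the repository root and printing the returned list gives a quick summary of what still needs attention.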

## ▶️ Usage

### Local Debug
```bash
python src/main.py \
  --api_key=YOUR_API_KEY \
  --timeout_bash=12000 \
  --host_project_dir="$(pwd)" \
  --random=-1 \
  --github_repo=https://github.com/openssl/openssl.git \
  --github_token=YOUR_GITHUB_TOKEN \
  --docker_image=sz904/compilation_base_image:latest
```
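
To try the local entry point on more than one repository, the command above can be scripted. The wrapper below is a sketch and not part of the project: `build_command` and `run_all` are hypothetical helpers that reuse only the flags shown in the example and run the repositories one after another.

```python
import subprocess
import sys


def build_command(repo_url, api_key, github_token, host_project_dir,
                  docker_image="sz904/compilation_base_image:latest",
                  timeout_bash=12000):
    """Assemble the src/main.py invocation shown above for one repository."""
    return [
        sys.executable, "src/main.py",
        f"--api_key={api_key}",
        f"--timeout_bash={timeout_bash}",
        f"--host_project_dir={host_project_dir}",
        "--random=-1",
        f"--github_repo={repo_url}",
        f"--github_token={github_token}",
        f"--docker_image={docker_image}",
    ]


def run_all(repo_urls, **kwargs):
    """Run src/main.py once per repository, sequentially; return {url: exit code}."""
    results = {}
    for url in repo_urls:
        # Passing a list (no shell=True) avoids quoting problems in paths and URLs.
        results[url] = subprocess.run(build_command(url, **kwargs)).returncode
    return results
```

The loop is deliberately sequential, because this README does not say whether several local runs can safely share one project directory.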

### Kubernetes Cluster
```bash
python orchestrator.py \
  --api_key=YOUR_API_KEY \
  --timeout_bash=12000 \
  --docker_image=sz904/compilation_stage2_image:latest \
  --host_project_dir="$(pwd)" \
  --random=0 \
  --k8s_parallelism=25 \
  --github_token=YOUR_GITHUB_TOKEN
```
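
For readers unfamiliar with Kubernetes Jobs, the sketch below shows how a parallelism value such as `--k8s_parallelism=25` could end up in a Job manifest. It is an illustration under assumptions: the manifest text, the placeholder names, and the container command are hypothetical, and the repository's `job-template.yaml` and `orchestrator.py` may be organized differently.

```python
from string import Template

# Hypothetical manifest (assumption); the repository's job-template.yaml
# may look different. The field names are standard batch/v1 Job fields.
JOB_TEMPLATE = Template("""\
apiVersion: batch/v1
kind: Job
metadata:
  name: $job_name
spec:
  parallelism: $parallelism
  completions: $completions
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: $docker_image
          command: ["python", "src/worker_main.py"]
""")


def render_job(job_name, docker_image, parallelism, completions):
    """Return the manifest text with all placeholders filled in."""
    if parallelism < 1 or completions < 1:
        raise ValueError("parallelism and completions must be at least 1")
    return JOB_TEMPLATE.substitute(
        job_name=job_name,
        docker_image=docker_image,
        parallelism=parallelism,
        completions=completions,
    )
```

In a Job, `parallelism` caps how many pods run at the same time, while `completions` is the total number of pods that must finish successfully.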

## 🤝 Contributing
Contributions are welcome! Please open issues or pull requests against this repository.

## 📄 License
TODO
<!-- This project is licensed under the [MIT License](LICENSE). -->
