# FedPOB: Federated Prompt Optimization via Bandits

This repository contains the implementation of two federated learning approaches for automatic prompt engineering using multi-armed bandit algorithms:

- **FedPOB**: Combines LinUCB (Linear Upper Confidence Bound) with federated learning to optimize prompts across multiple agents collaboratively.
- **FedPOB-Pref**: Extends FedPOB with preference-based optimization using dueling bandits, where agents learn from pairwise comparisons between prompts rather than absolute scores.

## Installation

```bash
pip install -r requirements.txt
```

Set the API keys as environment variables:
```bash
export OPENAI_API_KEY="your_openai_key"
export OPENROUTER_API_KEY="your_openrouter_key"
```

## Usage

```bash
cd main/Induction
python ./experiment/FedPOB.py --task boolean_expressions --n_domain 500 --agents 10 --total_iter 50
python ./experiment/FedPOB-Pref.py --task boolean_expressions --n_domain 500 --agents 10 --total_iter 50
```

### Key Parameters

- `--task`: Task name (default: 'boolean_expressions')
- `--n_domain`: Number of candidate prompts (default: 500)
- `--agents`: Number of federated agents (default: 6)
- `--total_iter`: Total number of iterations (default: 100)
- `--gpt`: Language model to use (default: 'gpt-3.5-turbo')
- `--nu`: UCB confidence parameter (default: 0.3)
- `--lamdba`: Regularization parameter (default: 1.0)
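
For reference, the flags above can be mirrored in a small `argparse` parser. This is an illustrative sketch only: the argument types and the `build_parser` helper are assumptions, and the actual scripts may define their options differently. Flag names and defaults are copied from the list above as written.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="FedPOB experiment (illustrative)")
    parser.add_argument("--task", type=str, default="boolean_expressions")
    parser.add_argument("--n_domain", type=int, default=500)   # number of candidate prompts
    parser.add_argument("--agents", type=int, default=6)       # number of federated agents
    parser.add_argument("--total_iter", type=int, default=100)
    parser.add_argument("--gpt", type=str, default="gpt-3.5-turbo")
    parser.add_argument("--nu", type=float, default=0.3)       # UCB confidence parameter
    parser.add_argument("--lamdba", type=float, default=1.0)   # spelled as in the list above
    return parser


# Same overrides as in the Usage section; everything else keeps its default.
args = build_parser().parse_args(["--agents", "10", "--total_iter", "50"])
print(vars(args))
```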

## Supported Tasks

### Instruction Induction Tasks (29 tasks)
**Base tasks (23):**
- `active_to_passive`, `antonyms`, `common_concept`, `diff`, `first_word_letter`
- `informal_to_formal`, `larger_animal`, `letters_list`, `negation`, `num_to_verbal`
- `orthography_starts_with`, `rhymes`, `second_word_letter`, `sentence_similarity`, `sentiment`
- `singular_to_plural`, `sum`, `synonyms`, `taxonomy_animal`, `translation_en-de`
- `translation_en-es`, `translation_en-fr`, `word_in_context`

**Additional tasks (6):**
- `auto_categorization`, `object_counting`, `odd_one_out`
- `periodic_elements`, `word_sorting`, `word_unscrambling`

### Big-Bench Hard Tasks (24 tasks)
- `boolean_expressions`, `date_understanding`, `disambiguation_qa`, `dyck_languages`
- `formal_fallacies`, `geometric_shapes`, `hyperbaton`
- `logical_deduction_five_objects`, `logical_deduction_seven_objects`, `logical_deduction_three_objects`
- `movie_recommendation`, `multistep_arithmetic_two`, `navigate`, `penguins_in_a_table`
- `reasoning_about_colored_objects`, `ruin_names`, `salient_translation_error_detection`
- `snarks`, `sports_understanding`, `temporal_sequences`
- `tracking_shuffled_objects_five_objects`, `tracking_shuffled_objects_seven_objects`
- `tracking_shuffled_objects_three_objects`, `web_of_lies`

## Algorithms

### FedPOB
Combines federated learning with LinUCB for collaborative prompt optimization:

1. **Multi-agent LinUCB**: Each agent uses Linear Upper Confidence Bound for prompt selection
2. **Federated Aggregation**: Agents share learned parameters instead of raw data, which keeps each agent's data private
3. **Sentence Embeddings**: Uses sentence-transformers for prompt representation
4. **Automatic Evaluation**: Integrates with language models for prompt scoring
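
The interplay of steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the repository's implementation: the `LinUCBAgent` class and `federated_round` function are hypothetical names, random vectors stand in for sentence embeddings, a constant stands in for the language-model score, and sharing summed LinUCB statistics is only one possible aggregation scheme.

```python
import numpy as np


class LinUCBAgent:
    """Hypothetical agent holding LinUCB statistics over prompt embeddings."""

    def __init__(self, dim, lam=1.0, nu=0.3):
        self.lam, self.nu = lam, nu
        # Statistics received from the server at the last synchronization.
        self.A_shared = np.zeros((dim, dim))
        self.b_shared = np.zeros(dim)
        # Statistics gathered locally since the last synchronization.
        self.A_delta = np.zeros((dim, dim))
        self.b_delta = np.zeros(dim)

    def select(self, X):
        """Return the index of the row of X (one embedding per prompt) with the highest UCB."""
        V = self.lam * np.eye(X.shape[1]) + self.A_shared + self.A_delta
        V_inv = np.linalg.inv(V)
        theta = V_inv @ (self.b_shared + self.b_delta)
        # Exploration bonus sqrt(x^T V^{-1} x) for every candidate prompt.
        bonus = np.sqrt(np.einsum("ij,jk,ik->i", X, V_inv, X))
        return int(np.argmax(X @ theta + self.nu * bonus))

    def update(self, x, score):
        """Record one observed score for the prompt with embedding x."""
        self.A_delta += np.outer(x, x)
        self.b_delta += score * x


def federated_round(agents):
    """Sum every agent's local statistics and broadcast the result to all agents."""
    A = agents[0].A_shared + sum(a.A_delta for a in agents)
    b = agents[0].b_shared + sum(a.b_delta for a in agents)
    for a in agents:
        a.A_shared, a.b_shared = A.copy(), b.copy()
        a.A_delta = np.zeros_like(A)
        a.b_delta = np.zeros_like(b)


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))           # stand-in for 20 prompt embeddings
agents = [LinUCBAgent(dim=4) for _ in range(3)]
for agent in agents:
    i = agent.select(X)
    agent.update(X[i], score=1.0)      # stand-in for an LLM-based score
federated_round(agents)
```

Only the aggregated matrices and vectors leave an agent in this sketch; the individual prompts it tried and the scores it observed stay local.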

### FedPOB-Pref
Extends FedPOB with preference-based learning using dueling bandits:

1. **Dueling Bandits**: Uses pairwise comparisons between prompts instead of absolute scores
2. **Preference Feedback**: Learns from relative preferences rather than direct evaluations
3. **Regularized Federated Aggregation**: Federated averaging with an added regularization term
4. **Collaborative Learning**: Multiple clients optimize prompts while preserving privacy
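
A rough sketch of how these pieces could fit together is shown below. It assumes a logistic (Bradley-Terry style) preference model on embedding differences and a proximal penalty that pulls each client toward the global parameters. The function names, the pair-selection rule, and the plain averaging step are all illustrative assumptions, not the repository's code.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def select_pair(X, theta, V, nu=0.3):
    """Pick a greedy prompt and an optimistic challenger to duel against it."""
    scores = X @ theta
    first = int(np.argmax(scores))
    diff = X - X[first]
    V_inv = np.linalg.inv(V)
    # Uncertainty of each candidate relative to the greedy prompt.
    bonus = np.sqrt(np.einsum("ij,jk,ik->i", diff, V_inv, diff))
    ucb = scores + nu * bonus
    ucb[first] = -np.inf  # the challenger must differ from the first prompt
    return first, int(np.argmax(ucb))


def local_update(theta_global, duels, lam=1.0, lr=0.1, steps=100):
    """Fit a logistic preference model, regularized toward the global parameters.

    duels is a list of (z, y) with z = x_first - x_second and y = 1 if the first prompt won.
    """
    Z = np.array([z for z, _ in duels])
    y = np.array([label for _, label in duels], dtype=float)
    theta = theta_global.copy()
    for _ in range(steps):
        grad = Z.T @ (sigmoid(Z @ theta) - y) / len(duels)
        grad += lam * (theta - theta_global)  # proximal term toward the global model
        theta -= lr * grad
    return theta


def aggregate(client_thetas):
    """Plain federated averaging of the clients' parameters."""
    return np.mean(client_thetas, axis=0)


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))                  # stand-in for prompt embeddings
theta_true = np.array([1.0, 0.0, 0.0, 0.0])   # hidden preference direction for the demo
theta_global = np.zeros(4)
client_thetas = []
for _ in range(3):                            # three clients, five duels each
    duels = []
    for _ in range(5):
        i, j = rng.choice(len(X), size=2, replace=False)
        z = X[i] - X[j]
        duels.append((z, float(z @ theta_true > 0)))  # noiseless stand-in for a judgment
    client_thetas.append(local_update(theta_global, duels))
theta_global = aggregate(client_thetas)
first, second = select_pair(X, theta_global, np.eye(4))
```

Each client only ever sees which prompt of a pair won, never an absolute score, and only its parameter vector is sent for aggregation.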

## Requirements

- Python 3.8+
- PyTorch 1.9+
- Transformers library
- Sentence-transformers
- OpenAI or OpenRouter API access
- See `requirements.txt` for complete dependencies

## Results

Results are saved to the `all_results/FedPOB/` directory, organized by model type. The saved files contain performance metrics and the selected prompts for each agent.
