# LLM Model Counting

This project uses Large Language Models (LLMs) to perform model counting on Sudoku problems encoded in propositional logic.

## What is Model Counting?

Model counting, also known as #SAT, is the problem of counting the number of satisfying assignments of a given Boolean formula. It is a fundamental problem in computer science with applications in various fields, including artificial intelligence, hardware verification, and bioinformatics.
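To make the definition concrete, here is a minimal brute-force sketch that counts the models of a small CNF formula. It is only an illustration and not part of this project's code: the function name `count_models` and the clause representation (lists of signed integers, as in DIMACS) are assumptions made for the example. The approach enumerates all 2^n assignments, so it only works for tiny formulas.

```python
from itertools import product

def count_models(num_vars, clauses):
    """Count satisfying assignments of a CNF formula by enumerating all assignments."""
    count = 0
    for bits in product([False, True], repeat=num_vars):
        # Literal v > 0 is true when variable v is True; v < 0 when variable |v| is False.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count

# (x1 OR x2) AND (NOT x1 OR x3) over three variables
clauses = [[1, 2], [-1, 3]]
print(count_models(3, clauses))  # → 4
```

The four models are the assignments where either `x1` is true and `x3` is true, or `x1` is false and `x2` is true.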

## Installing Dependencies

We recommend you use [uv](https://docs.astral.sh/uv/getting-started/installation/). Then you can run:

```bash
uv sync
```

Otherwise, you can use `conda` or `virtualenv` as usual.

## API Keys

We support several frameworks (i.e., APIs to communicate with the LLMs). For each framework you intend to use, set the corresponding environment variable to a valid API key:

-   for the Google Gemini API you must define `GOOGLE_API_KEY`
-   for the DeepSeek API you must define `DEEPSEEK_API_KEY`
-   for the OpenAI API you must define `OPENAI_API_KEY`
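Before launching an experiment, it can help to check which keys are visible to the process. The snippet below is a small standalone sketch, not a script shipped with this project; the helper name `missing_key` and the `REQUIRED_KEYS` mapping are hypothetical, built only from the variable names listed above and the `--framework` choices.

```python
import os

# Environment variable expected for each framework, as listed above.
REQUIRED_KEYS = {
    "gemini": "GOOGLE_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "openai": "OPENAI_API_KEY",
}

def missing_key(framework, env=os.environ):
    """Return the name of the unset variable for `framework`, or None if it is set."""
    name = REQUIRED_KEYS[framework]
    return None if env.get(name) else name

if __name__ == "__main__":
    for framework in REQUIRED_KEYS:
        name = missing_key(framework)
        status = "set" if name is None else f"missing {name}"
        print(f"{framework}: {status}")
```

Only the key for the framework you actually select needs to be defined.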

## Usage

This project provides two main approaches for model counting using LLMs:

1.  **End-to-End Model Counting**: The LLM directly calculates the number of models of the given formula.
2.  **Code Generation for Model Counting**: The LLM generates a Python script that, when executed, computes the number of models.

### End-to-End Model Counting

The main script for this approach is `model-counting.py`.

#### Command-Line Options

The basic command is:

```bash
uv run model-counting.py -i INSTANCE
```

where `INSTANCE` is the path to the instance you want to solve.

There are several other options you can pass to the command line:

-   `--instance` or `-i`: path to the instance you want to solve (required).
-   `--framework`: framework to use. Available choices are: `gemini` (default), `deepseek`, and `openai`.
-   `--model` or `-m`: LLM to use. The default is `gemini-2.0-flash-thinking`. The available models depend on the selected framework.
-   `--temperature` or `-t`: temperature (default: 0.6). Must be in the interval [0, 2].
-   `--top-p`: top-p value (default: 0.95). Must be in the interval [0, 1].
-   `--prompt-format`: format of the prompt. Available choices are: `base` (default), `sudoku_free` (generic prompt).
-   `--only-prompt`: exit program after displaying prompt.

#### Example

Let's say you want to solve an instance called `instance.cnf` using `gemini-2.5-pro` and the `sudoku_free` prompt format. You would run:

```bash
uv run model-counting.py -i instance.cnf --framework gemini --model gemini-2.5-pro --prompt-format sudoku_free
```

### Code Generation for Model Counting

The `code-generation.py` script uses an LLM to generate a Python script to solve the model counting problem.

#### Command-Line Options

-   `--instance` or `-i`: path to the instance you want to solve (required).
-   `--code-file`: file where the generated code will be written (required).
-   `--framework`: framework to use. Available choices are: `gemini` (default), `deepseek`, and `openai`.
-   `--model` or `-m`: LLM to use. The default is `gemini-2.0-flash-thinking`.
-   `--temperature` or `-t`: temperature (default: 0.6). Must be in the interval [0, 2].
-   `--top-p`: top-p value (default: 0.95). Must be in the interval [0, 1].
-   `--only-prompt`: exit program after displaying prompt.
-   `--dimacs`: use the prompt that asks for code reading the instance in DIMACS format. Without this flag, the prompt assumes the instance is in d-DNNF format.

#### Example

To generate a Python script `solve.py` for `instance.cnf` using `gemini-2.5-pro`, you would run:

```bash
uv run code-generation.py -i instance.cnf --code-file solve.py --framework gemini --model gemini-2.5-pro
```

The generated script `solve.py` is then executed to get the model count.
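How the generated script is invoked and how it reports its result depend on what the LLM produced, so the following is only a sketch under stated assumptions: the script takes no command-line arguments and prints the model count as the last non-empty line on standard output. The helper `run_generated_script` is hypothetical and not part of this project.

```python
import subprocess
import sys

def run_generated_script(path, timeout=60):
    """Run a generated script and return the integer on its last non-empty stdout line.

    Assumes (hypothetically) that the script needs no arguments and prints the count last.
    """
    result = subprocess.run(
        [sys.executable, path],
        capture_output=True,
        text=True,
        timeout=timeout,  # generated code may not terminate quickly
        check=True,       # raise CalledProcessError if the script exits with an error
    )
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("script produced no output")
    return int(lines[-1])
```

Because the script is LLM-generated, running it in a separate process with a timeout keeps a crash or an infinite loop from taking down the caller. Consider reviewing the code before executing it.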

## Permuting Instances

The `permutate.py` script generates new instances from a DIMACS CNF file by permuting its variables and, optionally, its clauses. The result is equivalent to the original up to variable renaming, so it has the same model count. This is useful for testing and benchmarking.
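The idea can be sketched in a few lines. This is not the implementation in `permutate.py`; the names `permute_cnf` and `brute_force_count` are hypothetical, and clauses are represented as lists of signed integers as in DIMACS. The sketch renames variables through a random bijection, optionally shuffles the clause order, and then checks by brute force that the model count is unchanged.

```python
import random
from itertools import product

def permute_cnf(num_vars, clauses, seed=None, permute_clauses=False):
    """Rename variables with a random bijection; optionally shuffle clause order."""
    rng = random.Random(seed)
    targets = list(range(1, num_vars + 1))
    rng.shuffle(targets)
    mapping = dict(zip(range(1, num_vars + 1), targets))
    # Rename each literal while keeping its sign.
    renamed = [[mapping[abs(lit)] * (1 if lit > 0 else -1) for lit in clause]
               for clause in clauses]
    if permute_clauses:
        rng.shuffle(renamed)
    return renamed

def brute_force_count(num_vars, clauses):
    """Count models by enumerating all assignments (tiny formulas only)."""
    return sum(
        all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses)
        for bits in product([False, True], repeat=num_vars)
    )

original = [[1, 2], [-1, 3], [2, -3]]
permuted = permute_cnf(3, original, seed=42, permute_clauses=True)
# Both counts match, since renaming variables is a bijection on assignments.
print(brute_force_count(3, original), brute_force_count(3, permuted))
```

Fixing the seed makes the permutation reproducible, which mirrors the purpose of the `--seed` option.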

### Command-Line Options

-   `input`: Input DIMACS CNF file.
-   `-o` or `--output`: Output DIMACS CNF file (default: `/dev/stdout`).
-   `--permute-clauses`: Permute the order of clauses.
-   `--seed`: Random seed for reproducibility.

### Example

To permute the variables and clauses of `instance.cnf` and save the result to `permuted.cnf`, you would run:

```bash
python permutate.py instance.cnf -o permuted.cnf --permute-clauses --seed 42
```

## Available Models

The available models for each framework are defined in `src/models.py`. Here is a list of the currently supported models:

-   **Gemini**: `gemini-2.5-flash`, `gemini-2.5-pro`
-   **DeepSeek**: `deepseek-reasoner`

## Project Structure

-   `model-counting.py`: The main script for running the end-to-end model counting experiments.
-   `code-generation.py`: The script for generating Python code for model counting.
-   `permutate.py`: A utility script for permuting DIMACS CNF instances.
-   `src/`: Contains the source code for the project.
    -   `models.py`: Contains the logic for interacting with the different LLM APIs.
    -   `prompt.py`: Contains the logic for creating the prompts.
    -   `templates.py`: Contains the prompt templates.
    -   `answers.py`: Contains a dictionary of known answers for the small Sudoku instances for easy benchmarking.
    -   `suites.py`: Contains lists of instances for different benchmarks.
