# Non-Incremental Compilation of Probabilistic Answer Set Programming

---

## Table of Contents

1. [Dependencies](#dependencies)
2. [Pipeline Overview](#pipeline-overview)
3. [Automated Scripts](#automated-scripts)
4. [Program Generation](#program-generation)
5. [File Outputs](#file-outputs)
6. [How to Use](#how-to-use)
7. [Code Modules](#code-modules)
8. [Research Context](#research-context)

---

## Dependencies

Running the article's pipeline requires the dependencies below, all managed with the **UV package manager**:

#### Python Version
Ensure you are using Python version `3.13`, as specified in `.python-version`.

#### Required Packages
The dependencies are already listed in the `pyproject.toml` file:
- **clingo**: `>=5.7.1`
- **ipykernel**: `>=6.29.5`
- **jinja2**: `>=3.1.5`
- **matplotlib**: `>=3.10.0`
- **memory-profiler**: `>=0.61.0`
- **networkx**: `>=3.4.2`
- **numpy**: `>=2.2.2`
- **pandas**: `>=2.2.3`
- **pyqt6**: `>=6.8.0`
- **pysdd**: `>=1.0.0`
- **seaborn**: `>=0.13.2`
- **tqdm**: `>=4.67.1`

#### Installation
To install the dependencies, use UV:
```sh
uv sync
```

#### Running Scripts
You can either source the virtual environment created by UV or run scripts directly:
- **Source the virtual environment**:
  ```sh
  source .venv/bin/activate
  python <script_name.py>
  ```
- **Run with UV**:
  ```sh
  uv run <script_name.py>
  ```

With `uv run`, UV creates or updates the virtual environment as needed before executing the script, so no manual activation is required.

#### Top-Down Compilers

This repository does not include binaries for the top-down compilers used in the pipeline. These compilers must be downloaded separately from their respective sources. Below are the links to obtain them:

- **C2D**: [http://reasoning.cs.ucla.edu/c2d/](http://reasoning.cs.ucla.edu/c2d/)
  Note that C2D is not open-source software, and its binaries are provided by the authors at the linked website. Due to licensing restrictions, we cannot include C2D in this repository.

- **SharpSAT-TD**: [https://github.com/raki123/sharpsat-td](https://github.com/raki123/sharpsat-td)
  SharpSAT-TD is an open-source tool for top-down compilation. A modified version for PASP inference, used in the [ASPMC article](https://www.sciencedirect.com/science/article/pii/S0004370224000456), is available in the linked GitHub repository.

- **D4**: [https://github.com/raki123/d4](https://github.com/raki123/d4)
  Similarly, D4 is an open-source top-down compiler with a PASP-inference-compatible version used in the [ASPMC article](https://www.sciencedirect.com/science/article/pii/S0004370224000456).

#### Datasets

The Bitcoin trust network dataset is available from the [Stanford Network Analysis Project (SNAP) website](https://snap.stanford.edu/data/soc-sign-bitcoin-alpha.html). For our analysis, we downloaded the dataset and saved it in the `plp/datasets` directory.

---

## Pipeline Overview

The pipeline consists of the following steps:

### 1. **Parsing Programs**
The `rule_parser.py` script parses PASP programs (`.pasp` files) into intermediate representations:
- **CNF**: A conjunctive normal form representation.
- **JSON**: A structured representation of the program.
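
This README does not spell out the layout of the `.cnf` files. Assuming they follow the standard DIMACS CNF format that c2d, d4, and SharpSAT-TD read, the sketch below shows how a list of clauses could be serialized. The function name `to_dimacs` and the variable numbering are illustrative and not taken from `rule_parser.py`.

```python
def to_dimacs(clauses, num_vars=None):
    """Serialize clauses (lists of non-zero ints) as DIMACS CNF text.

    Assumption: positive ints are atoms, negative ints are their negations.
    """
    if num_vars is None:
        # Infer the variable count from the largest literal that appears.
        num_vars = max((abs(lit) for clause in clauses for lit in clause), default=0)
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    for clause in clauses:
        # Each clause is a space-separated list of literals terminated by 0.
        lines.append(" ".join(str(lit) for lit in clause) + " 0")
    return "\n".join(lines) + "\n"


# (a or not b) and (b or c), with a=1, b=2, c=3
example = to_dimacs([[1, -2], [2, 3]])
print(example)
```

The header line `p cnf <variables> <clauses>` is what lets the external compilers size their data structures before reading the clauses.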

### 2. **Applying Heuristics (Optional)**
Heuristics reorder atoms in the program to optimize compilation:
- **Initialization Heuristic (`initialization_heuristic.py`)**: Reorders atoms based on dependency graph descendants.
- **Minfill Heuristic (`minfill.py`)**: Reorders atoms to minimize graph fill-in edges.
- **Mindegree Heuristic (`mindegree.py`)**: Reorders atoms to minimize node degrees.
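
As a rough illustration of the min-fill and min-degree ideas, the sketch below computes a greedy elimination order over an undirected graph stored as a dictionary of neighbor sets. It is a textbook version under stated assumptions, not the code in `minfill.py` or `mindegree.py`: how the repository builds the graph from a program, and how it breaks ties, is not described here, so `elimination_order`, `fill_in`, and the alphabetical tie-break are hypothetical.

```python
from itertools import combinations


def fill_in(graph, node):
    """Count the edges that eliminating `node` would add between its neighbors."""
    return sum(1 for u, v in combinations(sorted(graph[node]), 2) if v not in graph[u])


def elimination_order(graph, heuristic="minfill"):
    """Greedy elimination order for a graph given as {node: set_of_neighbors}."""
    g = {n: set(nbrs) for n, nbrs in graph.items()}  # work on a copy

    def score(n):
        # Ties are broken by node name so the result is deterministic.
        cost = fill_in(g, n) if heuristic == "minfill" else len(g[n])
        return (cost, n)

    order = []
    while g:
        node = min(g, key=score)
        nbrs = g.pop(node)
        # Eliminating a node connects its remaining neighbors into a clique.
        for u, v in combinations(nbrs, 2):
            g[u].add(v)
            g[v].add(u)
        for u in nbrs:
            g[u].discard(node)
        order.append(node)
    return order


# A 4-cycle a-b-c-d with a pendant node e attached to d.
graph = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}
print(elimination_order(graph, "minfill"))
print(elimination_order(graph, "mindegree"))
```

Both heuristics pick the pendant node first here, because removing it adds no edges and it has the lowest degree.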

### 3. **Non-Incremental Transformation (Optional)**
The `non_incremental_heuristic.py` script splits rules into disjoint groups based on shared atoms, enabling efficient compilation.
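
One possible reading of "disjoint groups based on shared atoms" is a connected-components partition: two rules end up in the same group whenever a chain of shared atoms links them. The sketch below implements that reading with a small union-find. The rule representation (a set of atom names per rule) and the name `group_rules` are assumptions for illustration, and the actual script may group rules differently.

```python
def group_rules(rules):
    """Partition rule indices so that rules in different groups share no atom."""
    parent = list(range(len(rules)))

    def find(i):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # atom -> index of the first rule that mentions it
    for idx, atoms in enumerate(rules):
        for atom in atoms:
            if atom in first_seen:
                # Merge this rule's group with the group that already owns the atom.
                parent[find(idx)] = find(first_seen[atom])
            else:
                first_seen[atom] = idx

    groups = {}
    for idx in range(len(rules)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())


# Each rule is represented only by the set of atoms it mentions.
rules = [{"a", "b"}, {"b", "c"}, {"d"}, {"e", "d"}, {"f"}]
print(group_rules(rules))
```

Rules 0 and 1 share `b`, rules 2 and 3 share `d`, and rule 4 stands alone, so the example yields three groups.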

### 4. **Compilation**
The `sdd_compiler.py` script compiles the program into PSDDs, leveraging the structured JSON representation.

---

## Automated Scripts

### **Cleanup Script**
The `cleanup_programs.fish` script removes unnecessary files and directories from the `plp/programs` folder, keeping only `.pasp` files.

### **Parse Script**
The `parse_programs.fish` script automates the pipeline:
1. Parses `.pasp` files into CNF and JSON formats.
2. Applies heuristics (`initialization`, `minfill`, `mindegree`).
3. Splits rules into disjoint groups using the non-incremental heuristic.

### **Experiment Scripts**
These scripts automate experiments with different compilation tools:
- **`sdd_experiments.py`**: Runs experiments using the `sdd_compiler.py`.
- **`d4_experiments.py`**: Runs experiments using the `d4` tool.
- **`c2d_experiments.py`**: Runs experiments using the `c2d` tool.
- **`sharpsat_experiments.py`**: Runs experiments using the `SharpSAT-TD` tool.
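
How the experiment scripts invoke the external tools is not documented here. As a minimal sketch of the general pattern, the helper below runs an arbitrary command with a timeout and records its wall-clock time. The name `run_timed`, the `timeout_s` parameter, and the returned dictionary keys are hypothetical, and the stand-in command simply calls the Python interpreter so that no compiler flags have to be assumed.

```python
import subprocess
import sys
import time


def run_timed(cmd, timeout_s=60.0):
    """Run `cmd` (an argv list) and return its status, wall-clock seconds, and stdout."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        status = "ok" if proc.returncode == 0 else "error"
        stdout = proc.stdout
    except subprocess.TimeoutExpired:
        # The child process is killed by subprocess.run when the timeout expires.
        status, stdout = "timeout", ""
    return {"status": status, "seconds": time.perf_counter() - start, "stdout": stdout}


# Stand-in command; a real run would pass the compiler's argv list instead.
result = run_timed([sys.executable, "-c", "print('compiled')"])
print(result["status"], f"{result['seconds']:.3f}s")
```

Recording a `timeout` status instead of raising keeps a long batch of programs running even when a single instance is too hard for a given tool.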

---

## Program Generation

The `plp/scripts` folder contains scripts to generate PASP programs for various problems:
- **`make_coloring.py`**: Generates graph coloring problems.
- **`make_food.py`**: Generates food preference problems with unrolled constraints.
- **`make_food_seq.py`**: Generates food preference problems with sequential counters.
- **`make_food_totalizer.py`**: Generates food preference problems using totalizer encoding.
- **`make_hmm.py`**: Generates Hidden Markov Model-like problems.
- **`make_irl.py`**: Generates IRL (Inverse Reinforcement Learning) problems.
- **`make_irn.py`**: Generates IRN (Inverse Reinforcement Network) problems.
- **`make_pin.py`**: Generates Probabilistic Interaction Network problems.
- **`make_pin_loop.py`**: Generates PIN problems with loops.
- **`make_queens.py`**: Generates N-Queens problems.
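
To give a feel for what a generator script might emit, the sketch below builds a small probabilistic graph coloring program as text. The concrete `.pasp` syntax accepted by `rule_parser.py` is not shown in this README, so the sketch assumes ProbLog-style annotated facts (`p::atom.`) combined with ordinary ASP rules. The function name `make_coloring`, its parameters, and the predicate names are illustrative and are not taken from `make_coloring.py`.

```python
def make_coloring(edges, colors=("r", "g", "b"), p=0.5):
    """Return program text for a graph whose edges are probabilistic facts.

    Assumed syntax: `p::atom.` for probabilistic facts, standard ASP elsewhere.
    """
    nodes = sorted({n for edge in edges for n in edge})
    lines = [f"node({n})." for n in nodes]
    # Each edge is present with probability p.
    lines += [f"{p}::edge({u},{v})." for u, v in edges]
    # Disjunctive head: every node takes one of the available colors.
    head = " ; ".join(f"color(X,{c})" for c in colors)
    lines.append(f"{head} :- node(X).")
    # Constraint: the endpoints of a present edge cannot share a color.
    lines.append(":- edge(X,Y), color(X,C), color(Y,C).")
    return "\n".join(lines) + "\n"


text = make_coloring([(1, 2), (2, 3)])
print(text)
```

Writing the returned text to a `.pasp` file under `plp/programs` would make it available to the rest of the pipeline, provided the parser accepts this syntax.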

---

## File Outputs

### **Parsing Programs**
- **Input**: `.pasp` file.
- **Output**:
  - `.cnf`: CNF representation of the program.
  - `.json`: JSON representation of the program.

### **Heuristics**
- **Input**: JSON file.
- **Output**:
  - `_init.json`: Reordered JSON using initialization heuristic.
  - `_minfill.json`: Reordered JSON using minfill heuristic.
  - `_mindegree.json`: Reordered JSON using mindegree heuristic.

### **Non-Incremental Transformation**
- **Input**: JSON file.
- **Output**:
  - `_non-incremental.json`: JSON with disjoint rule groups.

### **Compilation**
- **Input**: JSON file.
- **Output**:
  - PSDD files (specific format depends on the compilation tool).

### **Experiment Results**
- **Output**: `.csv` files containing metrics like circuit size, model count, compression rate, and compilation time.

---

## How to Use

### **Pipeline Execution**
1. **Manually**:
   ```sh
   uv run rule_parser.py plp/programs/name_of_program/name_of_program.pasp
   uv run initialization_heuristic.py plp/programs/name_of_program/name_of_program.json
   uv run non_incremental_heuristic.py plp/programs/name_of_program/name_of_program.json
   uv run sdd_compiler.py plp/programs/name_of_program/name_of_program.json
   ```
2. **Automated**: Run the `parse_programs.fish` script:
   ```sh
   ./parse_programs.fish
   ```

### **Cleanup**
Run the `cleanup_programs.fish` script:
```sh
./cleanup_programs.fish
```

### **Experiments**
Run the experiment scripts:
```sh
python sdd_experiments.py base_path programs_file output_dir
python d4_experiments.py base_path programs_file output_dir
python c2d_experiments.py base_path programs_file output_dir
python sharpsat_experiments.py base_path programs_file output_dir
```

### **Program Generation**
Generate PASP programs:
```sh
python plp/scripts/make_coloring.py beginning end --bitcoin bitcoin_dataset.csv
python plp/scripts/make_food.py n_people_start n_people_end m_food
python plp/scripts/make_hmm.py beginning end
```
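
When the manual pipeline steps from **Pipeline Execution** have to be repeated for many programs, the four commands can be assembled programmatically. The sketch below only builds the argument lists, following the `plp/programs/<name>/<name>.pasp` layout used in the manual example. The helper name `pipeline_commands` and the program name `queens` are hypothetical.

```python
def pipeline_commands(name, root="plp/programs"):
    """Return the argv lists for the manual pipeline steps, in order."""
    pasp = f"{root}/{name}/{name}.pasp"
    json_path = f"{root}/{name}/{name}.json"
    return [
        ["uv", "run", "rule_parser.py", pasp],
        ["uv", "run", "initialization_heuristic.py", json_path],
        ["uv", "run", "non_incremental_heuristic.py", json_path],
        ["uv", "run", "sdd_compiler.py", json_path],
    ]


# Print the commands for a hypothetical program called "queens".
cmds = pipeline_commands("queens")
for cmd in cmds:
    print(" ".join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would execute the steps in order and stop at the first failure.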

---

## Code Modules

### **Core Modules**
- **`rule_parser.py`**: Parses PASP programs into CNF and JSON formats.
- **`sdd_compiler.py`**: Compiles PASP programs into PSDDs.
- **`initialization_heuristic.py`**: Applies initialization heuristic.
- **`minfill.py`**: Applies minfill heuristic.
- **`mindegree.py`**: Applies mindegree heuristic.
- **`non_incremental_heuristic.py`**: Splits rules into disjoint groups.

### **Experiment Modules**
- **`sdd_experiments.py`**: Runs experiments with `sdd_compiler.py`.
- **`d4_experiments.py`**: Runs experiments with `d4`.
- **`c2d_experiments.py`**: Runs experiments with `c2d`.
- **`sharpsat_experiments.py`**: Runs experiments with `SharpSAT-TD`.

### **Utility Scripts**
- **`cleanup_programs.fish`**: Cleans up unnecessary files.
- **`parse_programs.fish`**: Automates the pipeline.

### **Program Generation Scripts**
Located in `plp/scripts`, these scripts generate PASP programs for various problems.

---

## Research Context

This repository is designed for **research purposes**. It provides tools to explore PASP compilation and optimization techniques. The modules are self-contained and extensible, so individual pipeline steps can be replaced or adapted for other experiments.
