## Data
LongProc consists of 6 tasks. Each task generally includes three difficulty levels, with the maximum number of output tokens set at 500, 2K, and 8K. The 6 tasks are:
* `html_to_tsv` (HTML to TSV): Extract specified information from HTML pages and structure it into a table format (TSV).
* `pseudo_to_code` (Pseudocode to Code): Translate pseudocode that is structured line-by-line into corresponding C++ code.
* `path_traversal` (Path Traversal): Traverse a route that connects two cities in a graph where each city has only one outgoing connection.
* `tom_tracking` (Theory-of-Mind Tracking): In stories about object placement, track the object locations and character beliefs that the question asks about.
* `countdown` (Countdown): Search to combine a set of numbers with basic arithmetic operations to reach a target number.
* `travel_planning` (Travel Planning): Search to create a trip plan based on constraints regarding the duration of stays and direct flights.

### Example
Please install the necessary packages with `pip install -r requirements.txt`.

```bash
python example_usage.py --dataset path_traversal_0.5k
# dataset names are specified as [task_name]_[length]
```
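
The `[task_name]_[length]` convention can be split back into its two parts. The sketch below is illustrative only: `split_dataset_name` and `TASKS` are hypothetical names that are not part of the repository, and the only length suffix confirmed above is `0.5k`.

```python
from typing import Tuple

# Task names as listed in the Data section above.
TASKS = {
    "html_to_tsv", "pseudo_to_code", "path_traversal",
    "tom_tracking", "countdown", "travel_planning",
}


def split_dataset_name(name: str) -> Tuple[str, str]:
    """Split '[task_name]_[length]' into (task_name, length).

    Hypothetical helper: task names contain underscores themselves,
    so the split happens at the LAST underscore.
    """
    task, sep, length = name.rpartition("_")
    if not sep or not length or task not in TASKS:
        raise ValueError(f"unrecognized dataset name: {name!r}")
    return task, length


print(split_dataset_name("path_traversal_0.5k"))  # ('path_traversal', '0.5k')
```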

### Loading Data and Evaluation Function
Call `load_longproc_data` in `longproc.longproc_data`. The function returns:
* A list of data points, each a dict with `input_prompt` (a string of the prompt), `reference_output` (the ground truth procedure trace), and `item` (some meta info for the data point).
* The corresponding evaluation function for the task. An evaluation function (e.g., `eval_path_traversal` in `longproc.longproc_data`) takes the prediction (a string) and the data point, and returns: 1) metrics, and 2) additional information such as parsed outputs or brief descriptions of the errors.
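
The two return values can be combined into a simple evaluation loop. This is a minimal sketch, not the repository's own code: `run_evaluation`, `predict`, and the toy data and `toy_eval` function are hypothetical stand-ins so that the block runs without the package. The arguments of `load_longproc_data` are not listed here, so check its signature before calling it.

```python
from typing import Any, Callable, Dict, List, Tuple

# With the real package, `data` and `eval_func` would come from the loader:
#   from longproc.longproc_data import load_longproc_data
# (arguments are not shown here; see the function's signature).


def run_evaluation(
    data: List[Dict[str, Any]],
    eval_func: Callable[[str, Dict[str, Any]], Tuple[Any, Any]],
    predict: Callable[[str], str],
) -> List[Tuple[Any, Any]]:
    """Run `predict` on each prompt and score it with `eval_func`.

    Returns one (metrics, additional_info) pair per data point.
    """
    results = []
    for data_point in data:
        prediction = predict(data_point["input_prompt"])
        metrics, info = eval_func(prediction, data_point)
        results.append((metrics, info))
    return results


# Toy stand-ins (assumptions, for illustration only).
toy_data = [
    {"input_prompt": "Say A", "reference_output": "A", "item": {}},
    {"input_prompt": "Say B", "reference_output": "B", "item": {}},
]


def toy_eval(prediction: str, data_point: Dict[str, Any]) -> Tuple[Dict[str, float], Dict[str, str]]:
    # Exact-match metric; the real evaluation functions are task-specific.
    correct = prediction.strip() == data_point["reference_output"]
    return {"accuracy": float(correct)}, {"parsed_output": prediction.strip()}


results = run_evaluation(toy_data, toy_eval, predict=lambda prompt: "A")
print([metrics for metrics, _ in results])
```

Passing the model as a plain `predict` callable keeps the loop independent of any particular inference backend.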
