## LLM Compose
Zhuoyan Xu*, Zhenmei Shi*, Yingyu Liang

This repository contains the benchmark and associated code for the paper
[Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability](https://openreview.net/forum?id=iI1CzEhEMU).

## Requirements

The code is tested on Ubuntu Linux 20.04 with Python 3.11 and requires the following packages:

 - PyTorch >= 1.12.1 (see the [installation guide](https://pytorch.org/get-started/locally/))
 - Transformers >= 4.37
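
To confirm that an existing environment meets the minimum versions above, a quick check along the following lines may help. This is only a sketch and not part of the repository: the helper names `version_ge` and `check_pkg` are hypothetical, it assumes the interpreter is available as `python3`, that the packages are installed under the pip names `torch` and `transformers`, and that GNU `sort -V` is present (as on Ubuntu).

```bash
#!/usr/bin/env bash
# Sketch: check that installed packages meet the minimum versions listed above.
# Assumptions: `python3` is the interpreter; pip names are `torch` and `transformers`.

# Succeeds if version $1 >= version $2 (uses GNU `sort -V` for version ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print whether pip package $1 is installed at version >= $2.
check_pkg() {
    local pkg="$1" min="$2" found
    # Ask Python for the installed version; empty if the package or interpreter is missing.
    found="$(python3 -c "from importlib.metadata import version; print(version('$pkg'))" 2>/dev/null)" || found=""
    found="${found%%+*}"   # drop local build suffixes such as "+cu121"
    if [ -z "$found" ]; then
        echo "$pkg: not installed (need >= $min)"
    elif version_ge "$found" "$min"; then
        echo "$pkg: $found (ok)"
    else
        echo "$pkg: $found (need >= $min)"
    fi
}

check_pkg torch 1.12.1
check_pkg transformers 4.37
```

If a package is reported as missing or too old, it can usually be installed with `pip install "transformers>=4.37"`; for PyTorch, the installation guide linked above gives the command matching your CUDA setup.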


## Usage

### Get Started

- To evaluate the model on logical tasks, run:

```bash
bash run_logic.sh
```


<!-- ### Citing -->

<!-- If you find our code useful, please consider citing:

```
@inproceedings{
    xu2024towards,
    title={Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning},
    author={Zhuoyan Xu and Zhenmei Shi and Junyi Wei and Fangzhou Mu and Yin Li and Yingyu Liang},
    booktitle={The Twelfth International Conference on Learning Representations},
    year={2024},
    url={https://openreview.net/forum?id=1jbh2e0b2K}
}
``` -->
