## Project Description

**TrustLLM_en** is a system for evaluating and monitoring machine learning models in natural language processing. The project includes tools for downloading and processing data, running model inference, collecting metrics, and visualizing the state of experiments through a web interface.

## Contents

- [Project architecture](#project-architecture)
- [Project structure](#project-structure)
- [Installation](#installation)
- [Setup](#setting-up)
- [Project launch](#project-launch)
  - [Backend launch](#backend-launch)
  - [Task handler launch](#launching-the-task-handler)
  - [Launching the metric handler](#launching-the-metric-measurement-handler)
  - [Launching monitoring](#launching-monitoring-the-second-version)
- [Usage](#usage)
- [Contribution to the project](#contribution-to-the-project)
- [License](#license)

## Project architecture

The inference architecture includes the following components:

1. **MongoDB** is the database that stores tasks and results.
2. **LangchainBackend** is the server component that processes requests and interacts with models (it can route requests to Ollama, Yandex, or OpenAI).
3. **StreamlitMonitoring** is a graphical interface that displays metrics and the current status of LLM calculations, and lets you launch model calculations.
4. **TaskRunner** is a program that continuously scans `MongoDB` for tasks with the `pending` status and sends them to `LangchainBackend`, after which the LLM answers are written to `MongoDB`.
5. **MetricRunner** is a program that continuously crawls some tables in `MongoDB` for tasks with the `completed` status, runs regular expressions on them, compares the results found, and uploads the metrics to `MongoDB`.
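
The TaskRunner cycle described above can be sketched roughly as follows. This is a minimal illustration and not the project's implementation: a plain list stands in for a MongoDB collection, and the field names (`prompt`, `response`) and the `generate` callable are assumptions made for the example. Only the `pending` and `completed` statuses come from the description above.

```python
from typing import Callable


def process_pending(tasks: list[dict], generate: Callable[[str], str]) -> int:
    """Send every pending task to the model and store the answer.

    `tasks` is a hypothetical in-memory stand-in for a MongoDB collection.
    `generate` stands in for a request to LangchainBackend; its signature
    is an assumption made for this sketch.
    """
    processed = 0
    for task in tasks:
        if task.get("status") != "pending":
            continue  # skip tasks that were already handled
        task["response"] = generate(task["prompt"])
        task["status"] = "completed"
        processed += 1
    return processed


if __name__ == "__main__":
    demo = [
        {"prompt": "2 + 2 = ?", "status": "pending"},
        {"prompt": "already done", "status": "completed", "response": "ok"},
    ]
    # A lambda replaces the real backend call in this demo.
    print(process_pending(demo, generate=lambda prompt: f"echo: {prompt}"))
```

A real runner would repeat this pass in a loop with a short sleep between iterations and would read and update documents in the database instead of a list.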

## Project structure


```plaintext
.
├── benchmark
│   ├── download_data
│   │   ├── ethics.py   
│   │   └── ruhatespeech.py
│   ├── runers
│   │   ├── add_collumn_collection.py
│   │   ├── add_dataset_field.py   
│   │   ├── drop_collection_pattern.py      
│   │   ├── drop_not_in_models.py     
│   │   ├── drop_pending.py        
│   │   ├── fix_rta_tasks.py         
│   │   ├── revert_old.py         
│   │   ├── revert_rta.py         
│   │   ├── run_metric.py         
│   │   ├── run_regexp.py         
│   │   ├── run_rta_queuer.py     
│   │   ├── run_sync.py       
│   │   ├── task_processor.py     
│   │   ├── update_target.py    
│   │   ├── README.md   
│   │   └── run.py                 
│   └── tasks
│       ├── ethics.py
│       ├── exaggerated_safety.py
│       ├── jailbreak.py
│       ├── misuse.py 
│       ├── new_ethics.py
│       ├── ood.py
│       ├── privacy_awareness.py
│       ├── rubia.py 
│       ├── ruhatespeech.py
│       └── slava.py             
├── data
│   └── init_db.py
├── langchain_back
│   ├── main.py                      
│   ├── README.md
│   └── test.py                     
├── monitoring
│   ├── tasks.py
│   ├── src.py
│   ├── prompts_tasks.py
│   ├── metrics.py
│   ├── dataset_management.py
│   ├── config.yaml
│   └── app_main.py     
├── utils
│   ├── constants.py
│   ├── src.py
│   ├── db_client.py
│   └── sync_task.py                    
├── requirements.txt                     
├── pyproject.toml                   
└── README.md                        
```

## Installation

### Requirements

- **Python 3.11** or higher
- **MongoDB**, installed and running
- **Poetry** for dependency management
- **Ollama** for using local models

### Installation Steps

1. **Clone the repository:**

    ```bash
    git clone <this_repo>
    cd TrustLLM_ru
    ```

2. **Install Poetry (if not already installed):**

    Follow the [official instructions](https://python-poetry.org/docs/#installation) to install Poetry.

3. **Install dependencies:**

    ```bash
    poetry install
    ```

4. **Create a `.env` file in the root of the project and add the necessary environment variables:**

    ```dotenv
    MONGO_INITDB_ROOT_USERNAME=username
    MONGO_INITDB_ROOT_PASSWORD=password
    MONGO_INITDB_ROOT_PORT=port
    MONGO_HOST=host
    YANDEX_API_KEY=your_key
    YANDEX_MODEL_URI=your_uri
    YANDEX_BASE_URL=https://llm.api.cloud.yandex.net/v1
    API_URL=http://langchain_backend:45321/generate
    OPENAI_KEY=your_openai_key
    OPENAI_BASE_URL=<optional>
    OLLAMA_BASE_URL=http://host.docker.internal:12345
    ```

    Replace the values with your own.
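
As an illustration of how the MongoDB variables above might be combined, the sketch below assembles a standard `mongodb://` connection URI from them. The helper name `build_mongo_uri` is hypothetical and is not taken from the project; the credentials are percent-encoded so that characters such as `@` or `:` in a password do not break the URI.

```python
from collections.abc import Mapping
from urllib.parse import quote_plus


def build_mongo_uri(env: Mapping[str, str]) -> str:
    """Assemble a MongoDB connection URI from the variables listed in `.env`.

    Hypothetical helper for illustration; the project may build its
    connection differently.
    """
    user = quote_plus(env["MONGO_INITDB_ROOT_USERNAME"])
    password = quote_plus(env["MONGO_INITDB_ROOT_PASSWORD"])
    host = env["MONGO_HOST"]
    port = int(env["MONGO_INITDB_ROOT_PORT"])  # fails early if the port is not a number
    return f"mongodb://{user}:{password}@{host}:{port}/"
```

With the variables loaded into the process environment, `build_mongo_uri(os.environ)` would return a string that a MongoDB client can accept as its connection URI.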

## Setting up

Make sure that MongoDB is running and accessible using the parameters specified in `.env`. Also make sure that the API keys and other environment variables are configured correctly.
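
A quick way to catch configuration mistakes is to check for empty or missing variables before starting any component. The sketch below is one possible check; the helper name and the choice of which variables count as required are assumptions, so adjust the list to the providers you actually use.

```python
from collections.abc import Mapping

# Assumed minimum set; the provider keys (Yandex, OpenAI, Ollama) are left out
# because which of them is needed depends on the models you run.
REQUIRED_VARS = (
    "MONGO_INITDB_ROOT_USERNAME",
    "MONGO_INITDB_ROOT_PASSWORD",
    "MONGO_INITDB_ROOT_PORT",
    "MONGO_HOST",
    "API_URL",
)


def missing_vars(env: Mapping[str, str], required=REQUIRED_VARS) -> list[str]:
    """Return the names of required variables that are absent or blank."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    import os

    missing = missing_vars(os.environ)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```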


## Project Launch

### I. Docker Launch

`docker compose -f docker-compose.yml up -d --build`

### II. Launch with `screen`

### Backend Launch

The backend is responsible for processing requests and interacting with models.

1. **Create a `screen` session:**

    ```bash
    screen -S langchain_back
    ```

2. **Activate the virtual environment:**

    ```bash
    source .venv/bin/activate
    ```

3. **Launch the backend:**

    ```bash
    python langchain_back/main.py
    ```

### Launching the task handler

The task handler is responsible for executing tasks stored in MongoDB.

```bash
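# Create a new screen session and attach to it
screen -S my_session

# Inside the session, start the task handler
path_to_project/TrustLLM_ru/.venv/bin/python path_to_project/TrustLLM_ru/benchmark/runers/run.py

# Detach with Ctrl+A, then D; reattach later with:
screen -r my_session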
```

### Launching the handler for adding tasks to the RtA queue

The handler adds tasks from all collections in MongoDB to the RtA queue for further processing.


```bash
screen -S run_rta

path_to_project/TrustLLM_ru/.venv/bin/python path_to_project/TrustLLM_ru/benchmark/runers/run_rta_queuer.py

screen -r run_rta
```

### Launching the Metric measurement handler

This handler is responsible for processing tasks in the RtA collection and calculating metrics.

```bash
screen -S run_metric

path_to_project/TrustLLM_ru/.venv/bin/python path_to_project/TrustLLM_ru/benchmark/runers/run_metric.py

screen -r run_metric
```


### Launching task_runner

The task processor reads the tasks listed in the tasks table, checks whether up-to-date records exist for each of them, and creates the corresponding queues.

```bash
screen -S task_runner

path_to_project/TrustLLM_ru/.venv/bin/python path_to_project/TrustLLM_ru/benchmark/runers/task_processor.py

screen -r task_runner
```

### Launching the extractor from LLM responses

This handler extracts answers from LLM responses with regular expressions.

```bash
screen -S run_regexp

path_to_project/TrustLLM_ru/.venv/bin/python path_to_project/TrustLLM_ru/benchmark/runers/run_regexp.py

screen -r run_regexp
```


### Launching monitoring the second version

A new version of monitoring that allows you to select a metric and display the corresponding tables based on it.

```bash
screen -S monitoring2

path_to_project/TrustLLM_ru/.venv/bin/python -m streamlit run path_to_project/TrustLLM_ru/monitoring/app_streamlit.py --server.port port

screen -r monitoring2
```

## Usage

### Launching all components

For the system to work properly, all the main components must be started:

1. **Backend**: Query processing and interaction with models.
2. **Task Handler**: Retrieving and executing tasks from MongoDB.
3. **Metric Handler**: Collecting and calculating metrics based on completed tasks.
4. **Monitoring**: A web interface for monitoring the status of experiments.

### Step-by-step launch

1. **Run MongoDB** on your server or local machine.
2. **Configure the environment variables** in the `.env` file.
3. **Activate the virtual environment** and install dependencies using Poetry.
4. **Launch all components** according to the instructions in the [Project Launch](#project-launch) section.

### Adding tasks

To add tasks for processing:

1. **Use scripts from `benchmark/src.py`, for example, `load_task_mango`.**
2. **Specify the models and flavors** that need to be processed.
3. **Tasks will be automatically processed by the task handler** and the results will be saved in MongoDB.
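
Conceptually, adding tasks means creating one `pending` record per model and prompt so that the task handler can pick them up. The sketch below shows that expansion with plain dictionaries. It does not reproduce `load_task_mango`; the function name and document fields are hypothetical.

```python
from itertools import product


def build_tasks(models: list[str], prompts: list[str], dataset: str) -> list[dict]:
    """Create one pending task document for every (model, prompt) pair.

    The document layout is an assumption made for this sketch.
    """
    return [
        {"model": model, "dataset": dataset, "prompt": prompt, "status": "pending"}
        for model, prompt in product(models, prompts)
    ]


if __name__ == "__main__":
    docs = build_tasks(["model-a", "model-b"], ["Q1", "Q2"], dataset="ethics")
    print(len(docs), docs[0])
```

In a real setup the resulting documents would be inserted into a MongoDB collection rather than kept in a list.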


