# ScreenExplorer-Review

A reinforcement learning framework for screen-based environment exploration and interaction.

## Overview

ScreenExplorer trains AI agents to explore and interact with screen-based environments, such as graphical user interfaces, using reinforcement learning. The agent relies on a world model to navigate these interfaces.

## Demos

Watch our agent in action:

**3B Model Performance:**  
[ScreenExplorer-3B-E1 Video](https://lhcos-caf6d-1302173427.cos.ap-singapore.myqcloud.com/ScreenExplorer-3B-E1.mov)

**7B Model Performance:**  
[ScreenExplorer-7B-E1 Video](https://lhcos-caf6d-1302173427.cos.ap-singapore.myqcloud.com/ScreenExplorer-7B-E1.mov)

## Project Structure

- `exploration_reward.py`: Implements reward functions for exploration
- `modeling_llama_world_model.py`: Contains the LLaMA-based world model implementation
- `rollout_buffer.py`: Manages experience rollouts for training
- `train_explorer.py`: Main training script for the explorer agent
- `utils.py`: Utility functions
- `screen_env/`: Module for screen environment interactions
  - `asyncvnc.py`: Asynchronous VNC client for screen interaction
  - `screen_env.py`: Environment wrapper for screen-based interaction
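
To make the role of a rollout buffer more concrete, here is a minimal sketch of how experience might be stored and turned into discounted returns for training. It is an illustration only and is not taken from `rollout_buffer.py`: the `Transition` and `RolloutBuffer` names, their fields, and the `gamma` parameter are assumptions chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Transition:
    """One environment step (hypothetical fields, not the repository's schema)."""
    observation: Any  # e.g. a screenshot or an encoding of it
    action: Any       # e.g. a mouse or keyboard action
    reward: float     # e.g. an exploration reward
    done: bool        # True if the episode ended after this step


@dataclass
class RolloutBuffer:
    """Hypothetical buffer that collects transitions and computes returns."""
    gamma: float = 0.99  # discount factor (assumed value)
    transitions: List[Transition] = field(default_factory=list)

    def add(self, observation: Any, action: Any, reward: float, done: bool = False) -> None:
        self.transitions.append(Transition(observation, action, float(reward), bool(done)))

    def discounted_returns(self) -> List[float]:
        """Return G_t = r_t + gamma * G_{t+1}, resetting at episode boundaries."""
        returns = [0.0] * len(self.transitions)
        running = 0.0
        # Walk backwards so each return can reuse the one that follows it.
        for i in range(len(self.transitions) - 1, -1, -1):
            step = self.transitions[i]
            if step.done:
                running = 0.0  # nothing carries over from the next episode
            running = step.reward + self.gamma * running
            returns[i] = running
        return returns

    def clear(self) -> None:
        self.transitions.clear()

    def __len__(self) -> int:
        return len(self.transitions)


# Usage example with made-up observations and actions.
buffer = RolloutBuffer(gamma=0.5)
buffer.add("screen_0", "click", 1.0)
buffer.add("screen_1", "type", 0.0)
buffer.add("screen_2", "scroll", 2.0, done=True)
print(buffer.discounted_returns())  # [1.5, 1.0, 2.0]
```

Computing returns in a single backward pass keeps the cost linear in the number of stored steps. The actual buffer in this repository may store different fields or use a different return or advantage estimator.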

## Usage

To train an explorer agent:

```
python train_explorer.py
```
