# WebTester

A Python class for automated website testing using Playwright. The WebTester can start a website service, control a browser, execute predefined actions, and validate results.

There are two implementations:

1. `WebTester` - A simpler version that executes predefined actions
2. `WebAgentTester` - A more advanced version that implements an agent loop with bounding boxes and can integrate with multimodal models

## Features

- Starts a website service in a specified directory using a custom command
- Controls a Chromium browser using Playwright
- Executes predefined actions:
  - Click [Numerical_Label]
  - Type [Numerical_Label]; [Content]
  - Scroll [Numerical_Label or WINDOW]; [up or down]
  - Wait
  - GoBack
  - ANSWER; [content]
  - Drag [Coordinate_Begin] [Coordinate_End]
- Validates test results against expected outcomes
- Agent loop implementation with bounding boxes on interactive elements
- Integration with multimodal models (OpenAI GPT-4V) for decision making

## Installation

Make sure you have Python 3.7+ installed, then install the required dependencies:

```bash
pip install -r requirements.txt
playwright install chromium
```

## Usage

### Simple WebTester

```python
from web_tester import WebTester

# Create a tester instance
tester = WebTester(
    directory_path="/path/to/website",
    start_command="python -m http.server 8000",
    instruction="""Click 1
Type 2; Hello World
Click 3""",
    expected_result="Form submitted successfully"
)

# Run the test
result = tester.run_test()
print(f"Test {'PASSED' if result else 'FAILED'}")
```

### Advanced WebAgentTester

```python
from web_agent_tester import WebAgentTester

# Create a tester instance with OpenAI API key for model integration
tester = WebAgentTester(
    directory_path="/path/to/website",
    start_command="python simple_server.py",
    instruction="Find the contact information on the website",
    expected_result="Contact information found",
    api_key="your-openai-api-key"  # Optional; simulated responses are used if omitted
)

# Run the test
result = tester.run_test()
print(f"Test {'PASSED' if result else 'FAILED'}")
```

## Predefined Actions

The instruction string can contain multiple actions separated by newlines:

- `Click 1` - Clicks the element with `data-label="1"`
- `Type 2; Hello World` - Types "Hello World" into the element with `data-label="2"`
- `Scroll WINDOW; down` - Scrolls the window down
- `Scroll 1; up` - Scrolls the element with `data-label="1"` up
- `Wait` - Waits for 1 second
- `GoBack` - Goes back in browser history
- `ANSWER; content` - Provides a test answer
- `Drag [100,200] [300,400]` - Drags from coordinates (100,200) to (300,400)
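
To illustrate the instruction format above, here is a minimal sketch of how such a string could be split into structured actions. It is not the parser used by `WebTester`; the names `Action`, `parse_action`, and `parse_instruction` are hypothetical, and the strict handling of unknown lines is an assumption.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

Point = Tuple[int, int]


class Action(NamedTuple):
    """Hypothetical container for one parsed instruction line."""
    name: str
    target: Optional[str] = None            # numerical label or "WINDOW"
    content: Optional[str] = None           # text to type, scroll direction, or answer
    coords: Optional[Tuple[Point, Point]] = None  # drag start and end


# Matches lines such as: Drag [100,200] [300,400]
_DRAG_RE = re.compile(r"^Drag\s+\[(\d+),\s*(\d+)\]\s+\[(\d+),\s*(\d+)\]$")


def parse_action(line: str) -> Action:
    """Parse one line of the instruction string into an Action."""
    line = line.strip()
    if line in ("Wait", "GoBack"):
        return Action(line)
    drag = _DRAG_RE.match(line)
    if drag:
        x1, y1, x2, y2 = map(int, drag.groups())
        return Action("Drag", coords=((x1, y1), (x2, y2)))
    if line.startswith("ANSWER;"):
        return Action("ANSWER", content=line.split(";", 1)[1].strip())
    # Remaining forms: "<Name> <target>" optionally followed by "; <content>"
    head, _, rest = line.partition(";")
    parts = head.split()
    content = rest.strip() or None
    if len(parts) == 2:
        name, target = parts
        if name == "Click" and content is None:
            return Action(name, target)
        if name == "Type" and content is not None:
            return Action(name, target, content)
        if name == "Scroll" and content in ("up", "down"):
            return Action(name, target, content)
    raise ValueError(f"Unrecognized action: {line!r}")


def parse_instruction(instruction: str) -> List[Action]:
    """Parse a newline-separated instruction string, skipping blank lines."""
    return [parse_action(l) for l in instruction.splitlines() if l.strip()]
```

Rejecting unrecognized lines with a `ValueError` makes typos in a test script fail early; a more lenient parser could skip them instead.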

## Agent Loop Implementation

The `WebAgentTester` implements an agent loop that:

1. Captures screenshots of the web page
2. Adds bounding boxes with numerical labels to interactive elements
3. Sends the annotated screenshot to a multimodal model (if an API key is provided)
4. Executes the action returned by the model
5. Repeats until the task is completed or the maximum number of iterations is reached
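
The steps above can be sketched as a small loop. This is only an illustration of the control flow, not the code in `WebAgentTester`: the function name `run_agent_loop`, its callable parameters, the default of 10 iterations, and the choice to treat an `ANSWER; ...` action as the completion signal are all assumptions.

```python
from typing import Callable, Optional


def run_agent_loop(
    capture: Callable[[], bytes],         # step 1: take a screenshot
    annotate: Callable[[bytes], bytes],   # step 2: draw labelled bounding boxes
    decide: Callable[[bytes], str],       # step 3: ask the model (or a stub) for an action
    execute: Callable[[str], None],       # step 4: perform the action in the browser
    max_iterations: int = 10,             # assumed default, not taken from the project
) -> Optional[str]:
    """Run the loop and return the answer text, or None if the limit is hit."""
    for _ in range(max_iterations):
        screenshot = capture()
        annotated = annotate(screenshot)
        action = decide(annotated).strip()
        # Assumption: an ANSWER action means the task is complete.
        if action.startswith("ANSWER;"):
            return action.split(";", 1)[1].strip()
        execute(action)
    # Step 5: stop once the iteration budget is exhausted.
    return None
```

Passing the four steps in as callables keeps the loop testable without a browser or an API key: each one can be replaced by a stub that returns canned data.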

## Example

See `test_webtester.py` and `test_web_agent_tester.py` for complete examples of how to use the testers.

## Testing Your Website

To test your website, make sure your HTML elements have `data-label` attributes that correspond to the numerical labels used in your instructions:

```html
<button data-label="1">Click Me</button>
<input data-label="2" type="text" placeholder="Enter text here">
<button data-label="3">Submit</button>
```
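
For reference, a numerical label can be mapped to such an element with a CSS attribute selector. The helpers below are a hedged sketch rather than the project's own code: `label_selector`, `click_label`, and `type_into_label` are hypothetical names, and `page` is assumed to be a Playwright `Page` (only its `locator()` method and the locator's `click()` and `fill()` methods are used).

```python
def label_selector(label: str) -> str:
    """Build a CSS attribute selector for a numerical data-label."""
    if not label.isdigit():
        raise ValueError(f"Expected a numerical label, got {label!r}")
    return f'[data-label="{label}"]'


def click_label(page, label: str) -> None:
    """Click the element carrying the given data-label."""
    page.locator(label_selector(label)).click()


def type_into_label(page, label: str, text: str) -> None:
    """Fill the element carrying the given data-label with text."""
    page.locator(label_selector(label)).fill(text)
```

With the HTML above, `click_label(page, "3")` would target the Submit button.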