# Wyze Rule Recommendation Dataset

## Introduction
The Wyze Rule Recommendation Dataset is a subset of a larger dataset containing information about rules set by Wyze users. It has been fully anonymized to protect the privacy of our users. The dataset is intended for use only by researchers who have agreed to the [terms of use](https://forms.gle/FS5n8uVW4gEX4Mor6). This smaller dataset is shared solely for the purpose of review and must not be disclosed beyond that purpose.

## Contents
The dataset contains two files:

- Rules Dataset (`rule.csv`): This file contains information about the rules that control the behavior of Wyze smart home devices. Each row represents a single rule and includes the `user_id`, `trigger_device`, `trigger_device_id`, `trigger_state`, `action_device`, `action_device_id`, and `action`.
- Devices Dataset (`device.csv`): This file contains information about the devices that users own. Each row represents a single device and includes the `user_id`, `device_id`, and `device_model`.

## Usage
This dataset can be used for various purposes, including developing and testing a rule recommendation system that suggests new rules to users.
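
As a rough illustration of what a rule recommendation baseline could look like, the sketch below ranks rule patterns by popularity and suggests the most common ones a user has not set yet. This is not the method from the paper; it is a minimal popularity baseline. The helper names (`rule_pattern`, `recommend_popular`) and all sample values are hypothetical, and treating a rule's pattern as its `trigger_device`, `trigger_state`, `action_device`, and `action` (with device IDs excluded) is an assumption.

```python
from collections import Counter


def rule_pattern(rule):
    """Device-independent signature of a rule (assumption: IDs are excluded)."""
    return (
        rule["trigger_device"],
        rule["trigger_state"],
        rule["action_device"],
        rule["action"],
    )


def recommend_popular(rules, user_id, top_k=3):
    """Return the most common rule patterns that user_id has not set yet."""
    counts = Counter(rule_pattern(r) for r in rules)
    owned = {rule_pattern(r) for r in rules if r["user_id"] == user_id}
    ranked = [p for p, _ in counts.most_common() if p not in owned]
    return ranked[:top_k]


def make_rule(user_id, trigger_device, trigger_state, action_device, action):
    """Build a rule row as a dict; only the columns used above are filled in."""
    return {
        "user_id": user_id,
        "trigger_device": trigger_device,
        "trigger_state": trigger_state,
        "action_device": action_device,
        "action": action,
    }


# Hypothetical sample rows; the values are made up and not taken from the dataset.
sample_rules = [
    make_rule("u1", "Camera", "motion", "Plug", "power_on"),
    make_rule("u2", "Camera", "motion", "Plug", "power_on"),
    make_rule("u3", "Camera", "motion", "Plug", "power_on"),
    make_rule("u2", "Sensor", "open", "Bulb", "turn_on"),
    make_rule("u3", "Sensor", "open", "Bulb", "turn_on"),
    make_rule("u3", "Camera", "motion", "Bulb", "turn_on"),
]

suggestions = recommend_popular(sample_rules, "u1", top_k=2)
```

A fuller system would likely also check that the user owns devices compatible with each suggested pattern, which the devices dataset makes possible.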

## Data Format
The data in the rules dataset is organized in the following columns:

- `user_id`: Unique identifier for the user
- `trigger_device`: The device that triggers the rule
- `trigger_device_id`: Unique identifier for the trigger device
- `trigger_state`: The state of the trigger device that causes the rule to be executed
- `action_device`: The device that performs the action
- `action_device_id`: Unique identifier for the action device
- `action`: The action that is performed when the rule is executed

The data in the devices dataset is organized in the following columns:

- `user_id`: Unique identifier for the user
- `device_id`: Unique identifier for the device
- `device_model`: The model of the device
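
The column lists above can be checked at load time. The following is a minimal sketch using Python's standard `csv` module; it assumes that both files start with a header row carrying exactly these column names, which the description above suggests but does not state. The helper `read_rows` and the sample values are hypothetical and exist only for illustration.

```python
import csv
import io

RULE_COLUMNS = [
    "user_id", "trigger_device", "trigger_device_id", "trigger_state",
    "action_device", "action_device_id", "action",
]
DEVICE_COLUMNS = ["user_id", "device_id", "device_model"]


def read_rows(handle, expected_columns):
    """Read CSV rows as dicts and verify that the expected columns exist."""
    reader = csv.DictReader(handle)
    missing = set(expected_columns) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


# Hypothetical in-memory samples; the values are made up, not real data.
SAMPLE_RULES = (
    "user_id,trigger_device,trigger_device_id,trigger_state,"
    "action_device,action_device_id,action\n"
    "u1,Camera,d1,motion,Plug,d2,power_on\n"
)
SAMPLE_DEVICES = (
    "user_id,device_id,device_model\n"
    "u1,d1,model_a\n"
    "u1,d2,model_b\n"
)

rules = read_rows(io.StringIO(SAMPLE_RULES), RULE_COLUMNS)
devices = read_rows(io.StringIO(SAMPLE_DEVICES), DEVICE_COLUMNS)
```

With the real files, the same helper could be called on a file handle, for example `with open("rule.csv", newline="") as f: rules = read_rows(f, RULE_COLUMNS)`.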

### Note:
- The devices dataset is provided because some devices are owned by users but not included in any rule. Having this dataset makes the graph of users complete.

- This dataset is a subset of the original dataset. It contains the data of 2000 users with more than 6500 rules. To access the full dataset, you need to sign the consent form to agree to the [terms of use](https://forms.gle/FS5n8uVW4gEX4Mor6).
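
To illustrate the first note, the sketch below finds devices that a user owns but that never appear in any of that user's rules. It assumes that `device_id` in the devices dataset uses the same identifiers as `trigger_device_id` and `action_device_id` in the rules dataset; the function name `devices_without_rules` and the sample rows are hypothetical.

```python
def devices_without_rules(rules, devices):
    """Return device rows whose (user_id, device_id) never appears in a rule."""
    used = set()
    for rule in rules:
        # A device counts as used if it is either the trigger or the action.
        used.add((rule["user_id"], rule["trigger_device_id"]))
        used.add((rule["user_id"], rule["action_device_id"]))
    return [d for d in devices if (d["user_id"], d["device_id"]) not in used]


# Hypothetical rows; only the columns needed here are filled in.
example_rules = [
    {"user_id": "u1", "trigger_device_id": "d1", "action_device_id": "d2"},
]
example_devices = [
    {"user_id": "u1", "device_id": "d1", "device_model": "model_a"},
    {"user_id": "u1", "device_id": "d2", "device_model": "model_b"},
    {"user_id": "u1", "device_id": "d3", "device_model": "model_c"},
]

isolated = devices_without_rules(example_rules, example_devices)
```

Pairing each device ID with its `user_id` is a conservative choice here, since the source does not say whether device IDs are unique across users.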

## Experiments
You can run the experiments in the paper using the following scripts. Note that these scripts run the experiments on the subset dataset, not the full dataset.

### Matrix Factorization
```cli
python matrix_factorization.py
```

### GraphRule
```cli
python main.py -ce -lr 0.1 -m graphsage -c 100
```


### FedAvg
```cli
python main.py -lr 0.1 -i 3 -c 100
```

## Contact Information
If you have any questions or concerns about the dataset, please contact [mkamani@wyze.com](mailto:mkamani@wyze.com).