# Dataset Preparation

## Label Preparation

### Format of Each File

We recommend organizing the labels in the same format as "repcount" so that the same Dataset file (repcount_dataset.py) can be reused.

Format of "repcount":

![Untitled](img/datasets_label.png)

The first two columns are row numbers and can be ignored.

The third column represents the type of motion. It is not used, so it can contain any value.

The fourth column represents the video name.

The fifth column represents the total count of cycles.

The sixth column represents the total number of frames.

The seventh column and beyond represent the start and end frame numbers of each cycle.
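The column layout above can be read with a few lines of Python. This is a minimal sketch, not the code in repcount_dataset.py: the helper name `parse_repcount_row` and the returned keys are hypothetical, the sample row is made up, and it assumes that empty trailing cells (for videos with fewer cycles than the widest row) should be skipped.

```python
import csv
import io


def parse_repcount_row(row):
    """Parse one data row of a repcount-style CSV (hypothetical helper)."""
    # Columns 0-2 (two row numbers and the motion type) are ignored.
    name = row[3]
    count = int(float(row[4]))
    num_frames = int(float(row[5]))
    # Remaining non-empty cells alternate: start frame, end frame.
    bounds = [int(float(cell)) for cell in row[6:] if cell.strip()]
    if len(bounds) % 2 != 0:
        raise ValueError(f"{name}: odd number of cycle boundaries")
    cycles = list(zip(bounds[0::2], bounds[1::2]))
    return {"name": name, "count": count, "num_frames": num_frames, "cycles": cycles}


# Made-up row used only to illustrate the layout.
sample = "0,0,squat,video_001.mp4,2,120,10,50,60,110,,\n"
record = parse_repcount_row(next(csv.reader(io.StringIO(sample))))
print(record)
```

If the annotations follow the description above, `record["count"]` should equal `len(record["cycles"])`, which makes a cheap consistency check before training.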

### Folder Structure

Choose a folder and place three files inside it: train.csv, valid.csv, and test.csv.

# Feature Extraction for Training

Because there is currently no backbone integration, features must be extracted from the video files before training.

## Extract Raw Frames

```bash
# --new-short 256 resizes the shorter edge of each frame to 256 pixels
python tools/data/build_rawframes.py /path/to/video/folder/train /path/to/save/frames/folder/train --level 1 --ext mp4 --task rgb --new-short 256 --use-opencv
python tools/data/build_rawframes.py /path/to/video/folder/valid /path/to/save/frames/folder/valid --level 1 --ext mp4 --task rgb --new-short 256 --use-opencv
python tools/data/build_rawframes.py /path/to/video/folder/test /path/to/save/frames/folder/test --level 1 --ext mp4 --task rgb --new-short 256 --use-opencv
```
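The three commands above differ only in the split name, so they can be generated in a loop. The sketch below only builds and prints the commands; the function name `build_rawframes_commands` is hypothetical, and the flags are copied from the commands above rather than taken from the script's help text.

```python
import shlex

SPLITS = ("train", "valid", "test")


def build_rawframes_commands(video_root, frames_root, splits=SPLITS):
    """Return one build_rawframes.py argument list per split."""
    commands = []
    for split in splits:
        commands.append([
            "python", "tools/data/build_rawframes.py",
            f"{video_root}/{split}", f"{frames_root}/{split}",
            "--level", "1", "--ext", "mp4", "--task", "rgb",
            "--new-short", "256", "--use-opencv",
        ])
    return commands


commands = build_rawframes_commands("/path/to/video/folder", "/path/to/save/frames/folder")
for cmd in commands:
    print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` to execute the split instead of printing it.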

## Get File Names

```bash
# Edit the paths inside the script before running it
python tools/data/build_label_list.py
```
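The contents of build_label_list.py are not shown here. As a rough illustration of what such a step might do, the sketch below writes one extracted-frame folder per line to a text file. The list format (one folder path per line), the function name `write_label_list`, and the demo folder names are all assumptions, so check them against the actual script before relying on this.

```python
import os
import tempfile


def write_label_list(frames_dir, list_path):
    """Write one frame-folder path per line (assumed list format)."""
    folders = sorted(
        entry.path for entry in os.scandir(frames_dir) if entry.is_dir()
    )
    with open(list_path, "w") as f:
        for folder in folders:
            f.write(folder + "\n")
    return folders


# Demo on a throwaway directory with two made-up frame folders.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("video_b", "video_a"):
        os.makedirs(os.path.join(tmp, "frames", name))
    list_path = os.path.join(tmp, "annt_file_train.txt")
    written = write_label_list(os.path.join(tmp, "frames"), list_path)
    with open(list_path) as f:
        lines = f.read().splitlines()

print([os.path.basename(p) for p in written])
```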

## Extract Features

```bash
CUDA_VISIBLE_DEVICES=7 python tools/data/activitynet/tsn_feature_extraction.py --data-list ./datasets/LLSP/temp/annt_file_train.txt --output-prefix ./datasets/LLSP/feature-frame/train/ --modality RGB --ckpt ./checkpoints/tsn_r50_320p_1x1x3_100e_kinetics400_rgb_20200702-cc665e2a.pth --frame-interval 1
CUDA_VISIBLE_DEVICES=3 python tools/data/activitynet/tsn_feature_extraction.py --data-list ./datasets/LLSP/temp/annt_file_test.txt --output-prefix ./datasets/LLSP/feature-frame/test/ --modality RGB --ckpt ./checkpoints/tsn_r50_320p_1x1x3_100e_kinetics400_rgb_20200702-cc665e2a.pth --frame-interval 1
CUDA_VISIBLE_DEVICES=7 python tools/data/activitynet/tsn_feature_extraction.py --data-list ./datasets/LLSP/temp/annt_file_valid.txt --output-prefix ./datasets/LLSP/feature-frame/valid/ --modality RGB --ckpt ./checkpoints/tsn_r50_320p_1x1x3_100e_kinetics400_rgb_20200702-cc665e2a.pth --frame-interval 1
```

## Merge into H5

```bash
python React/combine_pkl_to_h5.py
```

# Prepare Training Configuration File

Copy React/config/repcount_tsn_feature.py and edit the following settings in the copy:

```python
# Dataset settings
dataset_type = "RepCountDataset"
data_root_train = "./datasets/LLSP/feature/train_rgb.h5"
data_root_val = "./datasets/LLSP/feature/valid_rgb.h5"
data_root_test = "./datasets/LLSP/feature/test_rgb.h5"
flow_root_train = None
flow_root_val = None

ann_file_train = "./datasets/LLSP/annotation/train_new.csv"
ann_file_val = "./datasets/LLSP/annotation/valid_new.csv"
ann_file_test = "./datasets/LLSP/annotation/test_new.csv"

# Work directory
work_dir = "./tmp"
```
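Before starting a run, it can help to confirm that every path in the edited config exists. The following is a minimal sketch, assuming the config is a plain Python file that executes on its own (no base-config inheritance or project-specific loader); the helper name `missing_config_paths` is hypothetical, and the demo config is a throwaway file created only for illustration.

```python
import os
import runpy
import tempfile

PATH_KEYS = (
    "data_root_train", "data_root_val", "data_root_test",
    "ann_file_train", "ann_file_val", "ann_file_test",
)


def missing_config_paths(config_path, keys=PATH_KEYS):
    """Return {key: path} for configured paths that do not exist on disk."""
    settings = runpy.run_path(config_path)  # executes the config as plain Python
    missing = {}
    for key in keys:
        value = settings.get(key)
        # None or an absent key means "not configured", so skip it.
        if value is not None and not os.path.exists(value):
            missing[key] = value
    return missing


# Demo: a throwaway config where one path exists and one does not.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "train_rgb.h5")
    open(existing, "w").close()
    config_file = os.path.join(tmp, "my_config.py")
    with open(config_file, "w") as f:
        f.write(f"data_root_train = {existing!r}\n")
        f.write("ann_file_train = './no/such/train_new.csv'\n")
    report = missing_config_paths(config_file)

print(report)
```

An empty result means every configured path was found. Relative paths such as `./datasets/...` are resolved against the current working directory, so run the check from the same directory as the training command.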

# Training

```bash
# Pass the path to your modified copy of the config
CUDA_VISIBLE_DEVICES=7 python tools/train.py React/config/repcount_tsn_feature.py --validate
```