## `100driver-sf3d-nc10` dataset creation

Prepare the mapping from 100-driver onto sf3d to create the `100driver-sf3d-nc10` dataset.

```bash
bash scripts/prepare_100driver_sf3d.sh
```

The outputs are the splits, indices, mosaic, plots and summary. The dataset uses these indices together with the same image files from the full dataset repository.

  ```bash
  logs/100driver-sf3d-<ddmmyy_hhmmss>/
  ├── D2_test_sf3d_nc10.txt
  ├── D2_train_sf3d_nc10.txt
  ├── D2_val_sf3d_nc10.txt
  ├── indices
  │    ├── all_index.csv
  │    ├── test_index.csv
  │    ├── train_index.csv
  │    └── val_index.csv
  ├── mosaic.png
  ├── plots
  │    ├── split_distribution.png
  │    ├── test_orig_vs_remapped_raw.png
  │    ├── test_orig_vs_remapped_weighted.png
  │    ├── test_per_class_remapped.png
  │    ├── test_per_vehicle.png
  │    ├── TOTAL_orig_vs_remapped_raw.png
  │    ├── TOTAL_orig_vs_remapped_weighted.png
  │    ├── TOTAL_per_class_remapped.png
  │    ├── train_orig_vs_remapped_raw.png
  │    ├── train_orig_vs_remapped_weighted.png
  │    ├── train_per_class_remapped.png
  │    ├── train_per_vehicle.png
  │    ├── val_orig_vs_remapped_raw.png
  │    ├── val_orig_vs_remapped_weighted.png
  │    ├── val_per_class_remapped.png
  │    └── val_per_vehicle.png
  └── summary.json
  ```
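A quick sanity check on a finished run directory could look like the sketch below. The file names come from the listing above (plots are left out for brevity); `missing_outputs` is a hypothetical helper and not part of the repository scripts.

```python
from pathlib import Path

# Files the listing above shows for a run (plots omitted for brevity).
EXPECTED_OUTPUTS = [
    "D2_train_sf3d_nc10.txt",
    "D2_val_sf3d_nc10.txt",
    "D2_test_sf3d_nc10.txt",
    "indices/all_index.csv",
    "indices/train_index.csv",
    "indices/val_index.csv",
    "indices/test_index.csv",
    "mosaic.png",
    "summary.json",
]


def missing_outputs(run_dir):
    """Return the expected relative paths that do not exist under run_dir."""
    root = Path(run_dir)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]
```

Calling `missing_outputs` on the `logs/100driver-sf3d-<ddmmyy_hhmmss>/` directory of a run returns an empty list when all listed files are present.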

* Copy or symlink the data to create the directory structure shown below. The following entry already exists in `data/ddd-datasets.yml`, where `100-driver-day-cam2-sf3d-nc10` is the `dataset-id`. The ID must be unique; you can create any number of IDs as long as they are unique.

    ```yaml
    100-driver-day-cam2-sf3d-nc10:
      loadertxt: 100-driver/Day/Cam2
      valloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_val_sf3d_nc10.txt
      testloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_test_sf3d_nc10.txt
      trainloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_train_sf3d_nc10.txt
    ```
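Because YAML parsers commonly let a repeated key silently overwrite the earlier one, a duplicate `dataset-id` can go unnoticed. The sketch below scans the raw text for repeated top-level keys. It assumes that dataset IDs are top-level keys starting in column 0 with nothing but an optional comment after the colon; `duplicate_dataset_ids` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumption: dataset IDs are top-level keys, i.e. they start in column 0
# and carry no inline value (only an optional trailing comment).
TOP_LEVEL_KEY = re.compile(r"^([^\s#][^:]*):\s*(#.*)?$")


def duplicate_dataset_ids(yaml_text):
    """Return dataset IDs that appear more than once as top-level keys."""
    counts = Counter()
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            counts[match.group(1).strip()] += 1
    return sorted(key for key, n in counts.items() if n > 1)
```

Passing the contents of `data/ddd-datasets.yml` to this function lists any ID that would need renaming.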


### Dataset variant creation for `100-driver-day-cam2-sf3d-nc10`


```bash
bash scripts/anonflow.100-driver-day-cam2.yolov8s-seg.sh
```

```bash
logs/annotate-<ddmmyy_hhmmss>/
├── 100-driver-day-cam2-annotation
├── 100-driver-day-cam2-bbox
├── 100-driver-day-cam2-mask
├── 100-driver-day-cam2-seg
├── 100-driver-day-cam2-viz
├── modelinfo.json
├── summary.json
├── test
│    ├── imgs_list.csv
│    ├── labels.csv
│    ├── summary.json
│    ├── summary.missed.json
│    └── test.csv
├── train
│    ├── imgs_list.csv
│    ├── labels.csv
│    ├── summary.json
│    ├── summary.missed.json
│    └── train.csv
└── val
    ├── imgs_list.csv
    ├── labels.csv
    ├── summary.json
    ├── summary.missed.json
    └── val.csv
```

Customise the script that creates the required symlinks:

```bash
vi scripts/datasets/100-driver-sf3d.symlinks.sh
```

Add the entries for the `bbox` and `seg` variants to the `data/ddd-datasets.yml` configuration.

```yml
100-driver-day-cam2-sf3d-nc10-seg:
  loadertxt: 100driver-sf3d-nc10-bbox/Day/Cam2/seg
  valloadertxt: 100-driver/data-splits/Traditional-setting/Day/Cam2/D2_val.txt
  testloadertxt: 100-driver/data-splits/Traditional-setting/Day/Cam2/D2_test.txt
  trainloadertxt: 100-driver/data-splits/Traditional-setting/Day/Cam2/D2_train.txt
100-driver-day-cam2-sf3d-nc10-bbox:
  loadertxt: 100driver-sf3d-nc10-bbox/Day/Cam2/bbox
  valloadertxt: 100-driver/data-splits/Traditional-setting/Day/Cam2/D2_val.txt
  testloadertxt: 100-driver/data-splits/Traditional-setting/Day/Cam2/D2_test.txt
  trainloadertxt: 100-driver/data-splits/Traditional-setting/Day/Cam2/D2_train.txt
100-driver-day-cam2-sf3d-nc10:
  loadertxt: 100-driver/Day/Cam2
  valloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_val_sf3d_nc10.txt
  testloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_test_sf3d_nc10.txt
  trainloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_train_sf3d_nc10.txt
```

Re-generate the train/val/test indices, because some images may have dropped out: not every image yields an inference result, and some fail with an error.

* Open the script `scripts/regen_loadertxt.sh`, update the dataset-id for which the indices have to be regenerated, and save it.


```bash
vi scripts/regen_loadertxt.sh
```

* Execute the script.

```bash
bash scripts/regen_loadertxt.sh
```

* Copy the regenerated indices to the required dataset split paths. Update the index entries for the `bbox` and `seg` variants in the `data/ddd-datasets.yml` configuration for the respective dataset-id. The final configuration should look like the one below. These entries already exist in the current configuration; you may need to tweak them when generating the dataset variant at your end.

```yml
100-driver-day-cam2-sf3d-nc10-seg:
  loadertxt: 100driver-sf3d-nc10-bbox/Day/Cam2/seg
  valloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_val_sf3d_nc10.txt
  testloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_test_sf3d_nc10.txt
  trainloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_train_sf3d_nc10.txt
100-driver-day-cam2-sf3d-nc10-bbox:
  loadertxt: 100driver-sf3d-nc10-bbox/Day/Cam2/bbox
  valloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_val_sf3d_nc10.txt
  testloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_test_sf3d_nc10.txt
  trainloadertxt: 100-driver/data-splits/100driver-sf3d-nc10/Day/Cam2/D2_train_sf3d_nc10.txt
```
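The regeneration step above amounts to dropping index entries whose image did not make it into the variant. A minimal sketch of that idea follows; it is not the logic of `scripts/regen_loadertxt.sh`. It assumes that the first whitespace-separated field of each index line is an image path relative to the variant root, which may not match the real loader txt format, and `filter_split_lines` is a hypothetical helper.

```python
from pathlib import Path


def filter_split_lines(lines, image_root):
    """Split index lines into (kept, dropped) by whether the image exists.

    Assumption: the first whitespace-separated field of each line is an image
    path relative to image_root; the real loader txt format may differ.
    """
    root = Path(image_root)
    kept, dropped = [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        target = kept if (root / fields[0]).is_file() else dropped
        target.append(line.rstrip("\n"))
    return kept, dropped
```

Writing `kept` back to a new txt file gives an index that only references existing images, and `dropped` shows how many entries fell off.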

To create the bbox-guided segmentation (`bboxseg`) variant, use the previously generated `bbox` dataset variant as the input dataset and keep the rest of the configuration the same as for the previous variant. The dataset should already be configured in the `data/ddd-datasets.yml` configuration file.

```bash
bash scripts/anonflow.100-driver-day-cam2-sf3d-nc10-bbox.yolov8s-seg.sh
```

Add the dataset to the dataset configuration file as the `bboxseg` variant and create symlinks or copy the data. Repeat the index regeneration step; afterwards, copy the indices to the respective dataset directory and re-configure the dataset configuration to point to the newly generated indices. If you skip this, missing images show up as errors during training, although training still proceeds.

```yml
100-driver-day-cam2-sf3d-nc10-bboxseg:
  loadertxt: 100driver-sf3d-nc10-bboxseg/Day/Cam2/seg
  valloadertxt: 100driver-sf3d-nc10-bboxseg/Day/Cam2/seg/D2_val_sf3d_nc10.txt
  testloadertxt: 100driver-sf3d-nc10-bboxseg/Day/Cam2/seg/D2_test_sf3d_nc10.txt
  trainloadertxt: 100driver-sf3d-nc10-bboxseg/Day/Cam2/seg/D2_train_sf3d_nc10.txt
```
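Before training, it can help to confirm that every path in a dataset entry actually exists after symlinking or copying. The sketch below takes one entry as a plain dictionary. It assumes that all values are paths relative to a single data root directory, which is a guess about how the loader resolves them; `missing_paths` is a hypothetical helper.

```python
from pathlib import Path


def missing_paths(entry, data_root):
    """Return the keys of a dataset entry whose path is absent under data_root.

    Assumption: every value in the entry is a path relative to data_root.
    """
    root = Path(data_root)
    return sorted(key for key, rel in entry.items() if not (root / rel).exists())
```

An empty result for the `bboxseg` entry suggests that the symlinks and the copied indices are in place.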


## Label mapping 100-driver-sf3d


The label mapping is stored in `data/label_map_100driver_sf3d.json`:

```json
{
  "forward": {
    "C1_Drive_Safe": "c0",
    "C7_Text_Right": "c1",
    "C5_Talk_Right": "c2",
    "C6_Text_Left": "c3",
    "C4_Talk_Left": "c4",
    "C18_Operate_Radio": "c5",
    "C16_Eat_Left": "c6",
    "C20_Reach_Behind": "c7",
    "C8_Make_Up": "c8",
    "C22_Talk_to_Passenger": "c9"
  },
  "reverse": {
    "c0": ["C1_Drive_Safe"],
    "c1": ["C7_Text_Right"],
    "c2": ["C5_Talk_Right"],
    "c3": ["C6_Text_Left"],
    "c4": ["C4_Talk_Left"],
    "c5": ["C18_Operate_Radio"],
    "c6": ["C16_Eat_Left"],
    "c7": ["C20_Reach_Behind"],
    "c8": ["C8_Make_Up"],
    "c9": ["C22_Talk_to_Passenger"]
  },
  "labels": {
    "c0": "Normal Driving (C1)",
    "c1": "Texting Right (C7)",
    "c2": "Phone Right (C5)",
    "c3": "Texting Left (C6)",
    "c4": "Phone Left (C4)",
    "c5": "Operate Radio (C18)",
    "c6": "Drink (C16)",
    "c7": "Reach Behind (C20)",
    "c8": "Make Up (C8)",
    "c9": "Talk Passenger (C22)"
  }
}
```
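Since the file stores the same mapping three ways (`forward`, `reverse`, `labels`), the sections can drift apart when edited by hand. The following sketch checks that they agree; `check_label_map` is a hypothetical helper and takes the already parsed JSON object.

```python
def check_label_map(label_map):
    """Return a list of inconsistencies between forward, reverse and labels."""
    forward = label_map["forward"]
    reverse = label_map["reverse"]
    labels = label_map["labels"]
    problems = []
    # Every forward pair must be listed under its target in reverse.
    for source, target in forward.items():
        if source not in reverse.get(target, []):
            problems.append(f"{source} -> {target} missing from reverse")
    # Every reverse pair must map back through forward.
    for target, sources in reverse.items():
        for source in sources:
            if forward.get(source) != target:
                problems.append(f"{target} <- {source} missing from forward")
    # Every target class needs a display label.
    for target in sorted(set(forward.values()) | set(reverse)):
        if target not in labels:
            problems.append(f"{target} has no display label")
    return problems
```

Loading the file with `json.load` and passing the result to `check_label_map` should return an empty list for the mapping shown above.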


## Dataset labels

`dataset_label` is the actual label used to store the image files, i.e. the name of the class directory.

```csv
id,dataset_label,name
C1,C1_Drive_Safe,drive_safe
C2,C2_Sleep,sleep
C3,C3_Yawning,yawning
C4,C4_Talk_Left,talk_left
C5,C5_Talk_Right,talk_right
C6,C6_Text_Left,text_left
C7,C7_Text_Right,text_right
C8,C8_Make_Up,make_up
C9,C9_Look_Left,look_left
C10,C10_Look_Right,look_right
C11,C11_Look_Up,look_up
C12,C12_Look_Down,look_down
C13,C13_Smoke_Left,smoke_left
C14,C14_Smoke_Right,smoke_right
C15,C15_Smoke_Mouth,smoke_mouth
C16,C16_Eat_Left,eat_left
C17,C17_Eat_Right,eat_right
C18,C18_Operate_Radio,operate_radio
C19,C19_Operate_GPS,operate_gps
C20,C20_Reach_Behind,reach_behind
C21,C21_Leave_Steering_Wheel,leave_steering_wheel
C22,C22_Talk_to_Passenger,talk_to_passenger
```
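The table above can be turned into a lookup from class id to class directory. The sketch below parses the CSV text and builds the relative folder for one class, following the `Day/Cam2/C1_Drive_Safe` layout of the dataset directories. The helper names and the default `modality` and `camera` values are illustrative assumptions.

```python
import csv
import io
from pathlib import PurePosixPath


def load_dataset_labels(csv_text):
    """Map class id (e.g. "C1") to its dataset_label (e.g. "C1_Drive_Safe")."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["id"]: row["dataset_label"] for row in reader}


def class_dir(labels, class_id, modality="Day", camera="Cam2"):
    """Build the relative folder that holds the images of one class."""
    return PurePosixPath(modality) / camera / labels[class_id]
```

With the full table loaded, `class_dir(labels, "C22")` points at the `C22_Talk_to_Passenger` folder under `Day/Cam2`.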


## Dataset Directory Structure

```bash
:100-driver$tree -d
.
├── 100-driver-labels
│ ├── 100-driver-day
│ │ ├── Cam1
│ │ ├── Cam2
│ │ ├── Cam3
│ │ ├── Cam4
│ │ └── data-splits
│ │     ├── Cross-camera-setting
│ │     │ ├── Day
│ │     │ │ ├── Cam1_to_2_3_4
│ │     │ │ ├── Cam2_to_1_3_4
│ │     │ │ ├── Cam3_to_1_2_4
│ │     │ │ └── Cam4_to_1_2_3
│ │     │ └── Night
│ │     │     ├── Cam1_to_2_3_4
│ │     │     ├── Cam2_to_1_3_4
│ │     │     ├── Cam3_to_1_2_4
│ │     │     └── Cam4_to_1_2_3
│ │     ├── Cross-modality-setting
│ │     │ ├── D1_to_N1
│ │     │ ├── D2_to_N2
│ │     │ ├── D3_to_N3
│ │     │ └── D4_to_N4
│ │     ├── Cross-vehicle-setting
│ │     │ ├── Cross-individual-vehicle
│ │     │ │ ├── Day
│ │     │ │ │ ├── Cam1
│ │     │ │ │ ├── Cam2
│ │     │ │ │ ├── Cam3
│ │     │ │ │ └── Cam4
│ │     │ │ └── Night
│ │     │ │     ├── Cam1
│ │     │ │     ├── Cam2
│ │     │ │     ├── Cam3
│ │     │ │     └── Cam4
│ │     │ └── Cross-vehicle-type
│ │     │     ├── Cam1
│ │     │     ├── Cam2
│ │     │     ├── Cam3
│ │     │     └── Cam4
│ │     └── Traditional-setting
│ │         ├── Day
│ │         │ ├── Cam1
│ │         │ ├── Cam2
│ │         │ ├── Cam3
│ │         │ └── Cam4
│ │         └── Night
│ │             ├── Cam1
│ │             ├── Cam2
│ │             ├── Cam3
│ │             └── Cam4
│ ├── 100-driver-nights
│ │ ├── Cam1
│ │ ├── Cam2
│ │ ├── Cam3
│ │ └── Cam4
│ └── 100-driver-nights-sample
│     └── Cam1
├── 100-driver-samples
├── data-splits
│ ├── Cross-camera-setting
│ │ ├── Day
│ │ │ ├── Cam1_to_2_3_4
│ │ │ ├── Cam2_to_1_3_4
│ │ │ ├── Cam3_to_1_2_4
│ │ │ └── Cam4_to_1_2_3
│ │ └── Night
│ │     ├── Cam1_to_2_3_4
│ │     ├── Cam2_to_1_3_4
│ │     ├── Cam3_to_1_2_4
│ │     └── Cam4_to_1_2_3
│ ├── Cross-modality-setting
│ │ ├── D1_to_N1
│ │ ├── D2_to_N2
│ │ ├── D3_to_N3
│ │ └── D4_to_N4
│ ├── Cross-vehicle-setting
│ │ ├── Cross-individual-vehicle
│ │ │ ├── Day
│ │ │ │ ├── Cam1
│ │ │ │ ├── Cam2
│ │ │ │ ├── Cam3
│ │ │ │ └── Cam4
│ │ │ └── Night
│ │ │     ├── Cam1
│ │ │     ├── Cam2
│ │ │     ├── Cam3
│ │ │     └── Cam4
│ │ └── Cross-vehicle-type
│ │     ├── Cam1
│ │     ├── Cam2
│ │     ├── Cam3
│ │     └── Cam4
│ └── Traditional-setting
│     ├── Day
│     │ ├── Cam1
│     │ ├── Cam2
│     │ ├── Cam3
│     │ └── Cam4
│     └── Night
│         ├── Cam1
│         ├── Cam2
│         ├── Cam3
│         └── Cam4
├── Day
│ ├── Cam1
│ │ ├── C10_Look_Right
│ │ ├── C11_Look_Up
│ │ ├── C12_Look_Down
│ │ ├── C13_Smoke_Left
│ │ ├── C14_Smoke_Right
│ │ ├── C15_Smoke_Mouth
│ │ ├── C16_Eat_Left
│ │ ├── C17_Eat_Right
│ │ ├── C18_Operate_Radio
│ │ ├── C19_Operate_GPS
│ │ ├── C1_Drive_Safe
│ │ ├── C20_Reach_Behind
│ │ ├── C21_Leave_Steering_Wheel
│ │ ├── C22_Talk_to_Passenger
│ │ ├── C2_Sleep
│ │ ├── C3_Yawning
│ │ ├── C4_Talk_Left
│ │ ├── C5_Talk_Right
│ │ ├── C6_Text_Left
│ │ ├── C7_Text_Right
│ │ ├── C8_Make_Up
│ │ └── C9_Look_Left
│ ├── Cam2
│ │ ├── C10_Look_Right
│ │ ├── C11_Look_Up
│ │ ├── C12_Look_Down
│ │ ├── C13_Smoke_Left
│ │ ├── C14_Smoke_Right
│ │ ├── C15_Smoke_Mouth
│ │ ├── C16_Eat_Left
│ │ ├── C17_Eat_Right
│ │ ├── C18_Operate_Radio
│ │ ├── C19_Operate_GPS
│ │ ├── C1_Drive_Safe
│ │ ├── C20_Reach_Behind
│ │ ├── C21_Leave_Steering_Wheel
│ │ ├── C22_Talk_to_Passenger
│ │ ├── C2_Sleep
│ │ ├── C3_Yawning
│ │ ├── C4_Talk_Left
│ │ ├── C5_Talk_Right
│ │ ├── C6_Text_Left
│ │ ├── C7_Text_Right
│ │ ├── C8_Make_Up
│ │ └── C9_Look_Left
│ ├── Cam3
│ │ ├── C10_Look_Right
│ │ ├── C11_Look_Up
│ │ ├── C12_Look_Down
│ │ ├── C13_Smoke_Left
│ │ ├── C14_Smoke_Right
│ │ ├── C15_Smoke_Mouth
│ │ ├── C16_Eat_Left
│ │ ├── C17_Eat_Right
│ │ ├── C18_Operate_Radio
│ │ ├── C19_Operate_GPS
│ │ ├── C1_Drive_Safe
│ │ ├── C20_Reach_Behind
│ │ ├── C21_Leave_Steering_Wheel
│ │ ├── C22_Talk_to_Passenger
│ │ ├── C2_Sleep
│ │ ├── C3_Yawning
│ │ ├── C4_Talk_Left
│ │ ├── C5_Talk_Right
│ │ ├── C6_Text_Left
│ │ ├── C7_Text_Right
│ │ ├── C8_Make_Up
│ │ └── C9_Look_Left
│ └── Cam4
│     ├── C10_Look_Right
│     ├── C11_Look_Up
│     ├── C12_Look_Down
│     ├── C13_Smoke_Left
│     ├── C14_Smoke_Right
│     ├── C15_Smoke_Mouth
│     ├── C16_Eat_Left
│     ├── C17_Eat_Right
│     ├── C18_Operate_Radio
│     ├── C19_Operate_GPS
│     ├── C1_Drive_Safe
│     ├── C20_Reach_Behind
│     ├── C21_Leave_Steering_Wheel
│     ├── C22_Talk_to_Passenger
│     ├── C2_Sleep
│     ├── C3_Yawning
│     ├── C4_Talk_Left
│     ├── C5_Talk_Right
│     ├── C6_Text_Left
│     ├── C7_Text_Right
│     ├── C8_Make_Up
│     └── C9_Look_Left
└── Night
```


## Dataset Storage

```bash
# du -h --max-depth=1 | sort -h
9.6M  ./100-driver-samples
107M  ./data-splits
127M  ./100-driver-labels
56G ./Night
90G ./Day
146G  .
```


```bash
# du -h --max-depth=2 | sort -h
20K ./100-driver-labels/100-driver-nights-sample
9.6M  ./100-driver-samples
11M ./100-driver-labels/100-driver-nights
19M ./data-splits/Cross-modality-setting
22M ./data-splits/Traditional-setting
33M ./data-splits/Cross-vehicle-setting
34M ./data-splits/Cross-camera-setting
107M  ./data-splits
117M  ./100-driver-labels/100-driver-day
127M  ./100-driver-labels
19G ./Day/Cam2
21G ./Day/Cam3
23G ./Day/Cam1
30G ./Day/Cam4
56G ./Night
90G ./Day
146G  .
```
