# MAACA (Multimodal Aspect-Aware Complaint Analysis)

# Installation

```bash
conda create -n maaca python=3.9
conda activate maaca
cd maaca/
pip install -r requirements.txt
```

# Pretraining

```bash
CUDA_VISIBLE_DEVICES=0 python pretrain.py \
--run_name=maaca_pretrain \
--audio_transform_hidden_dim=768 \
--video_transform_hidden_dim=768 \
--audio_transform_num_layers=2 \
--video_transform_num_layers=2 \
--audio_output_seq_len=128 \
--video_output_seq_len=128 \
--audio_transform_output_dim=768 \
--video_transform_output_dim=768 \
--fusion_output_dim=768 \
--linear_layer_hidden_dim=64 \
--add_pooling=False \
--num_classes=7 \
--max_txt_len=128
```

# Finetuning

To finetune the model initialized with the pretrained weights from the run above, set the `finetuned` field in `src/config/models/product2.yaml` to the absolute path of the pretrained model, and make sure `load_finetuned` is set to `True`.
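
The config change described above can also be scripted. The following is a minimal sketch, not part of the repository: it assumes `finetuned` and `load_finetuned` each appear exactly once in the YAML file as a plain `key: value` line, and both the helper names and the checkpoint path in the demo are hypothetical.

```python
import json
import os
import re


def set_yaml_field(text: str, key: str, value: str) -> str:
    """Replace the value on the single `key: value` line, keeping its indentation."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*).*$", flags=re.MULTILINE
    )
    # A function replacement avoids backslash-escape surprises in `value`.
    new_text, count = pattern.subn(lambda m: m.group(1) + value, text)
    if count != 1:
        raise ValueError(f"expected exactly one '{key}' field, found {count}")
    return new_text


def point_config_at_checkpoint(config_text: str, checkpoint_path: str) -> str:
    """Set `finetuned` to an absolute checkpoint path and enable `load_finetuned`."""
    if not os.path.isabs(checkpoint_path):
        raise ValueError("checkpoint_path must be an absolute path")
    # json.dumps yields a double-quoted string, which is also valid YAML.
    text = set_yaml_field(config_text, "finetuned", json.dumps(checkpoint_path))
    return set_yaml_field(text, "load_finetuned", "True")


# Assumed layout of the relevant part of the config (the real file may differ).
example = (
    "model:\n"
    "  load_finetuned: False\n"
    '  finetuned: ""\n'
)

# Hypothetical checkpoint location, for illustration only.
updated = point_config_at_checkpoint(example, "/data/checkpoints/maaca_pretrain.pth")
print(updated)
```

To apply this to the real config, read `src/config/models/product2.yaml` into a string, pass it through `point_config_at_checkpoint`, and write the result back. Any trailing comment on the two edited lines is dropped, so review the file afterwards.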

```bash
CUDA_VISIBLE_DEVICES=0 python train.py \
--run_name=maaca_pretrain \
--audio_transform_hidden_dim=768 \
--video_transform_hidden_dim=768 \
--audio_transform_num_layers=2 \
--video_transform_num_layers=2 \
--audio_output_seq_len=128 \
--video_output_seq_len=128 \
--audio_transform_output_dim=768 \
--video_transform_output_dim=768 \
--fusion_output_dim=768 \
--linear_layer_hidden_dim=64 \
--add_pooling=False \
--num_classes=7 \
--max_txt_len=128
```

# Note

- `predicted_time_intervals_sample.csv` contains the timestamps extracted for the sample data clips with CGDETR, following the process described in the paper.
- `data_train_sample.csv` and `data_test_sample.csv` are samples from the VCD dataset.
- `video/` and `audio/` contain the downloaded video and audio of the dataset.
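
Before launching a run, it can help to confirm that the files listed above are in place. The sketch below is illustrative only: it assumes all of them sit under one root directory, which may not match the actual repository layout, and `missing_data_entries` is a hypothetical helper. The demo builds a throwaway directory containing only part of the layout to show what gets reported.

```python
import tempfile
from pathlib import Path

# Names taken from the notes above; their common location is an assumption.
EXPECTED_FILES = [
    "predicted_time_intervals_sample.csv",
    "data_train_sample.csv",
    "data_test_sample.csv",
]
EXPECTED_DIRS = ["video", "audio"]


def missing_data_entries(root):
    """Return the expected files and directories that are absent under `root`."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


# Demo on a temporary directory with only two of the five entries present.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "data_train_sample.csv").touch()
    (demo_root / "video").mkdir()
    report = missing_data_entries(demo_root)

print(report)
```

Calling `missing_data_entries(".")` from the directory that holds the sample data returns an empty list when everything is present.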