# Fork-Merge Decoding (FMD) for VideoLLaMA2

This repository is built on the open-source [VideoLLaMA2](https://github.com/DAMO-NLP-SG/VideoLLaMA2) model and implements our proposed **Fork-Merge Decoding (FMD)** method.

## Key Modifications

The main architectural changes for FMD are implemented in:

1. **`videollama2/__init__.py`**  
   - Calls the model’s forward function to perform next-token generation under the FMD setup.

2. **`videollama2/model/qwen.py`**  
   - Implements the core fork and merge processes used during inference.

These files are the primary points of modification enabling Fork-Merge Decoding.
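
As a rough illustration of what "calling the model's forward function to perform next-token generation" can look like, here is a minimal greedy decoding loop. It is only a sketch and not the code from this repository: `greedy_decode` and `toy_forward` are hypothetical names, and `toy_forward` is a stand-in that returns made-up scores instead of running a real model. It also assumes the FMD-specific fork and merge steps would live inside whatever callable is passed as `forward`.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    forward: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    eos_id: int,
    max_new_tokens: int = 16,
) -> List[int]:
    """Generate tokens one at a time by calling `forward` directly.

    `forward` is assumed to map the token ids seen so far to a list of
    scores (logits) over the vocabulary for the next token.
    """
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = forward(ids)
        # Greedy choice: index of the highest-scoring vocabulary entry.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:
            break
    # Return only the newly generated tokens.
    return ids[len(prompt_ids):]


def toy_forward(ids: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for a model: favours (last_id + 1) mod 5."""
    vocab_size = 5
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


if __name__ == "__main__":
    print(greedy_decode(toy_forward, [0], eos_id=4))
```

Writing the loop by hand, rather than relying on a library's built-in generation routine, gives direct control over each forward call, which is presumably why a custom decoding scheme would be wired in at this level.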

## Alternative Implementation

For simplicity, this version places the FMD generation logic in `videollama2/__init__.py`.  
An alternative version integrates FMD directly into `transformer/utils.py`.
