# Supplementary Material for ICML 2026 Submission

This repository contains the core implementation for the methods proposed in our ICML 2026 submission, "Language Bias in LVLMs: From In-Depth Analysis to Simple and Effective Mitigation". The provided code showcases the key modifications made to the LLaVA-v1.5 framework to implement our **Language Bias Regularization (LBR)** method.

## Overview

The code is intended to provide a clear and minimal implementation of the LBR training process. It includes the architectural changes needed to compute our language bias metric ($\mathcal{B}$) and the custom trainer that adds the LBR loss. These files are not standalone; they are meant to be integrated into the official LLaVA-v1.5 codebase.

## Code Description

The core logic is contained in the following four files:

* **`llava/model/llava_arch.py`**
    * This file modifies the LLaVA architecture to correctly handle and mark the positions of image tokens within the input sequence. This is a foundational step for distinguishing between visual and textual inputs.

* **`llava/model/language_model/llava_llama.py`**
    * We modify the `forward` pass of the LLaMA model within LLaVA. This change enables a text-only forward pass (i.e., without image tokens), which is essential for calculating the text-only likelihood $\pi_\theta(y \mid x)$ and our language bias metric, $\mathcal{B}$.

* **`llava/train/LBR_trainer.py`**
    * This is the core of our LBR implementation. It contains a custom `LLaVA15_LBR_Trainer` class that inherits from the standard Hugging Face `Trainer`. It overrides the `compute_loss` method to compute the Language Bias Regularization term ($\mathcal{L}_{\text{LBR}}$) and add it to the standard visual instruction tuning (VIT) loss.

* **`llava/constants.py`**
    * This file is updated with new constants used to mark the beginning and end of image token sequences, ensuring consistency with the changes in `llava_arch.py`.
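
As a rough illustration of the quantities mentioned above, the following is a minimal sketch (not the code shipped in these files) of how a sequence log-likelihood such as $\log \pi_\theta(y \mid x)$ can be accumulated from per-token logits while skipping masked positions, and how a regularization term might be added to the base loss. It uses plain Python lists instead of PyTorch tensors so that it runs without dependencies. The helper names `sequence_log_likelihood` and `combined_loss` are hypothetical, and the weighting `vit_loss + alpha * lbr_term` is an assumption about how `alpha` is applied. How the likelihoods are turned into $\mathcal{B}$ and $\mathcal{L}_{\text{LBR}}$ is defined in the paper and is not reproduced here.

```python
import math
from typing import List, Sequence

# Label value for positions excluded from the loss (the Hugging Face / LLaVA convention).
IGNORE_INDEX = -100


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def sequence_log_likelihood(
    logits: Sequence[Sequence[float]], labels: Sequence[int]
) -> float:
    """Sum the log-probabilities of the target tokens, skipping IGNORE_INDEX.

    Assumes logits[t] is already aligned with labels[t], i.e. the usual
    one-position shift of a causal language model has been applied.
    """
    total = 0.0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # prompt or padding position: contributes nothing
        total += log_softmax(row)[label]
    return total


def combined_loss(vit_loss: float, lbr_term: float, alpha: float) -> float:
    """Hypothetical weighting: base loss plus alpha times the regularizer."""
    return vit_loss + alpha * lbr_term


# Toy example: 3 positions, vocabulary of size 2, first position masked out.
toy_logits = [[0.0, 0.0], [0.0, 0.0], [2.0, 0.0]]
toy_labels = [IGNORE_INDEX, 1, 0]
toy_ll = sequence_log_likelihood(toy_logits, toy_labels)
print(f"toy log-likelihood: {toy_ll:.4f}")
```

In a real training loop the same computation would be run twice, once on the full multimodal input and once on the text-only input produced by the modified `forward` pass, using batched tensor operations rather than Python loops.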

## Usage

1.  Clone the official LLaVA-v1.5 repository.
2.  Copy the files provided here into the cloned repository at the same relative paths, overwriting any existing files of the same name.
3.  Follow the original LLaVA-v1.5 instructions for setting up the environment and data.
4.  When running the training script, use `LLaVA15_LBR_Trainer` (from `llava/train/LBR_trainer.py`) as the trainer, and make sure the regularization hyperparameter `alpha` is set.
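
Step 2 can be scripted. The sketch below is one possible way to do it, assuming the supplementary files sit in a directory that mirrors the LLaVA layout. The function name `overlay_files` and the directory names in the usage note are hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path
from typing import Iterable, List

# Relative paths of the four files described in "Code Description".
LBR_FILES = (
    "llava/model/llava_arch.py",
    "llava/model/language_model/llava_llama.py",
    "llava/train/LBR_trainer.py",
    "llava/constants.py",
)


def overlay_files(
    src_root: Path, dst_root: Path, rel_paths: Iterable[str] = LBR_FILES
) -> List[Path]:
    """Copy each file from src_root into dst_root at the same relative path.

    Existing files are overwritten and missing parent directories are created.
    Raises FileNotFoundError if a listed file is absent from src_root.
    """
    copied = []
    for rel in rel_paths:
        src = Path(src_root) / rel
        dst = Path(dst_root) / rel
        if not src.is_file():
            raise FileNotFoundError(f"missing supplementary file: {src}")
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

For example, with the supplementary files in `lbr_supplement/` and the cloned repository in `LLaVA/` (both names are placeholders), calling `overlay_files(Path("lbr_supplement"), Path("LLaVA"))` copies all four files into place.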

The full codebase, including scripts for LBP and all experiments, will be released upon the paper's acceptance.
