## Anonymous code for Submission 21793 (NeurIPS 2025)

We use the "Bias in Bios" dataset (https://arxiv.org/pdf/1901.09451) to show that our methods apply to real-world scenarios. "Bias in Bios" is a dataset with multiple professions (classes) and genders. The aim is to reduce the true positive rate (TPR) gap between males and females within each profession.
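To make the target metric concrete, here is a minimal sketch of a per-profession TPR gap computation. It is not taken from the notebook: the function name `tpr_gap_per_class`, the 0/1 gender encoding, and the sign convention (group 1 minus group 0) are assumptions made for illustration.

```python
import numpy as np

def tpr_gap_per_class(y_true, y_pred, gender):
    """Return {class: TPR(gender == 1) - TPR(gender == 0)} for each class.

    Assumes `gender` is encoded as 0/1 (hypothetical encoding). A class
    with no examples for one of the groups gets a NaN gap.
    """
    y_true, y_pred, gender = map(np.asarray, (y_true, y_pred, gender))
    gaps = {}
    for c in np.unique(y_true):
        in_class = y_true == c
        tprs = []
        for g in (0, 1):
            mask = in_class & (gender == g)
            # TPR of class c within group g: the fraction of examples whose
            # true label is c that the classifier also predicts as c.
            tprs.append((y_pred[mask] == c).mean() if mask.any() else np.nan)
        gaps[c.item()] = float(tprs[1] - tprs[0])
    return gaps

# Toy example with two professions and two gender groups.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
gender = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 0]
print(tpr_gap_per_class(y_true, y_pred, gender))
```

A gap of zero for a profession means both groups are recognised at the same rate in that profession; an intervention is judged by how far it moves each gap toward zero.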

We follow the setup of "Representation Surgery: Theory and Practice of Affine Steering" (Singh et al., ICML 2024; code: https://github.com/shauli-ravfogel/affine-steering?tab=readme-ov-file). We plug in our intervention, named "Affirmative", and compare it with their proposed methods (Mean and Covariance Matching) and with LEACE (Least Squares Concept Erasure), which Singh et al. used as a baseline.

"multi_class.ipynb" contains the code to check and reproduce our claims; it plugs our intervention function ("Affirmative") into the pipeline implemented by Singh et al. "result_plot.png" shows the profession-wise TPR difference before and after each intervention. Our "Affirmative" intervention almost always reduces the TPR gap and sometimes matches the performance of the "Mean + Covariance Matching" method proposed by Singh et al.

Some additional notes:
1. We fix the class for affirmative action by taking the class with the smallest TPR gap. The code allows any function to be plugged in to choose the fixed class.
2. We use the representation pickle files for train, val and test provided by the authors (Singh et al.).
3. The "debug_data/" folder is where we train and store classifiers as pickle files so that the code runs faster. On the first run, the user must uncomment the MLP classifier training lines and save the trained classifiers in the "debug_data" folder.
4. Any helper file that is missing here is taken directly from the code repo of Singh et al. (https://github.com/shauli-ravfogel/affine-steering?tab=readme-ov-file).
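
The pluggable class choice from note 1 can be sketched as follows. This is an illustration rather than the notebook's code: `smallest_gap_class` and `choose_fixed_class` are hypothetical names, and reading "smallest TPR gap" as smallest in absolute value is an assumption.

```python
def smallest_gap_class(gaps):
    """Default selector: the class whose TPR gap is smallest in magnitude.

    `gaps` maps class -> signed TPR gap. Taking the absolute value is an
    assumption of this sketch; NaN gaps are skipped.
    """
    valid = {c: abs(g) for c, g in gaps.items() if g == g}  # NaN != NaN
    return min(valid, key=valid.get)

def choose_fixed_class(gaps, selector=smallest_gap_class):
    """Pick the fixed class with any user-supplied selector function."""
    return selector(gaps)

# Hypothetical per-class gaps, for illustration only.
gaps = {0: 0.5, 1: -0.1, 2: 0.3}
fixed = choose_fixed_class(gaps)

# A different rule, e.g. the class with the largest absolute gap.
largest = choose_fixed_class(gaps, selector=lambda g: max(g, key=lambda c: abs(g[c])))
```

Keeping the selector as a plain function argument means alternative rules can be tried without touching the rest of the pipeline.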

We also use the code in "emotion_steering.ipynb" to run the steering vector experiment. We provide the files used to train the steering vectors (after conversion to representations). The notebook can be re-run to generate all the required helper files.