# SaMyNa Project

Full supplementary materials for the submission to TMLR.

## General Information

This package holds the full supplementary materials supporting our TMLR submission.
It contains:

+ keywords_ablation_tsim.txt: text file containing the full list of output keywords partially displayed in Section C.2.4 of the Appendix. It is attached separately because its length would make it cumbersome to read in the PDF.
+ captions: the full corpora of generated captions, provided so that reviewers can assess their quality without regenerating them from scratch, which is time-consuming.
+ waterbirds_no_bias_toy_example: synthetic experiment in which we test an "unbiased" selection on Waterbirds (see Section 5.2 of the main paper).
+ bias-mining-debiasing.zip: source code for the bias mining step (Section 3.1 of the main paper) and for bias mitigation (Section 4.3 of the main paper).
+ naming-biases.zip: source code for the bias naming step (Section 3.2 of the main paper).
+ captions/b2t_celeba_clipcap: all the captions generated on the CelebA dataset by the ClipCap captioning model used in B2T. We use these captions to show that they are not suitable for detecting biases other than "man". They are used for Tables 6, 7, and 8 in the SaMyNa Appendix.
+ keywords_ablation_all_hyperparams.txt: text file containing the full list of output keywords displayed in Section C.2.5 of the Appendix. It is attached separately because its length would make it cumbersome to read in the PDF.


## Reproducing our results

To replicate our pipeline, you must first set up the "bias-mining-debiasing" project by following the dedicated README inside the zip archive.
Once "bias-mining-debiasing" is set up, set up "naming-biases". It comes with its own README and has higher computational requirements.
Note that we could not include the ImageNet-1K validation set in our submission, so you will need to obtain it from the available sources and follow the dedicated instructions inside the "bias-mining-debiasing" README. All the other datasets are downloaded automatically and should work out of the box.
The bias mitigation results (Section 4.3 of the main paper) can be reproduced independently; see the "bias-mining-debiasing" README.
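
As a convenience, the first step (unpacking the two source archives and locating their READMEs) could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes both archives sit in the current directory and that each README has a file name starting with "readme" (in any letter case). The `extracted` output directory and the function names are hypothetical and are not part of the released code.

```python
import zipfile
from pathlib import Path

# Archive names come from the list above; the order follows the setup order
# described in this section (bias-mining-debiasing first, then naming-biases).
ARCHIVES = ["bias-mining-debiasing.zip", "naming-biases.zip"]


def extract_and_find_readmes(archive: Path, dest_root: Path) -> list[Path]:
    """Extract `archive` into dest_root/<archive stem> and return README files found."""
    target = dest_root / archive.stem
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    # Assumption: README files have names starting with "readme" (any case).
    return sorted(
        p for p in target.rglob("*")
        if p.is_file() and p.name.lower().startswith("readme")
    )


def main(src_dir: Path = Path("."), dest_root: Path = Path("extracted")) -> None:
    for name in ARCHIVES:
        archive = src_dir / name
        if not archive.exists():
            print(f"missing: {archive}")
            continue
        for readme in extract_and_find_readmes(archive, dest_root):
            print(f"{name}: start from {readme}")


if __name__ == "__main__":
    main()
```

The script only unpacks the archives and points to the READMEs; the actual environment setup and dataset preparation remain as described in each project's own README.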
