We share our implementation of EXS and Rank-LIME, which extend LIME, here. Both are built on the lime code base, with custom modifications.

We implemented RankSHAP on top of the KernelSHAP code by Lundberg and Lee (https://shap-lrjball.readthedocs.io/en/latest/generated/shap.KernelExplainer.html).

NDCG score : https://scikit-learn.org/stable/modules/generated/sklearn.metrics.ndcg_score.html
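
For readers who want to see what the metric computes, here is a minimal pure-Python sketch of NDCG with linear gain and a log2 rank discount, which is our understanding of the default behaviour of the scikit-learn function linked above. The function names `dcg` and `ndcg` are illustrative and are not part of this code base. Unlike scikit-learn, this sketch does not average over tied scores.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain: linear gain, log2(rank + 1) discount (1-based rank)."""
    top = relevances[:k] if k is not None else relevances
    # enumerate() is 0-based, so the discount for position i is log2(i + 2).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(top))


def ndcg(y_true, y_score, k=None):
    """NDCG of the ranking induced by y_score, judged against graded labels y_true."""
    # Sort document indices by descending predicted score (ties keep input order).
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    ranked = [y_true[i] for i in order]
    # The ideal ranking places the most relevant documents first.
    ideal_dcg = dcg(sorted(y_true, reverse=True), k)
    # If no document is relevant, define NDCG as 0 to avoid dividing by zero.
    return dcg(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical relevance labels and ranker scores for four documents.
    labels = [3, 2, 1, 0]
    scores = [0.9, 0.7, 0.3, 0.1]
    print(ndcg(labels, scores))
```

Passing `k` truncates both the predicted and the ideal ranking to the top `k` positions, which corresponds to NDCG@k.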

We are removing identifiable data from the user study, so the study data is not uploaded in full. Please refer to the technical appendix for the user study instructions and snapshots.

Datasets :
MS MARCO : https://github.com/microsoft/MSMARCO-Passage-Ranking
ROBUST04 : https://huggingface.co/datasets/irds/trec-robust04

Ranking Models Used :
1. https://github.com/nyu-dl/dl4marco-bert (MS MARCO + BERT)
2. https://github.com/nyu-dl/dl4marco-bert (MS MARCO + T5)
3. https://huggingface.co/castorini/rankllama-v1-7b-lora-passage/tree/main (MS MARCO + LLAMA2)
4. https://arxiv.org/pdf/2003.06713 (ROBUST04 + T5)
5. https://arxiv.org/pdf/2003.06713 (ROBUST04 + T5)
6. https://pypi.org/project/rank-bm25/ (BM25 + Both Datasets)