Dear Reviewer,

Thank you for taking the time to review our manuscript.

To support reproducibility and transparency, we have attached the following supplementary files:


qualitative_results/ – This folder contains visualizations comparing features from standard feedforward passes through a teacher model (e.g., the EoMT teacher) with features obtained via our proposed layer-wise querying approach. These examples aim to illustrate how depth can be treated as a spatial coordinate, enabling constant-time feature selection across layers.

sample_segmentation_masks/ – This folder contains segmentation outputs produced by the 3DLoc-bound variant of our model.

references/ – A collection of foundational papers that have informed the development of our work.

Please note that our codebase will be released publicly on GitHub upon acceptance.

Thank you once again for your time and constructive feedback.

Sincerely,
Authors of Paper 6304
“Layer Query Network For Test-Time-Training in Vision-Language-Models”
