Perspectives on Cascading Pipelines for Sensitivity-Aware Search

Published: 22 May 2026, Last Modified: 22 May 2026
Venue: ICAIL 2026 Workshop on Artificial Intelligence and Open Government
Readers: Everyone
License: CC BY 4.0
Keywords: Sensitivity-Aware Search, Open Government
Abstract: Governments, institutions, and companies produce large document collections, such as email archives and meeting minutes, through their day-to-day activities. These collections contain information that would be useful to stakeholders across different sectors if they were made publicly available. Indeed, under Open Government models, citizens must be able to access documents produced by their government in a timely manner. However, such collections can contain sensitive information, such as matters of national security, which prevents them from being released to the public. Sensitivity-Aware Search (SAS) offers a way to make collections that may contain sensitive information publicly accessible: the entire collection can be searched while sensitive information is protected from exposure. In this paper, we argue that SAS should be addressed with a cascading retrieval pipeline, in which documents are ranked in stages by a sequence of different models. Results at each stage can then be inspected, and each model can specialise in a different aspect of retrieval, with sensitivity considered at every stage. We present three arguments and, for each, offer both a system perspective and a governance perspective. Specifically, we argue that the separation of concerns, efficiency, and inspectability that cascading pipelines offer make them particularly useful for deployment in open government scenarios. Further, we provide experimental evidence to support our thesis that SAS should be tackled as a cascading pipeline.
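To make the staged ranking described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. Every name in it (`Doc`, `Stage`, `run_cascade`, the two toy scoring functions, the per-document `sensitivity` score, and the per-stage `max_sensitivity` threshold) is an assumption introduced for illustration. It shows a cheap first stage narrowing the candidate set, a second stage reranking the survivors, a sensitivity check at each stage, and a per-stage trace that can be inspected.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    sensitivity: float  # hypothetical sensitivity score in [0, 1]


@dataclass(frozen=True)
class Stage:
    name: str
    score: Callable[[str, Doc], float]  # relevance scorer for this stage
    keep: int                           # candidates passed to the next stage
    max_sensitivity: float              # hypothetical per-stage threshold


def term_overlap(query: str, doc: Doc) -> float:
    """Cheap first-stage scorer: number of distinct query terms in the document."""
    return float(len(set(query.lower().split()) & set(doc.text.lower().split())))


def term_frequency(query: str, doc: Doc) -> float:
    """Toy second-stage scorer: total occurrences of the query terms."""
    tokens = doc.text.lower().split()
    return float(sum(tokens.count(t) for t in set(query.lower().split())))


def run_cascade(
    query: str, docs: Sequence[Doc], stages: Sequence[Stage]
) -> Tuple[List[Doc], List[Tuple[str, List[str]]]]:
    """Run each stage in order, recording the surviving ids per stage."""
    trace: List[Tuple[str, List[str]]] = []
    candidates = list(docs)
    for stage in stages:
        # Sensitivity is checked at every stage, not only at the end.
        allowed = [d for d in candidates if d.sensitivity <= stage.max_sensitivity]
        ranked = sorted(allowed, key=lambda d: stage.score(query, d), reverse=True)
        candidates = ranked[: stage.keep]
        trace.append((stage.name, [d.doc_id for d in candidates]))
    return candidates, trace


DOCS = [
    Doc("d1", "budget meeting minutes budget", 0.1),
    Doc("d2", "budget review", 0.2),
    Doc("d3", "national security budget briefing", 0.9),
    Doc("d4", "staff picnic schedule", 0.0),
]
STAGES = [
    Stage("recall", term_overlap, keep=3, max_sensitivity=0.95),
    Stage("rerank", term_frequency, keep=2, max_sensitivity=0.5),
]

final, trace = run_cascade("budget meeting", DOCS, STAGES)
for name, ids in trace:
    print(name, ids)
```

In this toy setup the highly sensitive document `d3` survives the permissive first stage but is dropped by the stricter second stage, and the trace records where that happened. A real system would replace the toy scorers with trained retrieval and sensitivity models.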
Submission Number: 7