Abstract: Understanding the correct input domain of a black-box model is vital for tasks such as model cloning, model inversion, and membership inference, yet this problem remains underexplored, and the absence of domain information limits the efficacy of these methods. In this paper, we highlight the need to discover a model's data domain and propose an approach that leverages existing generative models to address this challenge. Given only hard-label black-box access to a neural network, our method produces a set of embeddings that, when passed through the generative model, yield samples closely aligned with the data domain of each target class, facilitating downstream tasks. Central to our method is an objective function covering both functional relevance and embedding generality, which we optimize with an iterative search algorithm. Starting from initial embeddings, new data points are generated and classified by the target model; successful classifications guide the resampling of embeddings, so that the images generated in subsequent iterations move progressively closer to the target class's data domain. Because the embedding space is vast, we additionally introduce an optional preprocessing phase that leverages a comprehensive corpus such as ImageNet to select a representative subset of samples, roughly aligned with the model's input domain, to serve as starting points.
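A minimal sketch of the iterative search loop described in the abstract, under strong simplifying assumptions: `generate` and `target_label` below are toy stand-ins for the pretrained generative model and the hard-label black-box target, and all hyperparameter names and values (`pop`, `keep`, `sigma`, the 0.9 decay) are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate(z):
    # Toy stand-in for a pretrained generative model G: embedding -> sample.
    return np.tanh(z)

def target_label(x):
    # Toy stand-in for hard-label black-box access to the target model:
    # this placeholder labels a sample 1 when its mean is positive.
    return int(x.mean() > 0)

def search_embeddings(target_class=1, dim=16, pop=64, sigma=0.5, iters=20):
    """Iteratively refine a population of embeddings so that the samples
    they generate are classified as `target_class` by the black-box model."""
    Z = rng.normal(size=(pop, dim))  # initial embeddings
    for _ in range(iters):
        # Query the black box with generated samples (hard labels only).
        labels = np.array([target_label(generate(z)) for z in Z])
        hits = Z[labels == target_class]
        if len(hits) == 0:
            # No successes yet: restart the search from fresh embeddings.
            Z = rng.normal(size=(pop, dim))
            continue
        # Resample around successful embeddings with Gaussian perturbations,
        # shrinking the step size as the search converges.
        seeds = hits[rng.integers(0, len(hits), size=pop)]
        Z = seeds + sigma * rng.normal(size=(pop, dim))
        sigma *= 0.9
    return Z

Z = search_embeddings()
```

The optional preprocessing phase mentioned in the abstract would replace the random initialization of `Z` with embeddings of corpus samples (e.g. from ImageNet) that the target model already classifies into the desired classes.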