Abstract: We introduce a form of steganography in the domain of machine learning which we call training set camouflage. Imagine Alice has a training set on an illicit machine learning classification task. Alice wants Bob (a machine learning system) to learn the task. However, sending either the training set or the trained model to Bob can raise suspicion if the communication is monitored. Training set camouflage allows Alice to compute a second training set on a completely different, and seemingly benign, classification task. By construction, sending the second training set will not raise suspicion. When Bob applies his standard (public) learning algorithm to the second training set, he approximately recovers the classifier on the original task. Training set camouflage is a novel form of steganography in machine learning. We formulate training set camouflage as a combinatorial bilevel optimization problem and propose solvers based on nonlinear programming and local search. Experiments on real classification tasks demonstrate the feasibility of such camouflage.
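As a rough sketch, the bilevel problem described above can be written as follows; the notation is illustrative rather than the paper's own, with D_s denoting Alice's secret training set, C a public pool of benign candidate examples, A Bob's public learning algorithm, and ℓ a classification loss:

\[
\begin{aligned}
\min_{D_c \subseteq C} \;\; & \sum_{(x,\, y) \in D_s} \ell\bigl(\mathcal{A}(D_c)(x),\; y\bigr)
  && \text{(the camouflage set must teach the secret task)} \\
\text{s.t.} \;\; & D_c \text{ is a plausible, benign-looking training set drawn from } C
  && \text{(so transmitting it raises no suspicion)}
\end{aligned}
\]

The outer problem (Alice's choice of the finite set D_c) is combinatorial, while the inner problem (Bob's learner A fitting D_c) is typically a continuous optimization; this nesting is what makes the formulation bilevel and motivates the nonlinear-programming and local-search solvers mentioned above.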