Copyright Throughout a Creative AI Pipeline

Sancho McCann

29 Aug 2023 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: Consider the following fact pattern. Alex paints some original works on canvas and posts photos of them online. Becca downloads those images and uses them to train an AI (training configures the AI’s model parameters to useful values). Becca posts the resulting trained parameter values on her website under a licence that reserves to Becca the right to use the parameters commercially. Cory uses those parameter values in a program that is designed to produce artwork. Cory clicks create and the program produces a work. This work is new to Cory, but it looks a lot like one of Alex’s original canvas images. Cory sells the work. Advise Cory about their potential copyright liability to Alex (for the substantially similar work that the program produced and that Cory subsequently sold) and to Becca (for taking Becca’s parameters and using them commercially, contrary to the licence). Cory clicks create again. The program produces another work, this time quite different from any of Alex’s original paintings. Cory shares the new work on Instagram. Danny copies this image from Cory’s Instagram feed and sells a bunch of postcards that feature that image. Advise Danny about their copyright liability to Cory. These scenarios are not as contrived as they might initially seem. People frequently use copyrighted works when training an AI (more precisely: when training an AI’s parameters). The resulting trained parameters are being shared under licences that assume the parameters are the subject of copyright. People do use these parameters in programs that can produce novel content. The resulting work can be quite surprising to the end-user and there are generally no checks in place to ensure that the new works do not take too directly from the original training data. However, many of the new works will be quite different from any content already in the world. And the end-users of the creative program often claim copyright ownership over the resulting novel work.
These are real issues—a fact reflected in the explosion of articles and international attention being devoted to the topic. I will first present the training and use of a creative program based on a neural network, a popular model that forms the basis of state-of-the-art creative AIs. Then, I will examine each of the issues just raised ... On the first and second issues, I conclude that under current Canadian copyright law, it will almost always be the case that nobody will hold the copyright to the algorithmically trained parameters. However, works produced by using these trained neural networks will often, but not always, attract copyright protection. This distinction is normatively justified, on the basis that the purpose of copyright in Canada is to provide balanced protection of an author’s expression of skill and judgment and because there are technological means to keep one’s trained parameters secret even while allowing others access for use. Third, I conclude, in agreement with the Statutory Review of the Copyright Act, that copying existing works for the purpose of building a training set is prima facie infringement and that Canada should clarify that this is a purpose allowed under Canada’s fair-dealing user’s right. To include this purpose under the fair-dealing user’s right would avoid chilling educational and research activity that is dependent on large collections of example work. Finally, I conclude that it is open for the output of a creative AI to be an infringing work and that the burden is properly on the trainer to ensure they have not created an infringement machine. This is exactly where that burden should be placed, as the trainer has more information and is the least-cost avoider.
