Schalast | Copyright and AI
1. Copyright and AI
Copyright law, which protects “works in the literary, scientific and artistic domain” (Section 1 Copyright Act), operated in an analog world until the end of the twentieth century. Then digitalization struck at the central tenet of the system: the author’s exclusive right to reproduce and distribute the work. Earlier forms of reproduction, such as the re-carving of sculptures, clandestine concert recordings on a cassette recorder, or the photomechanical office copier, now appear antiquated in comparison with today’s methods of replication.
The first wave of music file-sharing networks, for instance, marked a momentous change in this environment. In response to digitalization, copyright law has undergone considerable revisions while retaining its fundamental framework.
Today, AI poses no less severe challenges to copyright law. It raises new questions as to who, if anyone, enjoys protection for AI-generated content and who, conversely, may freely use intellectual material that they did not create without asking or paying.
Images of Donald Trump cleaning his prison cell, a song supposedly performed by a popular musician (who never recorded it), or the continuation of a TV show script without an author – AI makes all of this feasible, and much more.
To put it simply, AI is the ability of a machine to perform cognitive tasks that normally require human intellect, such as creativity, logical thinking, planning, or learning. “Generative” AI creates novel content, such as new writings, software programs, musical compositions, or comic book artwork. To do so, it requires “data” (which may include the most significant literary and creative works): the data is collected, extracted, processed, modeled, and interpreted (“training”) in line with the aims and parameters of the AI in question. In the process, unfathomable numbers of copies are made from the original sources, especially on the internet. The replicated material is digitally processed in a way that allows a subsequent “remix” of the material needed, which may yield comparable but unique outcomes, depending on the task. As a rule, the source material will contain masses of works protected by copyright or ancillary IP rights.
2. Input: Text and data mining
Text and data mining is the primary method of information gathering. The law defines it as “any automated analytical technique aimed at analyzing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations” (Article 2(2) Directive (EU) 2019/790; in Germany, Section 44b Copyright Act).
For this purpose, reproductions of publicly available works are declared permissible, albeit with minor reservations. For example, where works are accessible online, the rightholder may declare a reservation of use, which must be machine-readable. Section 60d Copyright Act contains special provisions for purposes of scientific research.
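What a “machine-readable” reservation might look like in practice can be illustrated with a short sketch. Section 44b Copyright Act does not prescribe a format; the example below merely assumes that a website expresses its reservation in a robots.txt file, which is one commonly discussed convention, and whether such a file satisfies the statutory requirement is a legal question the code cannot answer. The crawler name “ExampleTdmBot”, the function name may_mine, and the sample file are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def may_mine(robots_txt: str, crawler_name: str, url: str) -> bool:
    """Return True if the given robots.txt rules do not exclude this crawler.

    Assumption: the rightholder declares the reservation of use via
    robots.txt. The statute itself does not name any particular format.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(crawler_name, url)


# Hypothetical robots.txt: one named mining crawler is excluded,
# all other user agents remain allowed.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleTdmBot
Disallow: /

User-agent: *
Allow: /
"""

url = "https://example.com/essay.html"
print(may_mine(EXAMPLE_ROBOTS_TXT, "ExampleTdmBot", url))  # False
print(may_mine(EXAMPLE_ROBOTS_TXT, "OtherBot", url))       # True
```

A crawler built along these lines would check the reservation before reproducing a work and skip any source that excludes it.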
Because the mining of information that is not, or is no longer, protected does not require an exemption, the term “work” is to be understood in the technical sense (cf. Section 2(2) Copyright Act: “Only the author’s own intellectual creations constitute works within the meaning of this Act.”). Like all other copyright limitations, these two provisions (Sections 44b and 60d Copyright Act) are declared applicable in the individual provisions concerning ancillary IP rights (for example, photographs as opposed to photographic works, sound recordings or broadcasts as opposed to the recorded or broadcast content).
Since 2001, EU law has permitted transitory, ephemeral copying operations that are integral to a technological process and indispensable to digital technologies (Section 44a Copyright Act); this trend is continued by Sections 44b and 60d Copyright Act. AI, however, often necessitates the creation of a stable reference corpus in which data is kept for further analysis.
It is now possible to digitize any work, whether it was previously only available in analog form or is already available online, for the sake of text and data mining. In the classic categories of copyright law, protected components of the mined material are generally “reproduced” (Section 16 Copyright Act) and “altered,” both of which the author could normally prohibit. Due to a special provision in Section 23(3) Copyright Act, however, the prohibition of alteration is declared inapplicable in the context of text and data mining. Nor is there an obligation to credit the author or to pay remuneration.
In the U.S., however, mining publicly available data has come under ethical and legal fire from authors who do not want to put up with their works being mined and processed for the purpose of “training” AI systems. Instead, they want to be asked for permission, credited as sources, and paid for their work. They allege that AI developers covertly “steal” billions of words and images from various sources, including books, blogs, articles, emails, and chats, for the purpose of digesting and “regurgitating” them. They claim the entire system is parasitic and yields nothing but derivative results.
This approach departs from copyright notions in that it disregards whether or not any portion of a work, however little, may be recognizable. The fact that complainants often have no way of knowing whether or not their works have been included in the training materials is a major issue for them. In its lawsuit against Stability AI, Inc., Getty Images (US), Inc. submitted images generated by the defendant’s software on which the Getty Images trademark and “watermark,” though distorted, were still clearly discernible. This is an exception. Generally, to assert (presumed) claims, claimants will likely have to rely on information from the AI developers (where it may seem doubtful whether they even know which contributions have been incorporated in the image and text corpora created by their tools).
The EU is now working on AI legislation that would force developers of generative AI to disclose any copyrighted materials used in their projects. It remains to be seen whether these plans will be enacted and what effort, if any, compliance will require.
3. The output:
a) Is the output protected, and if so, for whom?
It is natural to wonder if AI-generated output falls under Copyright Act protection in the first place. Section 2(2) Copyright Act precludes this on principle since it necessitates a “personal, intellectual creation” (in the same way that patents are only issued for solutions to technical issues that are “based on inventive activity”).
In practice, a text generator such as ChatGPT follows the same logic as the autocomplete feature in browsers, address books, and email clients: it gains “experience” through “training” and chooses the most “probable” continuation based on its findings. In contrast to an address book, however, AI has access to a vastly larger amount of data and data processing capacity.
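The autocomplete logic described above can be made concrete with a deliberately simplified sketch. It is a toy under stated assumptions, not a description of how ChatGPT is built: real text generators rely on neural networks trained on enormous corpora, whereas this example merely counts which word follows which in a one-sentence sample text. The function names and the sample sentence are hypothetical.

```python
from collections import Counter, defaultdict


def train(text: str) -> dict:
    """Count, for every word in the text, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def most_probable_next(counts: dict, word: str):
    """Return the continuation seen most often after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical one-sentence "training" corpus.
corpus = "the author writes the work and the author sells the book"
model = train(corpus)

print(most_probable_next(model, "the"))    # author
print(most_probable_next(model, "and"))    # the
print(most_probable_next(model, "judge"))  # None
```

Even in this toy, the “choice” of the next word is a statistical lookup over the training material rather than a creative decision, which is the point the comparison with autocomplete is meant to convey.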
So, no matter how impressive the end result is, it will not be protected by copyright law if it was created in the black box exclusively by algorithms processing the raw data automatically. This may be different, however, where it is evident that a human’s creative input was vital to the final product. A portrait of two women, for which the artist was awarded the “Sony World Photography Award” in April 2023, is a prime example. The photographer’s entry in the competition was not shot with a camera but created by giving verbal instructions (or “prompts”) to a software program. Once on stage, the winner declined to accept the award. Though valuable as an eye-opener, this instance fails to address the issue of whether or not the finished product, which includes the artist’s specific instructions and personal creativity, should be protected as a new kind of artwork under the Copyright Act.
If and to the extent that human creativity has been reflected in the final product using previously established (analog or digital) procedures, copyright protection is even more relevant. The U.S. Copyright Office recognized the authorship of a comic strip by an artist who had thought up the plot of the comic strip herself and had deliberately altered the images generated by the AI afterwards using an image editing program. The unedited images, on the other hand, were unprotected since they were entirely computer-generated and, in contrast to the human drawing, were “unpredictable.”
The computer programs used in the context of AI enjoy protection as such. This, however, has nothing to do with a possible protection of the results generated by the user with the help of the software. The AI can be used more as a random generator or as an aid in the application of the human intention to design, depending on the preferences of the user, and the programs are only tools in this regard. A protection of the finished product will depend on which parts are the result of human creativity in instructing the machine with respect to the desired result, and which parts are due to machine imitation of the mined and prepared stock of pre-known works.
b) When does the output infringe on prior rights?
It is possible that images, texts, music, etc. generated by AI infringe on protected works contained in the source material, despite the fact that mining, copying, and processing of freely available works and related subject matter is permitted in the EU, especially for the purpose of “training” AI.
Copyright and related rights are protected not only against identical copies, but also against certain variations. The relevant statutory provision is Section 23(1) Copyright Act, entitled “Adaptations and Transformations,” which reads as follows:
“(1) Adaptations or other transformations of a work ... may be published or exploited only with the author’s consent. If the newly created work maintains sufficient distance to the work used, this does not constitute adaptation or transformation within the meaning of sentence 1.”
For some works, the author’s consent is needed not only for publishing or exploitation, but also for the preceding adaptation or transformation, as stated in paragraph (2) of Section 23 Copyright Act.
One of the key challenges that copyright law has to face is determining the scope of protection of a work by interpreting the second sentence of Section 23(1) Copyright Act. To deal with this, metaphors have been coined, such as the idea that the author may object to adaptations and transformations so long as the author’s original intent “shines through” or the original work’s features do not “fade” behind the transformed version.
It is always permissible to use another’s work merely as an incentive or inspiration for independent work.
Results created by or with the assistance of AI are not exempt from these standard guidelines. It is irrelevant whether human or mechanical labor was more important in completing the task, as long as the requisite “distance to the work used” (Section 23(1) sentence 2 Copyright Act) is maintained.
“Tracing back” an AI’s output to a specific earlier work will be extremely difficult because AI characteristically uses and draws on a (potentially huge) quantity of previously known material. This has a disproportionately negative impact on music creators, since melodies and chord sequences often get muddled or even destroyed in the “training” process and usually reappear beyond all recognition. This issue is growing worse as time goes on because of the exponential growth and improvement of current AI systems.
Also, a prompt popular among users asks the AI to imitate the manner or style of a particular human creator, such as writing a poem “in the manner of” a certain author or era, or painting a scene “in the style of” a particular painter, and the results can be astonishingly close imitations. Again, however, the standard rule applies that the manner or style of an artist does not enjoy protection in and of itself; only specific works do. Literary or comic characters, on the other hand, may enjoy protection as such if they are sufficiently idiosyncratic, with the consequence that using such characters in one’s own stories may infringe the copyright in the characters. A “prompt” to place, for example, Asterix and Obelix in a cherry blossom landscape in front of Mount Fuji would probably be open to challenge (unless the limitation in Section 51a Copyright Act applies, according to which another’s work may be used “for the purpose of caricature, parody, and pastiche”).
4. Copyrights and other moral rights
a) Author’s moral rights
Both the author’s financial interest in his work (exploitation rights) and his emotional and intellectual investment (moral rights) are safeguarded by the law. The moral rights include the right to recognition of his authorship and the prohibition of any misrepresentation or denigration of his work. Obviously, the use of AI can also lead to serious impairments in this respect, always provided that the original work remains recognizable.
b) Other moral rights
The rights to privacy, to one’s name, to one’s honor, and to one’s own image are easily infringed by the use of AI.
In addition, German case law has developed a general right of personality based on the constitutional guarantees of human dignity and general freedom.
Personality rights, though not copyrights, may be infringed where AI appropriates and imitates people’s voices. Nowadays it takes little time and little training to clone a voice. Two U.S. rappers, for example, were more than astonished to hear themselves performing a song they neither knew nor had ever performed, let alone as a duet. That would be a blatant violation of their moral rights under German law. Of course, the audience is also being misled. Hundreds of thousands of people listened to the music in a short amount of time before the publisher of one of the artists put a stop to the streaming.