Authorless AI-assisted productions: Recent developments impacting their protection in the European Union

  1. LL.M. Marta Duque Lizarralde
  2. M.Sc. Christofer Meinecke


The question of whether AI-generated works can be protected by copyright has become a hot topic over the last few years. However, “AI-generated works”, at least as currently defined in some policy and legal texts, do not exist. This article seeks to explain how machine learning and natural language processing, which are two subfields of Artificial Intelligence, are used in the creative process. It then outlines the obstacles that works created with AI face in order to be classified as protectable subject matter. After that, it briefly analyses whether such works can be protected by existing related rights and concludes by discussing the arguments put forward in the academic literature in favour of the creation of a new exclusive right to encourage investment in “creative AI”.


1. Introduction*


In the report on intellectual property rights (IPRs) for the development of artificial intelligence (AI) technologies, published in October 2020, the European Parliament (EP) stressed that “the growing autonomisation of certain decision-making processes can give rise to technical or artistic creations”. [1] Therefore, “assessing all IPRs in the light of these developments must be a priority for this area of EU law”. [2] Such an assessment is likely to address, among other things, whether AI-generated outputs can be protected by IPRs. Should AI-generated results be protectable under IP, the next question would be whether an AI system could be recognised as the ‘author’ or the ‘inventor’ of such results. If not, it is necessary to discuss whether changes in the IP system are needed to encourage investment in AI technology. This article focuses on the authorship claims. [3]


Although today's AI systems deliver far greater functionality and capabilities than software from the 80s [4], current discussions focus on the wrong question, that is, whether AI systems, without human intervention, are capable of creating copyrightable results. Instead, the real question should be whether creations generated with AI, where the human contribution is not of an original nature, are protectable. [5]


This article aims to explain the role of AI in the creative process and the main obstacle to the eligibility of AI creations for copyright protection, i.e., meeting the requirement of originality. It also briefly discusses why some states’ regulations do not address this issue satisfactorily. Next, it analyses whether such creations can be protected by existing related rights, or whether the creation of a new related right is needed for their protection.

1.1. Artificial Intelligence and the cultural industry


The current surge in AI development began in 2013. [6] Several factors triggered the boom, including the increase in ICT R&D funding, which allowed for greater availability of computing power and connectivity, the enormous production of large volumes of data, and the improvements in algorithms. [7]

1.2. Examples of Artificial Intelligence systems used in the cultural industry


Various AI systems are used in the cultural industry. The most cited project so far is ‘The Next Rembrandt’, based on 168,263 pictorial fragments from 346 of the painter’s works. To identify and classify the most common Rembrandt patterns, a facial recognition algorithm and a deep learning system were used. The result was then printed in 3D with more than 149 million pixels and in several layers to resemble an oil painting. [8] Other examples of well-known systems are ‘Flow Machines’, a system that generates melodies from a database of 13,000 roadmaps of different genres [9]; or ‘Tencent Dreamwriter’ [10], ‘Automated Insights natural language generation (NLG)’ [11], and ‘Editor’ [12], AI systems that operate in the field of ‘automated’ or ‘robojournalism’. But there are many more. For instance, platforms such as ‘Artbreeder’ [13] allow the collaborative creation of new images by modifying existing ones and combining their style using neural networks; or systems such as ‘GhostWriter’ enable the creation of books from an initial story outline. [14]

1.3. Fundamentals on the functioning of Artificial Intelligence

1.3.1. Definition of “Artificial Intelligence”


There are different definitions of AI. For the purposes of this article, the authors will follow the World Intellectual Property Organization (WIPO) definition, according to which AI is “a discipline of computer science that is aimed at developing machines and systems that can carry out tasks considered to require human intelligence”. [15] It is important to note, however, that the WIPO definition refers to ‘human intelligence’, which conflicts with the definition applied by most AI researchers, who focus instead on ‘intelligent agents’, precisely to avoid the problem of measuring ‘human intelligence’. [16] In any case, the goal of AI is to automate and accelerate the performance of an intellectual task, traditionally performed by humans, through systematisation. The tasks that AI systems can accomplish are becoming progressively more complex, but their purposes remain limited. Since current AI systems can only perform specific tasks, they belong to the category of narrow AI, but not to the category of ‘artificial general intelligence’ (AGI), which would encompass systems that can undertake any intellectual endeavour. The latter remains in the realm of science fiction. [17]

1.3.2. Machine Learning


Machine learning (ML) is the most prominent subfield of AI. It aims to create pattern-recognition models that ‘learn’ to make predictions about new data by adjusting to previous data. [18] There are three main types of ML: supervised, unsupervised, and reinforcement learning. In supervised learning, the system is trained with labelled data and must be able to apply this knowledge to recognise the labels in a new dataset. This requires that the correct labels are provided in the first place. [19] By contrast, unsupervised learning involves providing unlabelled training data samples with the goal of discovering the hidden structure underlying the data. [20] The quality and size of the training dataset are crucial to the success of both learning processes. [21]
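
The difference between the two settings can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the toy data, the label names, and the function names are invented here, and a nearest-centroid classifier and a two-cluster k-means stand in for the far larger models used in practice.

```python
from statistics import mean

# Hypothetical toy data: (feature, label) pairs invented for illustration.
labelled = [(1.0, "short"), (1.2, "short"), (0.8, "short"),
            (5.0, "long"), (5.3, "long"), (4.7, "long")]

def train_supervised(samples):
    """Supervised: learn one centroid per label from labelled data."""
    by_label = {}
    for x, label in samples:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def predict(centroids, x):
    """Assign the label whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def cluster_unsupervised(values, iterations=10):
    """Unsupervised: two-cluster k-means finds groups but cannot name them."""
    centres = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Keep the old centre if a group happens to be empty.
        centres = [mean(g) if g else c for g, c in zip(groups, centres)]
    return groups

centroids = train_supervised(labelled)
new_label = predict(centroids, 4.9)
groups = cluster_unsupervised([x for x, _ in labelled])
```

In the supervised case the labels ‘short’ and ‘long’ are supplied by a person beforehand; in the unsupervised case the same values are merely split into two unnamed groups.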


One example of unsupervised learning is ‘generative modelling’. Generative modelling has become more prominent recently, as two deep learning (DL) [22] techniques called ‘variational autoencoders’ and ‘generative adversarial networks’ enabled major breakthroughs in terms of creative content creation. [23] It must be recalled, however, that there is nothing magical in the functioning of creative AI systems. These systems simply perform mathematical operations, previously programmed, to learn a latent space from the data they are trained on. The latent space can be defined as “an abstract multi-dimensional space that encodes a meaningful internal representation of externally observed events”. [24] In this space, similar data entries are placed close to each other and, by sampling it, these systems produce new works with similar characteristics. [25]


For example, a Variational Autoencoder (VAE) is a combination of an encoder and a decoder network that learns a general encoding from an unlabelled dataset. The encoder maps the input data to a latent space and the decoder tries to map the representation in the latent space back to the input data. The VAE learns a continuous latent space from the input data. To achieve this, the encoder outputs two encodings for each input, a mean and a standard deviation, from which the latent representation is sampled. This leads to different encodings for the same input data. As a result, the decoder learns to associate a specific input sample with an area in the space instead of a single point. Further, the training process minimises the differences between the areas of different training samples in the latent space in order to allow arithmetic on them to generate new features, e.g., adding an accessory to a person in an image, or combining faces of celebrities. [26]
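
The sampling step described above can be sketched in a few lines of Python. This is not a trained VAE: the functions `encode` and `decode` are hypothetical, hand-fixed stand-ins for neural networks, and the numbers are assumptions chosen for illustration. The sketch only shows how a mean and a standard deviation turn one input into an area of the latent space rather than a single point.

```python
import random

def encode(x):
    """Hypothetical, untrained encoder: returns a mean and a standard deviation."""
    mean = 0.5 * x
    std = 0.1
    return mean, std

def sample_latent(mean, std, rng):
    """Draw a latent point from the area described by mean and std."""
    return mean + std * rng.gauss(0.0, 1.0)

def decode(z):
    """Hypothetical decoder: maps a latent point back to data space."""
    return 2.0 * z

rng = random.Random(0)          # fixed seed so the sketch is repeatable
mean, std = encode(4.0)

# The same input yields two different latent encodings...
z1 = sample_latent(mean, std, rng)
z2 = sample_latent(mean, std, rng)

# ...and both decode to values near the original input.
reconstructions = [decode(z1), decode(z2)]
```

Because every point in the neighbourhood of `mean` decodes to something plausible, new outputs can be produced by sampling or combining latent points, which is the property the arithmetic on latent areas relies on.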


Generative adversarial networks, in turn, are a set of algorithms that aim to make two neural networks compete to learn and evolve. Both networks are trained with the same dataset, but the first, generative network must create variations of the data and produce a creative result that looks genuine. This output will be analysed by a second, discriminative network to determine if it is part of the original training dataset or a fake output. Depending on its quality, the discriminative network will give it a score on a scale of 0 to 1. If the score is low, the generative network corrects the result and forwards it to the discriminative network. The generative network then repeats the cycle until it creates high-scoring results. In this way, images and sounds with a high degree of realism [27], or even levels for video games, are produced. [28]


Lastly, in reinforcement learning, the system must achieve a certain goal and receives penalties or rewards for its performance, the goal being to maximise the total reward. [29] It has been an area of great success in training AI systems for playing games, as illustrated by the example of AlphaGo defeating a professional human Go player. [30]
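
The reward-driven loop can be illustrated with a minimal sketch of tabular Q-learning, one common reinforcement learning algorithm. It is unrelated to the far more elaborate methods behind AlphaGo. The five-cell corridor, the reward of 1 at the goal, and the values of `ALPHA` and `GAMMA` are assumptions made for this example.

```python
import random

N_STATES = 5                 # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor (assumed values)

def step(state, action):
    """Move one cell; reaching the goal yields reward 1, anything else 0."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)      # explore by acting at random
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            target = reward + GAMMA * best_next
            # Nudge the estimate towards the observed reward plus future value.
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = nxt
    return q

q_table = train()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves towards the goal from every cell, although the system was never told which direction is correct; it only received rewards.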

1.3.3. Natural Language Processing


Another subset of AI worth mentioning is Natural Language Processing (NLP), which is used, among other things, for machine translation, text summarisation and the creation of texts, which can be short, as in the case of answers in chatbots; but also longer, as in the case of passages in articles and reports on events. [31] NLP is an area that, as its name suggests, deals with processing natural languages. This processing entails the translation of natural language into numerical data that a computer can use to learn. [32] NLP relies on unstructured data, which can be more challenging to interpret. [33] However, structured data, such as semantic lexicons or linguistic rules, can be applied to inject domain knowledge into a model, e.g., word relations. [34] Processing the text consists of several stages. First, the text is converted into a format that computers can process. To do this, several steps must be taken. In the first place, the text is analysed and divided into several pieces, which is called tokenisation. Subsequently, the text is normalised, that is, converted into a form that is easier to process, for example by removing punctuation marks or contractions. The next step would be to remove affixes, such as prefixes and suffixes, known as stemming, and to reduce a word to its base form to group the different existing forms of the same word, that is, to lemmatise. The system must then understand the overall meaning of the text. For this, there are different techniques, and DL is frequently employed. As a result of the process, the system must be able to discover hidden structures in sets of texts or documents. [35]
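
The pre-processing stages named above can be sketched with the Python standard library alone. The suffix list, the mini-lexicon `LEMMAS`, and the example sentence are hypothetical and deliberately tiny; production systems rely on much larger resources and dedicated libraries.

```python
import re

# Hypothetical mini-lexicon for lemmatisation; real systems use large dictionaries.
LEMMAS = {"was": "be", "were": "be", "is": "be", "better": "good", "mice": "mouse"}
SUFFIXES = ("ing", "ed", "s")   # deliberately tiny suffix list

def tokenise(text):
    """Tokenisation: split text into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

def normalise(tokens):
    """Normalisation: lower-case tokens and drop punctuation."""
    return [t.lower() for t in tokens if t.isalnum()]

def stem(token):
    """Stemming: strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def lemmatise(token):
    """Lemmatisation: look up the dictionary form, else keep the token."""
    return LEMMAS.get(token, token)

text = "The mice were painting pictures."
tokens = tokenise(text)
clean = normalise(tokens)
stems = [stem(t) for t in clean]
lemmas = [lemmatise(t) for t in clean]

# Translation into numerical data: count each stem over a fixed vocabulary.
vocabulary = sorted(set(stems))
vector = [stems.count(word) for word in vocabulary]
```

Stemming and lemmatisation give different results here: the stemmer turns ‘painting’ into ‘paint’ by rule, whereas only the lexicon knows that ‘mice’ is a form of ‘mouse’. Both the rules and the lexicon are supplied by people.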


The development of AI “creative” systems requires significant investment. With the aim of protecting and recovering this investment, it has been proposed to protect the results generated with AI through exclusive rights. The first question in this regard is whether these creative outputs would be eligible for copyright protection.

2. Copyright

2.1. Protectable subject-matter


The object of copyright protection is the work, which is the formal expression of an idea or feeling communicated to the public. The work is an immaterial good, so the object of protection is the form, the expression, but not its tangible medium or the ideas it comprises. [36]


For a work to be eligible for copyright protection, it must be original. [37] There is no rule at international or EU level defining what is meant by originality. At the EU level, however, the Court of Justice of the European Union (CJEU) has specified that a work is original if it is “the author’s own intellectual creation”, which “is manifested by the author’s free and creative choices”. [38] This requires the existence of a field of choice, which means the requirement of originality is not met when the result is dictated by technical considerations, rules, or other subject-matter constraints which leave no room for creative freedom. [39] In addition to this, although not explicitly stated, it follows from the case law of the CJEU, the provisions of the Berne Convention [40], and some of the EU copyright directives [41] that the author must be a natural person.

2.2. Demystifying the role of Artificial Intelligence in the creative process


Following the academic debate, a distinction must be made here between AI-assisted works and AI-generated works. According to WIPO, ‘AI-assisted works’ are those “that are generated with material human intervention and/or direction” [42], while ‘AI-generated works’ “refers to the generation of an output by AI without human intervention. In this scenario, AI can change its behaviour during operation to respond to unanticipated information or events”. [43] Nonetheless, these definitions do not reflect the state of the current technology, since AI systems are still not capable of producing results autonomously, i.e., without any sort of human intervention.


In ML development, human involvement is needed in distinct phases and has a significant impact on the results. First, the training data is chosen and pre-processed by practitioners. This may include actions that require domain knowledge, for example, to exclude specific information or samples from the data that could impair the training. In the case of supervised learning approaches, the labelling of the data must also be performed by professionals with expertise in the field, although this task can be supported by an ML algorithm in a human-in-the-loop process. [44] Before training the ML model, programmers set the hyperparameters, which are those parameters that do not change during training. The first step in this regard is to design the architecture of the model, i.e., its structure. Each model is suitable for different sets of tasks, so establishing the architecture also requires expertise. [45] Subsequently, practitioners also decide on the learning rate and the algorithms used for the optimisation and regularisation of the trainable parameters of the model. Trainable parameters, unlike hyperparameters, are adjusted to better fit the data as the training dataset is analysed. To assess whether training the model is successful, a loss/cost function must be established beforehand as well. After training, decisions such as output and model selection further influence the final results. [46] It is important to keep in mind that at each step of the human intervention a bias is induced in the model in addition to the bias already present in the original data. It is also relevant to clarify that these steps are not all performed by the same person; rather, multiple actors are involved. Moreover, once the model has been trained, it can be applied by users completely unrelated to the training process.
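
The distinction between hyperparameters, trainable parameters, and the loss function can be made concrete with a minimal sketch. It assumes a one-variable linear model trained by plain gradient descent on hypothetical data; the constants `LEARNING_RATE` and `EPOCHS` are illustrative choices, not recommendations.

```python
# Hyperparameters: chosen by the practitioner and fixed during training.
LEARNING_RATE = 0.05
EPOCHS = 500

# Hypothetical training data following y = 2x + 1, selected by a person.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def loss(w, b):
    """Mean squared error: the cost function established before training."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train():
    w, b = 0.0, 0.0    # trainable parameters, adjusted to fit the data
    for _ in range(EPOCHS):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= LEARNING_RATE * grad_w
        b -= LEARNING_RATE * grad_b
    return w, b

w, b = train()
```

Every line above the training loop reflects a human decision: which data to use, which model structure to fit, which loss to minimise, and how fast to learn. Only `w` and `b` are determined by the algorithm.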


In NLP, a subfield of particular relevance to our analysis is Natural Language Generation (NLG), which deals with the processing of unstructured data into human-readable text. The process of automated text generation entails various stages. First of all, as data often comprises more information than needed, the content to be produced must be delimited (content determination); then the data structures are arranged to create the narrative structure and the documentation plan (document/discourse planning). Next, data are analysed and contextualised, often using ML (data interpretation). This involves the selection of phrases and words to express the domain-specific concepts and relationships in the texts (referring expression generation and lexicalisation). Subsequently, it must be ensured that the entire text adheres to the correct grammatical form, spelling, and punctuation (grammaticalisation/linguistic realisation). And finally, the data is entered into the appropriate templates to check that the output is correctly formatted (language implementation). Human involvement in this process remains significant, although a number of tools exist that are useful for automating individual steps. [47]
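
Some of these stages can be sketched as a small template-based generator. The match record, the field names, the verb thresholds, and the templates are all hypothetical and were chosen for this example; they are not taken from any of the systems mentioned in this article.

```python
# Hypothetical input record for a short match report.
match = {"home": "Leipzig", "away": "Munich", "home_goals": 3,
         "away_goals": 1, "attendance": 41000, "referee_id": 17}

def determine_content(record):
    """Content determination: keep only the fields the report needs."""
    wanted = ("home", "away", "home_goals", "away_goals", "attendance")
    return {key: record[key] for key in wanted}

def plan_document(content):
    """Document planning: fix the order in which messages appear."""
    return ["result", "attendance"]

def lexicalise(content):
    """Lexicalisation: choose a verb that fits the goal difference."""
    margin = abs(content["home_goals"] - content["away_goals"])
    if margin == 0:
        return "drew with"
    return "beat" if margin < 3 else "thrashed"

def realise(plan, content):
    """Linguistic realisation: fill the templates in the planned order."""
    home_first = content["home_goals"] >= content["away_goals"]
    first, second = ("home", "away") if home_first else ("away", "home")
    templates = {
        "result": "{a} {verb} {b} {ga}-{gb}.",
        "attendance": "{n:,} spectators attended the match.",
    }
    values = {"a": content[first], "b": content[second],
              "ga": content[first + "_goals"], "gb": content[second + "_goals"],
              "verb": lexicalise(content), "n": content["attendance"]}
    return " ".join(templates[message].format(**values) for message in plan)

content = determine_content(match)
report = realise(plan_document(content), content)
```

The choice of fields, the ordering of messages, the verb thresholds, and the wording of the templates are all made by people in this sketch, which is where room for free and creative choices may or may not exist.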

2.3. “AI-assisted works” vs. “Authorless AI-assisted” productions


From what has been discussed so far, we can conclude that human intervention in the different phases that predetermine the outcome is still relevant. Consequently, the creations that are called ‘AI-generated’ are in fact ‘AI-assisted’. In many works the human contribution to the final result is not only relevant, but also original, and therefore copyrightable. [48] This would be the case, for example, of ‘The Next Rembrandt’. [49] However, there are some outputs, such as initial translations performed by DeepL [50], in which the human input may not be of an original nature, although the results are still linked to pre-existing data and parameters provided by the AI developers. Such outputs are therefore not copyrightable. [51] Nonetheless, these outputs are not ‘AI-generated’ in the terminology used by WIPO, and two more accurate terms for this type of existing creations that do not deserve copyright protection are ‘Authorless AI-assisted productions’ and ‘Authorless AI outputs’, adopted in the ‘Trends and developments in AI final report’. [52] This report explains that there are three stages in the creative process of a work, namely conception, execution, and redaction. It also indicates that even if automated translators, such as DeepL, generate nearly usable results, some human intervention by the user in the redaction phase is still needed to turn the outputs into workable translations. Thus, if a natural person, based on the initial translation, which would not be protectable, makes further modifications, such as rephrasing words and changing the order of parts of the text, the result may be eligible for copyright protection. [53]


In the same vein, Trapova and Mezei argue that when NLG is employed in the field of robojournalism, at least in the phases of discourse planning and lexicalisation there is room for expressing the free and creative choices of individuals. Hence, the resulting outcomes may be protected. [54] Nevertheless, as these authors correctly observe, there are reports that, even if written by individuals, would not merit protection because the requirements regarding their presentation leave no margin for “originality”. [55] In these cases, it makes no difference whether or not AI has been used to produce the text.


In short, to determine whether a result generated with AI is copyrightable, its creation process must be examined. There is no general rule; depending on the steps required to develop a particular project, as well as its domain of application, the type of human involvement in the different stages may or may not be of an original nature. Therefore, on a case-by-case basis, there may be one person, several, or none at all who qualifies as the author.


This idea is developed by Deltorn and Macrez in their analysis of the role of AI (especially DL) and authorship claims in the music industry. [56] In line with the previous discussion in this section, these authors point out that the functioning of DL systems relies on a series of human decisions made before, during and after the training of the model. The more difficult question then becomes whether there is an author according to the role of the different actors in the generative process, as well as the interactions between humans and the generative model in question. [57] When creating music compositions with AI, there is space to shape the output either by selecting the training dataset; by modifying the model parameters while interacting with it; or by iteratively guiding the selection of the output through the selection of various parameters, as in the case of ‘Flow Machines’. [58] But the fact that this space exists does not mean that ‘free and creative choices’ are always expressed. As this depends on the specific case, the question of whether works created with AI are copyrightable has lawyers frequently answering: “it depends”.

2.4. Existing legislation on “computer-generated works”


Yet, some legal systems (Ireland, the UK, New Zealand, South Africa, India and Hong Kong) have special rules for ‘computer-generated works’, described as those generated where “there is no human author” [59] or “the author is not an individual.” [60] Through a legal fiction, they grant the copyright of these works to “the person by whom the arrangements necessary for the creation of the work are undertaken” [61] or “the person who causes the work to be created”. [62] While some advocate that this model is the best available, and should be adopted in more jurisdictions, [63] the issue is not satisfactorily addressed. A regulation that allows copyright to be granted to different persons on a case-by-case basis provides the necessary flexibility in this context. However, the vagueness of the terms “making the necessary arrangements” or “carrying out the creation of the work” is a point of criticism, as they are unclear as to what specific actions would be required to obtain copyright, thus requiring further interpretation. [64] Furthermore, these regulations classify as a ‘work’ a creation whose creative process is not original, and therefore must not be protected. [65] In fact, protecting ‘Authorless productions’ by copyright is not optimal. [66]

2.5. Possible ways forward


Some have suggested a reinterpretation of the concept of originality to protect such creations as long as they meet a certain degree of creative level and novelty. [67] The European Parliament, in the above-mentioned report, has also proposed an assessment of the advisability “of granting copyright to such a creative work to the natural person who prepares and publishes it lawfully, provided that the designer(s) of the underlying technology has/have not opposed to such use”. [68] Nevertheless, this would contradict not only the current prevalent opinion in the academic community [69], but also the contemporary conception of copyright in the EU. The latter statement is particularly relevant considering that the CJEU in the Levola v. Smilde case reiterated the above-mentioned subjective criteria for assessing originality and ruled that the concept of a work “must normally be given an autonomous and uniform interpretation throughout the European Union”. [70]


The European Commission (EC) has also addressed the topic in the Communication “Making the most of the EU’s innovative potential. An intellectual property action plan to support the EU’s recovery and resilience”, published on 25 November 2020. [71] The EC followed the conclusions of the above-mentioned “Trends and developments in AI final report” and acknowledged that “creations autonomously created by AI technologies are still mostly a matter for the future”, concluding that “AI systems should not be treated as authors”. It also affirms that “the EU IP framework appears broadly suitable to address the challenges raised by AI-assisted creations”, but maintains that there are gaps in harmonisation and margin for improvement, so dialogue with stakeholders is needed. [72]

3. Related Rights

3.1. Protection granted by existing related rights


Some have argued that authorless creations could be protected by certain related rights, such as the rights of phonogram producers [73], film producers [74], broadcasting organisations [75], publishers of press publications [76], and non-original photographs. [77] The reason is that these rights do not require originality or human authorship. [78] However, others claim that these rights are likewise conceived for human beings, and that legislative reform would be necessary to adapt their ownership. [79] In addition, it is also maintained that in most cases, authorless creations do not meet the requirements for protection set by the related rights under which they are purported to be protected. [80]


More controversial is the question of whether authorless AI-assisted databases are protectable by the sui generis database right. For a database to be protected by this right, there must be substantial investment, quantitative or qualitative, either in obtaining, verifying, or presenting the content of the database. [81] Conversely, investment in the creation of data does not lead to protection. [82] In some cases it may be very cumbersome to determine whether the cost incurred by a legal database producer in developing and applying AI technology amounts to a substantial investment in data creation or collection. Even assuming that in this case the substantial investment is made in the collection of existing data, it might not be desirable for AI-generated data to be protected by the sui generis right. It has rightly been pointed out that in such a rapidly changing context, where new databases are constantly being produced, the risk is that protection may become perpetual, which could lead to anti-competitive effects. [83] Nevertheless, when AI is used to verify or present existing data, the result may be protected by the sui generis database right. [84]


Further research on this topic is indeed needed. What seems certain, however, is that those authorless creations that do not come within the scope of the existing related rights are unprotected and would fall into the public domain. [85] That said, the idea of authorless creation falling into the public domain is rejected by part of the academic community, and the introduction of a new related right is instead suggested. [86]

3.2. Creation of a new related right


Yet, the creation of a new related right may not be the best approach. To date, there is neither an economic nor a theoretical justification (e.g., deontological or naturalistic) supporting the view that this related right would incentivise the creation of authorless AI-assisted productions, instead of producing saturation in the market. [87] Furthermore, although most jurisdictions do not have copyright or other exclusive rights to protect these productions, the development of AI, including creative AI, is in full swing. [88] Moreover, regardless of the protection of the results created by AI, those who use it as a tool to create content can benefit from first mover advantages. [89] Finally, sufficient tools are already available to those who employ creative AI systems to protect their results, such as trade secrets, factual control, and unfair competition. [90] Rather than initially envisaging the creation of new exclusive rights, consideration should be given to the potential of these tools to provide adequate protection, and to whether further harmonisation, for example in the area of unfair competition, would be desirable.

4. Conclusions


In recent years, the debate on how to protect ‘AI-generated works’ has become a hot topic. However, it should also be noted that nowadays AI systems belong to the category of narrow AI, as they can only perform specific tasks, and artificial general intelligence (AGI) is still science fiction. As highlighted by François Chollet, creator of Keras [91], “AI isn’t anywhere close to rivalling human screenwriters, painters, and composers. But replacing humans was always beside the point: AI isn’t about replacing our own intelligence with something else, it is about bringing into our lives and work more intelligence, intelligence of a different kind. In many fields, but especially in the creative ones, AI will be used by humans as a tool to augment their own capabilities, more augmented intelligence than artificial intelligence”. [92]


Many so-called ‘AI-generated works’ are actually ‘AI-assisted works’, in which human involvement in various stages of their creation remains relevant and original. Therefore, they do not raise concerns in terms of copyright protection. AI systems cannot generate works autonomously, without any human intervention. Hence, the discussion should focus on how, and whether it is desirable, to protect those AI productions in which a natural person's contribution to the final result is not original.


Definitions of ‘AI-generated works’, such as the one adopted by WIPO, do not reflect the current state of AI technology. Hence, a first step to progress in this debate is to strengthen the dialogue between the technical and legal sectors, and thus create a win-win situation for all. On the one hand, AI developers must have a proper IP strategy that allows them to make profits. On the other hand, those in the legal world must understand the technology and the market in order to advise on and regulate it, based on factually correct premises.


Copyright is not a suitable means for protecting authorless results. This is because they cannot meet the subjective criterion used by the CJEU in examining originality, nor the requirement that the author must be human, which is presupposed in some provisions of the Berne Convention and in some European directives.


Although some argue that authorless creations could be protected by certain related rights, further research is needed on this issue. In any case, introducing a new related right to protect authorless creations is not the best solution. Those using creative AI systems may already have sufficient tools to protect their results.



*Marta Duque Lizarralde, LL.M., is a Research Associate and doctoral candidate at the Technical University of Munich, Germany; Christofer Meinecke, M.Sc., is a Research Assistant and doctoral candidate at Leipzig University, Germany.


JIPITEC – Journal of Intellectual Property, Information Technology and E-Commerce Law
