
Generative AI and Copyright Laws: an analysis of the role of law and the boundaries that pre-existing laws set on AI.
Last Updated on April 16, 2025 by Athi Venkatesh
Generative AI (Artificial Intelligence) and copyright law is an ever-evolving subject, and the law is under pressure to update itself so that it does not act as a bar or constraint on the realm of generative AI. This blog sets out to answer certain questions that directly and substantially concern the developers and users of generative AI.
Keywords: Generative AI, pre-curation, datasets, public domain, tweaked, post-curation.
Understanding the process of how AI generates art and artistic work:
An AI's generative process has several layers, and breaking it down into its significant stages helps identify the issues that arise from AI-generated art. The process starts with input from the user, after which the AI carries out pre-curation of the datasets it has collected: the datasets are analyzed to determine whether they are related, significant, or similar to the input given. The probability of copyright infringement arising at this point is quite significant. Consider a situation where the user asks the AI to generate a picture in which a black mouse with big ears and red pants, together with a white duck in a blue sailor's uniform, meets a private detective in a long trench coat, smoking a pipe.
The data matched to the mouse and the duck describe copyrighted characters that, despite a legacy of nearly a century, are still protected by copyright, while the private detective is also an established character but one that has already entered the public domain. Would the generated image amount to copyright infringement?
The AI tweaks the given input and generates images. In the post-curation stage, the generated images are processed and presented as a new dataset from which the user can select among the many options on offer. At the output stage, we face the dilemma of who bears the ownership, authorship and liability for the generated image: the user who gave the input, or the AI that generated an image similar to pre-existing copyrighted art? The answers to these questions frame the rest of this blog.
Understanding text and data mining (TDM):
Text and data mining is a process mostly undertaken by researchers, students, and others who, when producing a literary or artistic work, refer to and learn from previously published works on the same subject. It is usually done by visiting large numbers of webpages, a practice also known as website crawling or website scraping. The webpages are analyzed so that the analysis reveals a trend or pattern across the dataset that was not apparent in any of the individual sources. The same process is used to train AI to track and compile datasets, which has become a contested issue internationally.
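The pattern-finding step described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the three "pages" are hypothetical strings standing in for crawled webpages (no real crawling is done), and the function names `tokenize`, `document_frequency` and `shared_terms` are invented for this example. It shows how a trend that no single page states, namely which terms recur across sources, can emerge from analyzing the pages together.

```python
import re
from collections import Counter

# Hypothetical stand-ins for pages a crawler might have collected.
SAMPLE_PAGES = {
    "page_a": "Copyright protects original artistic work.",
    "page_b": "Fair use may excuse copying of artistic work.",
    "page_c": "Training data often includes copyrighted artistic images.",
}

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"of", "may", "the", "a", "often"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequency(pages):
    """Count in how many pages each non-stopword term appears."""
    counts = Counter()
    for text in pages.values():
        # A set ensures each term is counted once per page.
        counts.update(set(tokenize(text)) - STOPWORDS)
    return counts


def shared_terms(pages, min_pages=2):
    """Return, in sorted order, terms found in at least min_pages pages."""
    frequency = document_frequency(pages)
    return sorted(term for term, n in frequency.items() if n >= min_pages)


print(shared_terms(SAMPLE_PAGES))  # ['artistic', 'work']
```

Even in this toy case, the analysis reads every page in full in order to produce its result, which is why the legal question of whether such access infringes the underlying works arises at all.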
Getty Images v. Stability AI:
The main contention here is that Stable Diffusion, Stability AI's model, was trained on stock images provided by Getty Images and produces stock-style images of its own, which Getty Images claims infringes its copyright. The claim rests on speculative and largely untested grounds, and leading IP experts and lawyers say that the court's decision will turn on the question of “fair use” under copyright law.
Text and Data Mining in India:
India, being a party to various conventions that protect the rights of authors and artists in their literary and artistic works, frames its copyright laws so that the work of the author or artist is not infringed. Applying the Indian law first requires determining whether the work in question is an artistic or literary work at all. The Supreme Court, in Eastern Book Company v. D.B. Modak, adopted the “Modicum of Creativity” test from the Supreme Court of the United States, under which the mere “co-ordination or arrangement of pre-existing work” is not considered an original artistic or literary work.
The effort of the author or artist must produce something more than an unoriginal version of the pre-existing work. The Court held that the editorial notes and footnotes generated by the company qualify as protected literary works, whereas raw judgments that are merely numbered and arranged in a certain manner do not attract copyright, because the judgments are not the company's original work and granting copyright in them would deprive other reporters of the opportunity to report on the same judgments. The relevant sections of the Act apply only once the work in question passes the Modicum of Creativity test.
The process of TDM can certainly infringe the datasets accessed, under section 51 read with the exclusive rights in section 14 of the Indian Copyright Act of 1957. However, the defence would mostly rest on section 52 of the same Act, arguing that the use is “fair use”. The Delhi High Court, in University of Oxford v. Rameshwari Photocopy Service, made it imperative that for a use such as TDM to be considered “fair use”, it must not “unreasonably infringe the authorship/ownership of the artist/author’s work”; the amount of text and data accessed is irrelevant so long as that condition is met.
Evolution of the Indian Copyright Act and the future of AI – Machine learning in India:
Section 2 of the Copyright Act of 1957 interprets various terms used in the Act, and clause (d) defines “author”. Sub-clause (vi), “in relation to any literary, dramatic, musical or artistic work which is computer-generated, the person who causes the work to be created”, makes it sufficiently clear that the Act does not recognize an AI, or any other computer capable of generating such works, as a “person”. On this interpretation, the user who prompted the AI is the one entitled to claim copyright under the Act.
Despite this strict restriction on computer-generated work, the Copyright Act was challenged in 2021, when an IP lawyer, Sahni, and R.A.G.H.A.V, a machine learning AI operating with no prompt, were granted copyright registration as co-authors. Indian courts have finally decided to explore in depth whether a machine learning AI can hold authorship of a copyrighted work. The case is still being heard. Recognition of machine learning AI would open up vast options and incentives in the field of AI, and with them a new set of regulations to govern it.
CONCLUSION:
The evolution of artificial intelligence and machine learning is inevitable. Even first-world nations are still adapting to this development, laying the groundwork and slowly updating their laws as required. AI platforms are collaborating with various companies to find a solution to the issue of AI-TDM infringement: NVIDIA and Getty Images are now developing an AI that is trained only on Getty Images datasets, so that the copyright in the images stays within the ambit of Getty Images.
The law plays a pivotal role in both the development of AI and the labor of the artist. Lawmakers should keep in mind that restricting the growth of AI might hamper the overall development of other industries to which AI has been a major contributor, and should frame regulations on the use of such resources accordingly. The laws on text and data mining should be developed with artificial intelligence in view. One can only hope for such a drastic change, perhaps when the courts decide the case of R.A.G.H.A.V, the machine learning AI.
REFERENCES:
- Borghi, M. (2020) Text & data mining, CopyrightUser. Available at: https://www.copyrightuser.org/understand/exceptions/text-data-mining/ (Accessed: 13 December 2023).
- Vincent, J. (2023) Getty Images sues AI art generator Stable Diffusion in the US for copyright infringement, The Verge. Available at: https://www.theverge.com/2023/2/6/23587393/ai-art-copyright-lawsuit-getty-images-stable-diffusion (Accessed: 13 December 2023).
- Sarkar, S. (2022) Exclusive: India recognises AI as co-author of copyrighted artwork, MIP. Available at: https://www.managingip.com/article/2a5czmpwixyj23wyqct1c/exclusive-india-recognises-ai-as-co-author-of-copyrighted-artwork (Accessed: 13 December 2023).
- Joshi, D. and Gour, P. (2021) Crawl cautiously: Examining the legal landscape for text and data mining in India – part I, Spicyip. Available at: https://spicyip.com/2020/06/crawl-cautiously-examining-the-legal-landscape-for-text-and-data-mining-in-india-part-i.html#:~:text=This%20position%20was%20reaffirmed%20by,the%20boundaries%20of%20Section%2052. (Accessed: 13 December 2023).
- Getty Images v. Stability AI
- Eastern Book Company v. D.B. Modak
- University of Oxford v. Rameshwari Photocopy Service