
OpenAI recently argued before the Delhi High Court that using publicly available data to train ChatGPT does not constitute a commercial activity in itself. The case stems from a lawsuit filed by ANI Media, which alleges that OpenAI used its content without permission to train its AI models.
OpenAI's legal team contended that training a large language model (LLM) is a neutral activity, since the resulting model can be put to both commercial and non-commercial uses. They emphasized that ChatGPT drives traffic to ANI's website and argued that the news agency therefore suffers no commercial harm. Additionally, OpenAI argued that copyright law protects the expression of content, not the underlying ideas and facts.
The court has scheduled the next hearing for May 16, where rejoinder arguments from ANI and other parties will be considered.
This case could have significant implications for AI training and copyright law in India.
What does copyright law say about using public data for AI training?
Copyright law varies across jurisdictions, but generally, it protects the expression of ideas rather than the ideas themselves. When it comes to AI training, the key legal questions revolve around whether scraping publicly available data constitutes copyright infringement and whether AI-generated outputs violate existing protections.
Key Considerations:
- Fair Use & Exceptions: Some countries, like the United States, allow limited use of copyrighted material under fair use, which considers factors like purpose, amount used, and market impact. However, this is often debated in AI contexts.
- Text & Data Mining (TDM) Exemptions: The European Union permits text and data mining of lawfully accessible works but gives copyright holders an opt-out, so AI developers may use publicly available data unless the rights holder has explicitly reserved its rights.
- India’s Copyright Act: Section 52(1)(c) of India's Copyright Act provides exemptions for transient or incidental storage of copyrighted works, which some argue could apply to AI training.
- Legal Challenges: OpenAI is currently facing lawsuits, including one in the Delhi High Court, where ANI Media claims its content was used without permission. Courts are still determining whether AI training qualifies as a commercial activity or falls under research exemptions.
The debate is ongoing, and legal frameworks are evolving to address AI’s impact on copyright.