
"It's Illegal": Canadian Media Sues OpenAI For 'Scraping Large Swaths Of Content' To Train Chatbot

by Tyler Durden
Authored...

Five Canadian media companies are suing OpenAI, alleging that the ChatGPT creator has breached copyright and online terms of use in order to train the popular chatbot.

The joint lawsuit, filed on Friday in the Ontario Superior Court of Justice, follows a similar suit brought against OpenAI and Microsoft in 2023 by the New York Times, which claimed copyright infringement of news content related to AI systems.

The Canadian outlets - which include the Globe and Mail, the Toronto Star and the Canadian Broadcasting Corporation (CBC) - are seeking what could amount to billions of dollars in damages, as they have demanded 20,000 Canadian dollars (US$14,700) for each article they claim was illegally scraped and used to train ChatGPT.

"OpenAI is capitalizing and profiting from the use of this content, without getting permission or compensating content owners," the group said in a Friday statement, adding that they're responsible for the "bulk of Canada’s journalistic content."

The plaintiffs are also seeking a share of OpenAI's profits, as well as a halt to the use of future content.

"OpenAI regularly breaches copyright and online terms of use by scraping large swaths of content from Canadian media to help develop its products, such as ChatGPT," the group said in a statement.

"OpenAI’s public statements that it is somehow fair or in the public interest for them to use other companies’ intellectual property for their own commercial gain is wrong," they added. "Journalism is in the public interest. OpenAI using other companies’ journalism for their own commercial gain is not. It’s illegal."

The lawsuit claims that OpenAI circumvented specific technological and legal tools - such as the Robots Exclusion Protocol, copyright disclaimers and paywalls - which exist in part to prevent scraping or other types of unauthorized use of their published content.

In the NYT vs. OpenAI and Microsoft case, which is currently in discovery, the Times claims the companies similarly broke laws to train ChatGPT, as well as to provide search results.

The Canadian case has a narrower focus: it targets the scraping of data for training, not search results, and does not name Microsoft.

"We believe we have a strong case related to the training of the models. The training of the models is the core of the problem," said Sana Halwani, a partner at the Canadian law firm Lenczner Slaght, which represents the media organizations in the lawsuit, in a statement to the NYT.

The Canadian publishers could find some of their claims easier to prove than others, with copyright infringement being the toughest, according to Lisa Macklem, a lecturer at King’s University College at Western University in Ontario, who is an expert in copyright and media law. -NYT

"While it seems obvious that OpenAI is infringing copyright, it is technically very difficult to prove, and this underscores the immediate and pressing need to have regulations put in place, demanding, at the very least, transparency on what is in the training data of generative AI," said Macklem.

OpenAI's problems don't end there. Over the summer, Elon Musk - who co-founded OpenAI in 2015 but left in 2018 under bad circumstances - sued OpenAI, claiming that two of its founders, Sam Altman and Greg Brockman, breached the company's founding contract by putting commercial interests ahead of the public good.
