OpenAI and Microsoft ironically accuse DeepSeek of copyright infringement — training its cost-effective model with privileged data

DeepSeek has been accused of using OpenAI's models to train a competing model. (Image credit: Getty Images | Bloomberg)

DeepSeek took the AI world by storm this month, but the company now faces accusations of using data without permission. OpenAI claims to have evidence that DeepSeek leveraged OpenAI's models to train a competing AI model. If true, this would violate OpenAI's terms of service.

OpenAI told the Financial Times that it has seen evidence of DeepSeek using "distillation" to train AI models. In this context, the term refers to using the outputs of an existing model to train a newer one. Distillation reduces the cost of creating a model because it builds on the work already done for the "teacher model."
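
To make the idea concrete, here is a minimal sketch of distillation in Python. It is purely illustrative and does not represent how DeepSeek or OpenAI train their models: the `teacher` function is a hypothetical stand-in for an existing model, and the "student" is a tiny linear model that only ever sees the teacher's outputs.

```python
import random


def teacher(x: float) -> float:
    """Hypothetical stand-in for an existing, already-trained 'teacher' model."""
    return 3.0 * x + 1.0


def distill(prompts, epochs=200, lr=0.05):
    """Train a small 'student' (y = w*x + b) using only the teacher's outputs."""
    # Step 1: query the teacher and record its answers as training targets.
    dataset = [(x, teacher(x)) for x in prompts]

    # Step 2: fit the student to those answers with plain gradient descent.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in dataset:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


random.seed(0)
prompts = [random.uniform(-1.0, 1.0) for _ in range(50)]
w, b = distill(prompts)
print(f"student learned w={w:.3f}, b={b:.3f}")
```

The student ends up imitating the teacher without access to whatever data the teacher was originally built from, which is where the cost savings come from.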

OpenAI's ironic accusation

OpenAI, the maker of ChatGPT, has accused DeepSeek of using OpenAI's models to train competing models. (Image credit: Getty Images | CFOTO)

The irony in all of this is that OpenAI has itself been accused several times of using data without permission to train AI models. The New York Times sued OpenAI in December 2023, arguing that using its articles to train generative models does not fall under fair use. Lawsuits from The Intercept, Raw Story, and AlterNet followed in February 2024.

The allegations that OpenAI used data without permission to train its own models do not excuse DeepSeek's behavior. OpenAI has clear terms that prohibit using its models to create technology that competes with OpenAI. That being said, the accusations from OpenAI are quite ironic.

Sean Endicott
News Writer and apps editor

Sean Endicott is a tech journalist at Windows Central, specializing in Windows, Microsoft software, AI, and PCs. He's covered major launches, from Windows 10 and 11 to the rise of AI tools like ChatGPT. Sean's journey began with the Lumia 930, leading to strong ties with app developers. Outside writing, he coaches American football, utilizing Microsoft services to manage his team. He studied broadcast journalism at Nottingham Trent University and is active on X @SeanEndicott_ and Threads @sean_endicott_.