OpenAI Accidentally Deletes ChatGPT Training Data Amid Publisher Copyright Claims, Sparking Concerns Over Evidence Retention In Legal Cases

By Ali Salman

OpenAI is embroiled in a dispute with the press: The New York Times and the Daily News have sued the AI giant and its investors, claiming that ChatGPT was trained on their copyrighted content. Search data that the publishers' lawyers compiled while examining OpenAI's training sets was deleted by OpenAI engineers, supposedly by accident. The deletion may have wiped out evidence that The New York Times' lawyers had gathered against OpenAI.

OpenAI is advancing rapidly in developing AI for businesses but faces obstacles to achieving a major breakthrough, while Apple's cautious approach is keeping Apple Intelligence steady. Tech giants have not been shy about using copyrighted material to train their AI models. We have previously covered how AI companies used not only text but also YouTube videos, including MKBHD's, as training data.

OpenAI previously agreed to open its AI platform to The New York Times and the Daily News so that they could search its AI training sets for their own copyrighted material. Since early November, the publishers' experts have spent a considerable amount of time searching the data that OpenAI used to train ChatGPT. That work could have supported the publishers' claims, but OpenAI engineers erased the search data the experts had compiled, reportedly by accident.

Kyle Wiggers from TechCrunch states:

Earlier this fall, OpenAI agreed to provide two virtual machines so that counsel for The Times and Daily News could perform searches for their copyrighted content in its AI training sets...In a letter, attorneys for the publishers say that they and experts they hired have spent over 150 hours since November 1 searching OpenAI's training data.

But on November 14, OpenAI engineers erased all the publishers' search data stored on one of the virtual machines, according to the aforementioned letter, which was filed in the U.S. District Court for the Southern District of New York late Wednesday.

To put it simply, OpenAI is accused of deleting the research conducted by the publishers' experts, which could have served as evidence. You can check out the letter published online for more details. OpenAI was able to recover the deleted data, but in a format that cannot be used legally, making it unsuitable as evidence in the copyright case. It remains to be seen how the publishers will respond to the mishap and whether any additional measures will allow them to proceed with their claims.

The outcome could also shape how legal teams pursue copyright claims against OpenAI and possibly other tech giants. We will keep you posted on the latest updates on the story, so be sure to stick around.
