DRULES AI

Atlantic Exposes 21M Tracks Training Suno, Udio & Google AI

The Atlantic dropped a bombshell this week with four giant, searchable datasets proving AI music platforms trained on over 21 million unlicensed tracks. The revelation has musicians from Logic to choral singers discovering their work inside the training data, sparking widespread outrage and "I want in" posts about the inevitable lawsuits.

🌐 Scale of the Scraping Operation

One dataset alone contains 12 million tracks—equivalent to 91 years of continuous listening. Others hold 9 million and over 100,000 tracks each, spanning Bad Bunny, Beatles, Miles Davis, Billie Eilish, Pearl Jam, and tens of thousands of independent artists. Google trained models on 44 million tracks from the Free Music Archive. Suno claimed it used "essentially all music files of reasonable quality" downloadable from the internet. OpenAI scraped 1.2 million songs for its Jukebox model.

These aren't hypothetical scrapes. The datasets were circulating in AI developer communities, often violating platform terms that prohibit commercial use. While companies lean on fair use defenses, the concrete evidence of specific artists and tracks makes the legal exposure real.[[1]](https://www.theatlantic.com/technology/2026/06/ai-music-generators-suno-google-udio/687485/)

😡 Artist Backlash and Existing Suits

Ed Newton-Rex, CEO of Fairly Trained, highlighted his own choir recordings appearing in the data. Logic fans noted hundreds of his songs, freestyles, and videos were included. Multiple X users demanded inclusion in ongoing litigation, with one declaring "Suno a.i. is using my music to generate bullshit a.i. songs. I want in on the inevitable lawsuit."

Record labels have already sued Suno and Udio, alleging mass copyright infringement. Google, Stability AI, and OpenAI face similar scrutiny. Settlements have occurred, but the new transparency tool lets any artist check their catalog, potentially triggering a wave of additional claims. Benn Jordan and other musicians are now "poisoning" their tracks to fight future scraping.

🏛️ Industry Reckoning Ahead

AI platforms rarely label generated tracks as such on streaming services, letting synthetic music compete directly with human artists while eroding their income. The Atlantic investigation underscores how secrecy around training data has hidden the full scope—until now. With searchable proof in the open, the fair use argument looks increasingly untenable.

Google referred queries back to terms of service. Suno says it has safeguards against unauthorized use. Yet the datasets tell a different story: wholesale ingestion of commercial music without consent or compensation.

Bottom line: This database doesn't just confirm suspicions—it arms artists and labels with specific evidence, likely accelerating lawsuits and forcing AI music companies toward licensing deals or stricter ethical standards.