The Atlantic has made four AI music training datasets publicly searchable, exposing the specific tracks used to build models from companies including Google and Stability AI. The two largest datasets contain 12 million and 9 million tracks, respectively; the two smaller sets each exceed 100,000 songs.

Reporter Alex Reisner confirmed that Google and Stability AI both cited these datasets in published research papers. The datasets have been downloaded thousands of times. Some sources, like the Free Music Archive, permit personal streaming but restrict redistribution, which raises direct copyright questions about their use as training data.

The database is live and searchable now. What makes the full piece worth reading is the legal detail: Reisner maps which datasets were used by which companies and what the actual licensing terms were at the time of use. That paper trail is the story.
