Article · Jan 1, 2011 · Green OA

The Million Song Dataset

Columbia University · Oracle (United States)

Indexed in DataCite

Abstract

We introduce the Million Song Dataset, a freely available collection of audio features and metadata for a million contemporary popular music tracks. We describe its creation process, its content, and its possible uses. Attractive features of the Million Song Dataset include the range of existing resources to which it is linked and the fact that it is the largest current research dataset in our field. As an illustration, we present year prediction as an example application, a task that has, until now, been difficult to study owing to the absence of a large set of suitable data. We show positive results on year prediction and discuss more generally the future development of the dataset.

Citation impact

Total citations: 831
FWCI: 58.96
Percentile: 100%
References: 7

Authors

4

Topics & keywords

Keywords
  • Metadata
  • Computer science
  • Field (mathematics)
  • Task (project management)
  • Set (abstract data type)
  • Range (aeronautics)
  • Information retrieval
  • Process (computing)