Welcome to the companion website for MedleyDB, a dataset of annotated, royalty-free multitrack recordings. MedleyDB was curated primarily to support research on melody extraction, addressing important shortcomings of existing collections. For each song we provide melody f0 annotations as well as instrument activations for evaluating automatic instrument recognition. The dataset is also useful for research on tasks that require access to the individual tracks of a song, such as source separation and automatic mixing.
If you make use of MedleyDB for academic purposes, please cite the following publication:
R. Bittner, J. Salamon, M. Tierney, M. Mauch, C. Cannam and J. P. Bello, "MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research", in 15th International Society for Music Information Retrieval Conference, Taipei, Taiwan, Oct. 2014.
MedleyDB is also indexed in The Open Multitrack Testbed, an online repository of multitrack audio with search functionality.
If you make use of MedleyDB 2.0, please cite the following publication:
Bittner, R., Wilkins, J., Yip, H., & Bello, J. (2016). MedleyDB 2.0: New Data and a System for Sustainable Data Collection. New York, NY, USA: International Conference on Music Information Retrieval (ISMIR-16).
Introducing MedleyDB 2.0!
We are happy to announce the release of MedleyDB 2.0, a second iteration of this project which includes more multitrack recordings, a new system for sustainable data collection, and an automatic error-checking application.
In addition to adding 74 multitracks to the dataset, this release introduces MedleyDB Manager, which provides a collaborative ticketing system to ensure that a multitrack makes it from the recording studio to our dataset without getting lost in complicated communications between artists, engineers, and us (the dataset managers). We also now provide MedleyDeBugger, an application that automatically checks uploaded raw files, stem files, and the final mixed file for various errors, including silent tracks, misalignment, and problems with inclusion of the raw files in the final mix. We hope that these new tools will increase the sustainability and usability of MedleyDB.
Dataset Download
Both MedleyDB and MedleyDB 2.0 now have up-to-date and monitored Zenodo repositories, where the full datasets are available for download with a quick permission request.
Creators

This project was led by Rachel Bittner at NYU's Music and Audio Research Lab, along with Justin Salamon, Mike Tierney and Juan Pablo Bello. Annotations were created in collaboration with Matthias Mauch and Chris Cannam at the Center for Digital Music at Queen Mary University. For a detailed list of contributors, please refer to the Acknowledgments page.
MedleyDB 2.0 includes contributions from Julia Wilkins (Northwestern University/Sonos) and Hanna Yip (Stanford University).
Dataset Snapshot
- Size: 122 Multitracks (mix + processed stems + raw audio + metadata)
- Now in MedleyDB 2.0: 74 new tracks -> 196 total!
- Annotations: Melody f0 (108 tracks), Instrument Activations (122 tracks), Genre (122 tracks)
- Audio Format: WAV (44.1 kHz, 16 bit)
- Genres: Singer/Songwriter, Classical, Rock, World/Folk, Fusion, Jazz, Pop, Musical Theatre, Rap
- Track Length: 105 full-length tracks (~3 to 5 minutes long), 17 excerpts (7:17 hours total) + 74 new tracks of various lengths
- Instrumentation: 52 instrumental tracks, 70 tracks containing vocals
Feedback
Please help us improve MedleyDB by sharing your feedback.
For specific questions, or to start correspondence with a human, please contact Rachel Bittner at rachel (dot) bittner (at) nyu (dot) edu.