FMA: A Dataset For Music Analysis

Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson.
International Society for Music Information Retrieval Conference (ISMIR), 2017.

We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community's growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma.

Paper: arXiv:1612.01840 (latex and reviews)
Slides: doi:10.5281/zenodo.1066119
Poster: doi:10.5281/zenodo.1035847

Data

All metadata and features for all tracks are distributed in fma_metadata.zip (342 MiB). The tables below can be used with pandas or any other data analysis tool. See the paper or the usage.ipynb notebook for a description.

tracks.csv: per-track metadata such as ID, title, artist, genres, tags, and play counts, for all 106,574 tracks.
genres.csv: all 163 genres with name and parent (used to infer the genre hierarchy and top-level genres).
features.csv: common features extracted with librosa.
echonest.csv: audio features provided by Echonest (now Spotify) for a subset of 13,129 tracks.
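
As a minimal sketch of working with these tables in pandas, the snippet below reads a tracks.csv-style file and walks the parent column of a genres.csv-style file to find a top-level genre. It rests on several assumptions not stated above: that tracks.csv has two header rows with track IDs in the first column, that column names such as ("set", "split") exist, and that a parent of 0 marks a top-level genre. The in-memory rows are made-up stand-ins, not real data; check usage.ipynb for the actual layout.

```python
import io

import pandas as pd

# Made-up stand-ins for tracks.csv and genres.csv (illustrative rows only).
# The two-row header and the column names are assumptions about the files.
TRACKS_CSV = """\
,set,set,track,track
,split,subset,genre_top,title
track_id,,,,
2,training,small,Genre A,Track A
5,training,small,Genre A,Track B
10,validation,small,Genre B,Track C
"""

GENRES_CSV = """\
genre_id,parent,title
1,0,Root
2,1,Child
3,2,Grandchild
"""


def load_tracks(source):
    """Read a tracks.csv-style table: two header rows, track IDs as index."""
    return pd.read_csv(source, index_col=0, header=[0, 1])


def load_genres(source):
    """Read a genres.csv-style table indexed by genre ID."""
    return pd.read_csv(source, index_col=0)


def top_level(genres, genre_id):
    """Follow parent links until a genre whose parent is 0 (assumed root marker)."""
    while genres.loc[genre_id, "parent"] != 0:
        genre_id = genres.loc[genre_id, "parent"]
    return int(genre_id)


tracks = load_tracks(io.StringIO(TRACKS_CSV))
genres = load_genres(io.StringIO(GENRES_CSV))

# Columns are addressed by (first header row, second header row) tuples.
training = tracks[tracks[("set", "split")] == "training"]
print(training[("track", "title")].tolist())
print(top_level(genres, 3))
```

With the real files, the same functions would be called with paths such as data/fma_metadata/tracks.csv instead of the in-memory buffers.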

The MP3-encoded audio data is available in several sizes:

fma_small.zip: 8,000 tracks of 30s, 8 balanced genres (GTZAN-like) (7.2 GiB)
fma_medium.zip: 25,000 tracks of 30s, 16 unbalanced genres (22 GiB)
fma_large.zip: 106,574 tracks of 30s, 161 unbalanced genres (93 GiB)
fma_full.zip: 106,574 untrimmed tracks, 161 unbalanced genres (879 GiB)

See the wiki (or #41) for known issues (errata).

Code

The following notebooks, scripts, and modules have been developed for the dataset.

usage.ipynb: shows how to load the dataset and how to develop, train, and test your own models with it.
analysis.ipynb: exploration of the metadata, data, and features. Creates the figures used in the paper.
baselines.ipynb: baseline models for genre recognition, both from audio and features.
features.py: feature extraction from the audio (used to create features.csv).
webapi.ipynb: query the web API of the FMA. Can be used to update the dataset.
creation.ipynb: creation of the dataset (used to create tracks.csv and genres.csv).
creation.py: creation of the dataset (long-running data collection and processing).
utils.py: helper functions and classes.

Usage

Click the binder badge to play with the code and data from your browser without installing anything.

Clone the repository.

git clone https://github.com/mdeff/fma.git
cd fma

Create a Python 3.6 environment.

# with https://conda.io
conda create -n fma python=3.6
conda activate fma

# with https://github.com/pyenv/pyenv
pyenv install 3.6.0
pyenv virtualenv 3.6.0 fma
pyenv activate fma

# with https://pipenv.pypa.io
pipenv --python 3.6
pipenv shell

# with https://docs.python.org/3/tutorial/venv.html
python3.6 -m venv ./env
source ./env/bin/activate

Install dependencies.
```
pip install --upgrade pip setuptools wheel
pip install numpy==1.12.1  # workaround resampy's bogus setup.py
pip install -r requirements.txt
```
Note: you may need to install ffmpeg or graphviz depending on your usage.
Note: install CUDA to train neural networks on GPUs (see TensorFlow's instructions).

Download some data, verify its integrity, and uncompress the archives.

cd data

curl -O https://os.unil.cloud.switch.ch/fma/fma_metadata.zip
curl -O https://os.unil.cloud.switch.ch/fma/fma_small.zip
curl -O https://os.unil.cloud.switch.ch/fma/fma_medium.zip
curl -O https://os.unil.cloud.switch.ch/fma/fma_large.zip
curl -O https://os.unil.cloud.switch.ch/fma/fma_full.zip

echo "f0df49ffe5f2a6008d7dc83c6915b31835dfe733  fma_metadata.zip" | sha1sum -c -
echo "ade154f733639d52e35e32f5593efe5be76c6d70  fma_small.zip"    | sha1sum -c -
echo "c67b69ea232021025fca9231fc1c7c1a063ab50b  fma_medium.zip"   | sha1sum -c -
echo "497109f4dd721066b5ce5e5f250ec604dc78939e  fma_large.zip"    | sha1sum -c -
echo "0f0ace23fbe9ba30ecb7e95f763e435ea802b8ab  fma_full.zip"     | sha1sum -c -

unzip fma_metadata.zip
unzip fma_small.zip
unzip fma_medium.zip
unzip fma_large.zip
unzip fma_full.zip

cd ..

Note: if decompression fails, try 7zip. The archive may use a compression method that your unzip does not support.
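
Where sha1sum is not available, the integrity check can be approximated in Python. This is a sketch using only the standard library; the helper names sha1_of_file and verify are hypothetical and not part of the repository. It reads the file in chunks so that large archives do not need to fit in memory.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_sha1):
    """Compare a file's SHA-1 digest with the expected hex string."""
    return sha1_of_file(path) == expected_sha1.strip().lower()


# Demonstration on a throwaway file rather than a real archive.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    payload = b"not a real archive"
    demo.write_bytes(payload)
    ok = verify(demo, hashlib.sha1(payload).hexdigest())
    bad = verify(demo, "0" * 40)

print(ok, bad)
```

For a downloaded archive, the call would look like verify("data/fma_small.zip", "ade154f733639d52e35e32f5593efe5be76c6d70"), using the checksum listed above.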

Create a .env configuration file at the repository's root with the following content.

AUDIO_DIR=./data/fma_small/  # the path to a decompressed fma_*.zip
FMA_KEY=MYKEY  # only if you want to query the freemusicarchive.org API
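
To illustrate how AUDIO_DIR might be used, the sketch below builds the path to a track's MP3 file. It assumes that the archives unpack into folders named after the first three digits of a zero-padded six-digit track ID, which is the layout the repository's utils.py appears to expect; verify this against your decompressed data. Here the variable is read from the process environment with a fallback, rather than parsed from the .env file.

```python
import os


def get_audio_path(audio_dir, track_id):
    """Build the MP3 path for a track ID under the assumed folder layout.

    Assumed layout: <audio_dir>/<first 3 digits>/<6-digit ID>.mp3
    """
    tid_str = f"{track_id:06d}"
    return os.path.join(audio_dir, tid_str[:3], tid_str + ".mp3")


# Fall back to the example value from the .env file if the variable is unset.
audio_dir = os.environ.get("AUDIO_DIR", "./data/fma_small/")
print(get_audio_path(audio_dir, 2))
```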

Open Jupyter or run a notebook.
```
jupyter notebook
make usage.ipynb
```

Impact, coverage, and resources

100+ research papers

Full list on Google Scholar.

2 derived works

~10 posts

Music Genre Classification With TensorFlow, Towards Data Science, 2020-08-11.
Music Genre Classification: Transformers vs Recurrent Neural Networks, Towards Data Science, 2020-06-14.
Using CNNs and RNNs for Music Genre Recognition, Towards Data Science, 2018-12-13.
Over 1.5 TB’s of Labeled Audio Datasets, Towards Data Science, 2018-11-13.
Discovering Descriptive Music Genres Using K-Means Clustering, Medium, 2018-04-09.
25 Open Datasets for Deep Learning Every Data Scientist Must Work With, Analytics Vidhya, 2018-03-29.
Learning Music Genres, Medium, 2017-12-13.
music2vec: Generating Vector Embeddings for Genre-Classification Task, Medium, 2017-11-28.
A Music Information Retrieval Dataset, Made With FMA, freemusicarchive.org, 2017-05-22.
Pre-publication release announced, twitter.com, 2017-05-09.
FMA: A Dataset For Music Analysis, tensorflow.blog, 2017-03-14.
Beta release discussed, twitter.com, 2017-02-08.
FMA Data Set for Researchers Released, freemusicarchive.org, 2016-12-15.

5 events

Summer Workshop by the Haverford Digital Scholarship Library, 2020-07.
Genre recognition challenge at the Web Conference, Lyon, 2018-04.
Slides presented at the Data Jam days, Lausanne, 2017-11-24.
Poster presented at ISMIR 2017, Suzhou, 2017-10-24.
Slides for the Open Science in Practice summer school at EPFL, 2017-09-29.

~10 dataset lists

Contributing

Contribute by opening an issue or a pull request. Let this repository be a hub around the dataset!

History

2017-05-09 pre-publication release

paper: arXiv:1612.01840v2
code: git tag rc1
fma_metadata.zip sha1: f0df49ffe5f2a6008d7dc83c6915b31835dfe733
fma_small.zip sha1: ade154f733639d52e35e32f5593efe5be76c6d70
fma_medium.zip sha1: c67b69ea232021025fca9231fc1c7c1a063ab50b
fma_large.zip sha1: 497109f4dd721066b5ce5e5f250ec604dc78939e
fma_full.zip sha1: 0f0ace23fbe9ba30ecb7e95f763e435ea802b8ab
known issues: see #41

2016-12-06 beta release

paper: arXiv:1612.01840v1
code: git tag beta
fma_small.zip sha1: e731a5d56a5625f7b7f770923ee32922374e2cbf
fma_medium.zip sha1: fe23d6f2a400821ed1271ded6bcd530b7a8ea551

Acknowledgments and Licenses

We are grateful to the Swiss Data Science Center (EPFL and ETHZ) for hosting the dataset.

Please cite our work if you use our code or data.

@inproceedings{fma_dataset,
  title = {{FMA}: A Dataset for Music Analysis},
  author = {Defferrard, Micha\"el and Benzi, Kirell and Vandergheynst, Pierre and Bresson, Xavier},
  booktitle = {18th International Society for Music Information Retrieval Conference (ISMIR)},
  year = {2017},
  archiveprefix = {arXiv},
  eprint = {1612.01840},
  url = {https://arxiv.org/abs/1612.01840},
}

@inproceedings{fma_challenge,
  title = {Learning to Recognize Musical Genre from Audio},
  subtitle = {Challenge Overview},
  author = {Defferrard, Micha\"el and Mohanty, Sharada P. and Carroll, Sean F. and Salath\'e, Marcel},
  booktitle = {The 2018 Web Conference Companion},
  year = {2018},
  publisher = {ACM Press},
  isbn = {9781450356404},
  doi = {10.1145/3184558.3192310},
  archiveprefix = {arXiv},
  eprint = {1803.05337},
  url = {https://arxiv.org/abs/1803.05337},
}

The code in this repository is released under the MIT license.
The metadata is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
We do not hold the copyright on the audio and distribute it under the license chosen by the artist.
The dataset is meant for research purposes.
