musicnet
Musical scores comprise a dense set of labels on performances of classical western music. These labelings are analogous to semantic segmentations of visual imagery. However, in order to use a score as a segmentation map, we must first align it to particular audio recording by warping precise timings in the score onto expressive timings chosen by the performers. This is the music alignment problem. Using an alignment algorithm, we created the MusicNet dataset by aligning scores to a collection of freely-licensed recordings. We can use these labels to, for example, train an automatic music transcription system. In work appearing at ICASSP 2018, we describe a state-of-the-art transcription model trained using MusicNet that to-date (2019) is the best-performing algorithm in the MIREX transcription challenge.