Papers
arxiv:2412.06703

Source Separation & Automatic Transcription for Music

Published on Dec 9, 2024
Authors:
,
,
,
,

Abstract

Deep learning-based approach enables end-to-end music source separation, MIDI conversion, and sheet music transcription from audio mixtures.

AI-generated summary

Source separation is the process of isolating individual sounds in an auditory mixture of multiple sounds [1], and has a variety of applications ranging from speech enhancement and lyric transcription [2] to digital audio production for music. Furthermore, Automatic Music Transcription (AMT) is the process of converting raw music audio into sheet music that musicians can read [3]. Historically, these tasks have faced challenges such as significant audio noise, long training times, and lack of free-use data due to copyright restrictions. However, recent developments in deep learning have brought new promising approaches to building low-distortion stems and generating sheet music from audio signals [4]. Using spectrogram masking, deep neural networks, and the MuseScore API, we attempt to create an end-to-end pipeline that allows for an initial music audio mixture (e.g...wav file) to be separated into instrument stems, converted into MIDI files, and transcribed into sheet music for each component instrument.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2412.06703 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2412.06703 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2412.06703 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.