Isolated guitar transcription using a deep belief network

2017 ◽  
Vol 3 ◽  
pp. e109 ◽  
Author(s):  
Gregory Burlet ◽  
Abram Hindle

Music transcription involves the transformation of an audio recording to common music notation, colloquially referred to as sheet music. Manually transcribing audio recordings is a difficult and time-consuming process, even for experienced musicians. In response, several algorithms have been proposed to automatically analyze and transcribe the notes sounding in an audio recording; however, these algorithms are often general-purpose, attempting to process any number of instruments producing any number of notes sounding simultaneously. This paper presents a polyphonic transcription algorithm that is constrained to processing the audio output of a single instrument, specifically an acoustic guitar. The transcription system consists of a novel note pitch estimation algorithm that uses a deep belief network and multi-label learning techniques to generate multiple pitch estimates for each analysis frame of the input audio signal. Using a compiled dataset of synthesized guitar recordings for evaluation, the algorithm described in this work results in an 11% increase in the F-measure of note transcriptions relative to Zhou et al.'s (2009) transcription algorithm. This paper demonstrates the effectiveness of deep, multi-label learning for the task of polyphonic transcription.
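The frame-level decoding step described in the abstract (sigmoid-style multi-label outputs turned into a set of sounding pitches, capped by the instrument's polyphony) can be sketched as follows. The pitch count, threshold value, and function names are illustrative assumptions, not details taken from the paper; the trained deep belief network that would produce the activations is omitted.

```python
import numpy as np

# Hypothetical illustration of multi-label pitch decoding: a network emits one
# activation in [0, 1] per candidate pitch for each analysis frame; pitches
# whose activation exceeds a threshold are reported as sounding. All constants
# below are assumptions for illustration, not values from the paper.

NUM_PITCHES = 51   # assumed guitar pitch range in semitones
THRESHOLD = 0.5    # assumed sigmoid decision threshold

def decode_frame(activations, max_polyphony=6, threshold=THRESHOLD):
    """Return indices of pitches estimated to sound in one analysis frame,
    capped at the instrument's polyphony (six strings for a guitar)."""
    candidates = np.flatnonzero(activations > threshold)
    if len(candidates) > max_polyphony:
        # keep only the max_polyphony most confident estimates
        order = np.argsort(activations[candidates])[::-1]
        candidates = candidates[order[:max_polyphony]]
    return np.sort(candidates)

# Demo on a random activation frame
rng = np.random.default_rng(0)
print(decode_frame(rng.random(NUM_PITCHES)))
```

The polyphony cap encodes the paper's key constraint: a six-string guitar can never sound more than six notes at once, so low-confidence estimates beyond that bound are discarded.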

2015 ◽  
Author(s):  
Gregory Burlet ◽  
Abram Hindle

Automatic music transcription is a difficult task that has provoked extensive research on transcription systems that are predominantly general purpose, processing any number or type of instruments sounding simultaneously. This paper presents a polyphonic transcription system that is constrained to processing the output of a single instrument with an upper bound on polyphony. For example, a guitar has six strings and is limited to producing six notes simultaneously. The transcription system consists of a novel pitch estimation algorithm that uses a deep belief network and multi-label learning techniques to generate multiple pitch estimates for each audio analysis frame, such that the polyphony does not exceed that of the instrument. The implemented transcription system is evaluated on a compiled dataset of synthesized guitar recordings. Comparing these results to a prior single-instrument polyphonic transcription system that achieved exceptional results, this paper demonstrates the effectiveness of deep, multi-label learning for the task of polyphonic transcription.



Author(s):  
Dorian Cazau ◽  
Marc Chemillier ◽  
Olivier Adam

This chapter presents an original approach to the development of an automatic music transcription system for a traditional Malagasy plucked string instrument, the marovany zither. Our approach is based on a multichannel sensor capture technology, which breaks a complex polyphonic audio signal down into a sum of monophonic sensor signals. A very high transcription precision is obtained, i.e., >95% on the average note-based F-measure metric. The second part of this chapter uses these transcripts in the human-machine improvisation system ImproteK. Details of an exploratory working session with a local Malagasy musician are reported and discussed.
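The note-based F-measure cited above can be sketched as follows. The matching rule (same pitch, onsets within a small tolerance) and the tolerance value are common conventions assumed here for illustration; the chapter does not specify them.

```python
# A minimal sketch of a note-based F-measure for scoring transcriptions:
# an estimated note counts as a true positive if an unmatched reference note
# has the same pitch and an onset within onset_tol seconds. The 50 ms
# tolerance is an assumption, not a value from the chapter.

def note_f_measure(reference, estimated, onset_tol=0.05):
    """reference/estimated: lists of (onset_seconds, midi_pitch) tuples."""
    matched = set()
    true_pos = 0
    for onset, pitch in estimated:
        for i, (ref_onset, ref_pitch) in enumerate(reference):
            if (i not in matched and ref_pitch == pitch
                    and abs(ref_onset - onset) <= onset_tol):
                matched.add(i)
                true_pos += 1
                break
    precision = true_pos / len(estimated) if estimated else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

An F-measure above 0.95 under this kind of metric means nearly all reference notes are recovered with few spurious detections, since F is the harmonic mean of precision and recall.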


2019 ◽  
Vol 28 (5) ◽  
pp. 925-932
Author(s):  
Hua WEI ◽  
Chun SHAN ◽  
Changzhen HU ◽  
Yu ZHANG ◽  
Xiao YU
