A Novel Singing Voice Separation Method Based on Sparse Non-Negative Matrix Factorization and Low-Rank Modeling

Mavaddati, S.

doi:10.22068/IJEEE.15.2.161

Volume 15, Issue 2 (June 2019) IJEEE 2019, 15(2): 161-171

DOR: 20.1001.1.17352827.2019.15.2.3.6


Mavaddati S. A Novel Singing Voice Separation Method Based on Sparse Non-Negative Matrix Factorization and Low-Rank Modeling. IJEEE 2019; 15(2): 161-171
URL: http://ijeee.iust.ac.ir/article-1-1240-en.html


Abstract:

This paper presents a new single-channel singing voice separation algorithm. Singing voice separation supports applications such as singer identification, voice recognition, and data retrieval. The separation uses a decomposition model built on the spectrogram of the singing voice signal. The proposed algorithm is novel in four respects: 1) The decomposition scheme employs vocal and music models learned with a sparse non-negative matrix factorization algorithm; the vocal signal and the music accompaniment are treated as the sparse and low-rank components of a singing voice segment, respectively. 2) An alternating factorization algorithm decomposes the input data according to the modeled structures of the vocal and musical components. 3) In the training step, a voice activity detection algorithm based on the energy of the coding coefficient matrix is introduced to learn the basis vectors related to the instrumental parts. 4) In the separation phase, these non-vocal atoms are adapted to the new test conditions using a domain transfer approach, yielding a separation with low reconstruction error. The proposed algorithm is evaluated with several measures and yields significantly better results than earlier methods in this context and than traditional procedures. Averaged over the two defined test scenarios and the three SMR levels considered, the improvements of the proposed algorithm over previous separation methods in PESQ, fwSegSNR, SDI, and GNSDR are 0.53, 0.84, 0.39, and 2.19, respectively.
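To make two of the ingredients named in the abstract more concrete, the following is a minimal sketch, not the author's implementation. It assumes a generic sparse NMF with multiplicative updates and an L1 penalty on the coefficients, and a simple energy threshold on the coding coefficient matrix as a stand-in for the voice activity detector. The function names `sparse_nmf` and `frame_activity`, the `sparsity` weight, and the `rel_threshold` rule are all hypothetical choices for illustration; the paper's actual update rules, low-rank modeling, and domain transfer step are not reproduced here.

```python
import numpy as np


def sparse_nmf(V, rank, sparsity=0.1, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative spectrogram V (freq x frames) as W @ H.

    Generic sketch: squared-error NMF with an L1 penalty on H, solved with
    multiplicative updates. Not the update rule used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # dictionary atoms (basis vectors)
    H = rng.random((rank, n_frames)) + eps  # coding coefficients

    for _ in range(n_iter):
        # The sparsity weight in the denominator shrinks small activations.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Keep atoms at unit L2 norm and move the scale into H,
        # which leaves the product W @ H unchanged.
        norms = np.linalg.norm(W, axis=0, keepdims=True) + eps
        W /= norms
        H *= norms.T
    return W, H


def frame_activity(H, rel_threshold=0.5):
    """Flag frames whose coefficient energy exceeds a fraction of the mean.

    Hypothetical energy-based detector: the threshold rule is an assumption.
    """
    energy = np.sum(H ** 2, axis=0)  # one energy value per frame
    return energy > rel_threshold * energy.mean()


# Synthetic example: a random non-negative "spectrogram" of rank 4.
demo_rng = np.random.default_rng(1)
V_demo = demo_rng.random((20, 4)) @ demo_rng.random((4, 30))
W_demo, H_demo = sparse_nmf(V_demo, rank=4, sparsity=0.05, n_iter=100)
active_demo = frame_activity(H_demo)
print("relative error:",
      np.linalg.norm(V_demo - W_demo @ H_demo) / np.linalg.norm(V_demo))
print("active frames:", int(active_demo.sum()), "of", active_demo.size)
```

In a pipeline of the kind the abstract describes, frames flagged as inactive when coded over a vocal dictionary could be used to train the instrumental atoms, but how the paper selects and uses those frames is not specified in the abstract.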

Keywords: Singing Voice Separation, Dictionary Learning, Incoherence, Sparse Coding, Voice Activity Detector


Type of Study: Research Paper | Subject: Signal Processing
Received: 2018/02/28 | Revised: 2019/04/07 | Accepted: 2018/08/24

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2022 by the authors. Licensee IUST, Tehran, Iran. This is an open access journal distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

Iranian Journal of Electrical and Electronic Engineering

Iran University of Science and Technology
