Showing 2 results for Mavaddati

S. Mavaddati,
Volume 15, Issue 2 (June 2019)
Abstract

A new single-channel singing voice separation algorithm is presented in this paper. This field of signal processing provides important capabilities in areas such as singer identification, voice recognition, and data retrieval. The separation procedure relies on a decomposition model applied to the spectrogram of singing voice signals. The novelty of the proposed separation algorithm lies in the following points: 1) The decomposition scheme employs vocal and music models learned with a sparse non-negative matrix factorization algorithm; the vocal signal and the music accompaniment can be regarded as the sparse and low-rank components of a singing voice segment, respectively. 2) An alternating factorization algorithm decomposes the input data according to the modeled structures of the vocal and musical components. 3) A voice activity detection algorithm, based on the energy of the coding coefficient matrix, is introduced in the training step to learn the basis vectors related to the instrumental parts. 4) In the separation phase, these non-vocal atoms are adapted to the new test conditions using a domain transfer approach, yielding a proper separation with low reconstruction error. The performance of the proposed algorithm is evaluated with different measures and leads to significantly better results than earlier methods in this context and the traditional procedures. Averaged over the two defined test scenarios and the three mentioned SMR levels, the proposed algorithm improves the PESQ, fwSegSNR, SDI, and GNSDR measures by 0.53, 0.84, 0.39, and 2.19, respectively, compared with previous separation methods.
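
A minimal sketch of the general idea behind point 1, not the paper's exact algorithm: the mixture spectrogram is factorized with a fixed, pre-learned non-negative music dictionary plus a small set of free atoms that absorb the vocal part, and soft masks built from the two sub-models recover the voice and the accompaniment. The function name, the W_music dictionary, and the sparsity and iteration settings are illustrative assumptions; the sketch uses librosa and NumPy.

# Illustrative sketch (not the paper's exact method): separate a mono mixture
# using a pre-trained non-negative music dictionary and sparse activations.
import numpy as np
import librosa

def separate_vocals(mix, sr, W_music, n_vocal=30, n_iter=100, sparsity=0.1, eps=1e-9):
    """Estimate vocal and accompaniment signals from a mono mixture.

    W_music : (n_freq, n_music) non-negative basis learned offline from
              instrumental training data (kept fixed here).
    n_vocal : number of free basis vectors that absorb the vocal part.
    sparsity: L1 penalty weight on the activation matrix H (assumed value).
    """
    S = librosa.stft(mix)                       # complex spectrogram
    V = np.abs(S)                               # magnitude to be factorized
    n_freq, n_frames = V.shape

    # Stack the fixed music atoms with randomly initialized vocal atoms.
    W_vocal = np.abs(np.random.rand(n_freq, n_vocal))
    W = np.hstack([W_music, W_vocal])
    H = np.abs(np.random.rand(W.shape[1], n_frames))

    for _ in range(n_iter):
        # Multiplicative update for V ~ W H with an L1 sparsity term on H.
        H *= (W.T @ V) / (W.T @ (W @ H) + sparsity + eps)
        # Only the vocal atoms are updated; the music dictionary stays fixed.
        WH = W @ H + eps
        W_vocal *= (V @ H[-n_vocal:].T) / (WH @ H[-n_vocal:].T + eps)
        W = np.hstack([W_music, W_vocal])

    # Wiener-style masks from the two sub-models, applied to the complex STFT.
    V_music = W_music @ H[:-n_vocal] + eps
    V_vocal = W_vocal @ H[-n_vocal:] + eps
    mask_vocal = V_vocal / (V_vocal + V_music)
    vocal = librosa.istft(mask_vocal * S, length=len(mix))
    music = librosa.istft((1 - mask_vocal) * S, length=len(mix))
    return vocal, music

The sketch omits the voice activity detection and domain transfer steps described in points 3 and 4; it only illustrates how a fixed instrumental dictionary and free vocal atoms can be combined in one factorization.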

S. Mavaddati,
Volume 15, Issue 3 (September 2019)
Abstract

Blind voice separation refers to retrieving a set of independent sources that have been combined by an unknown mixing system. The proposed separation procedure processes the observed signals without any information about the mixing model or the statistics of the source signals. Moreover, the number of combined sources is usually assumed to be known in advance, since it is difficult to estimate from the mixtures themselves. In this paper, a new algorithm is introduced to resolve these issues by using the empirical mode decomposition technique as a pre-processing step. The proposed method precisely determines the number of mixed voice signals based on energy and kurtosis criteria applied to the captured intrinsic mode functions. The separation procedure then employs a grey wolf optimization algorithm with a new cost function. The experimental results show that the proposed separation algorithm performs prominently better than earlier methods in this context. Moreover, simulation results in the presence of white noise confirm the proper performance of the presented method and the prominent role of the presented cost function, especially when the number of sources is high.
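
A minimal sketch of the source-counting pre-processing step only, not the paper's exact decision rule: the mixture is decomposed with EMD, and intrinsic mode functions whose energy share and kurtosis exceed heuristic thresholds are counted as source-bearing modes. The threshold values and the function name are assumptions for illustration; the sketch assumes the PyEMD and SciPy packages.

# Illustrative sketch (the paper's exact criteria are not reproduced here):
# count the IMFs that look source-bearing according to energy and kurtosis.
import numpy as np
from PyEMD import EMD
from scipy.stats import kurtosis

def estimate_num_sources(mixture, energy_ratio_thr=0.05, kurtosis_thr=3.0):
    """Estimate the number of voice sources in a 1-D mixture signal.

    energy_ratio_thr : minimum fraction of the total IMF energy an IMF must
                       carry to count as a source-bearing mode (assumed value).
    kurtosis_thr     : minimum kurtosis, so that noise-like modes with
                       near-Gaussian statistics are rejected (assumed value).
    """
    imfs = EMD()(mixture)                        # rows are IMFs, finest first
    energies = np.sum(imfs ** 2, axis=1)
    energy_ratios = energies / (np.sum(energies) + 1e-12)
    kurt = kurtosis(imfs, axis=1, fisher=False)  # Pearson kurtosis (Gaussian = 3)

    significant = (energy_ratios > energy_ratio_thr) & (kurt > kurtosis_thr)
    return int(np.count_nonzero(significant))

Kurtosis serves here only as a rough non-Gaussianity indicator; the subsequent grey-wolf-optimization separation stage and its cost function are not sketched, since the abstract does not specify them in enough detail.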



© 2022 by the authors. Licensee IUST, Tehran, Iran. This is an open access journal distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.