H. Ghanei Yakhdan, M. Khademi, J. Chitizadeh,
Volume 5, Issue 1 (3-2009)
Abstract
The performance of video transmission over wireless channels is limited by channel noise. Thus, many error-resilience tools have been incorporated into the MPEG-4 video compression standard. In addition to these tools, the unequal error protection (UEP) technique has been proposed to protect the different parts of an MPEG-4 video packet with different channel coding rates based on rate compatible punctured convolutional (RCPC) codes. However, it is still not powerful enough for noisy channels. To provide more robust MPEG-4 video transmission, this paper proposes a modified unequal error protection technique based on the mutual information of two video frames. In the proposed technique, the channel coding rates are determined dynamically online from the mutual information of two consecutive video frames. With this technique, irregular and high-motion areas, which are more sensitive to errors, receive more protection. Simulation results show that the proposed technique enhances both the subjective visual quality and the average peak signal-to-noise ratio (PSNR), by about 2.5 dB compared with the traditional UEP method.
B. Nasersharif, N. Naderi,
Volume 17, Issue 2 (6-2021)
Abstract
Convolutional Neural Networks (CNNs) have demonstrated their effectiveness in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck features extracted by CBNs contain discriminative and rich contextual information. In this paper, we discuss these bottleneck features from an information-theoretic viewpoint and use them as robust features for noisy speech recognition. In the proposed method, the CBN inputs are the noisy logarithms of Mel filter bank energies (LMFBs) in a number of neighboring frames, and its outputs are the corresponding phone labels. In such a system, we show that the mutual information between the bottleneck layer and the labels is higher than the mutual information between the noisy input features and the labels. Thus, the bottleneck features are a denoised, compressed form of the input features that is more representative for discriminating phone classes. Experimental results on the Aurora2 database show that bottleneck features extracted by CBN outperform some conventional speech features as well as robust features extracted by a CNN.
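The abstract's central information-theoretic claim — that a bottleneck representation retains more mutual information with the phone labels than the noisy input does — can be illustrated with a toy discrete example. This is a self-contained sketch, not the paper's Aurora2 experiment: no CBN is trained here, and the "noisy input" and "bottleneck" sequences are synthetic stand-ins whose corruption rates (40% vs. 10%) are arbitrary assumptions.

```python
import numpy as np

def discrete_mi(x, y):
    """Mutual information (in bits) between two discrete sequences,
    computed from their empirical contingency table."""
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (xi, yi), 1)         # count co-occurrences
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# Toy stand-ins: 5 phone classes; the "noisy input" scrambles 40% of the
# labels, while the "bottleneck" feature scrambles only 10%, mimicking a
# denoised, compressed representation.
rng = np.random.default_rng(0)
labels = rng.integers(0, 5, size=5000)
noisy_input = np.where(rng.random(5000) < 0.4,
                       rng.integers(0, 5, size=5000), labels)
bottleneck = np.where(rng.random(5000) < 0.1,
                      rng.integers(0, 5, size=5000), labels)
```

With these sequences, `discrete_mi(labels, bottleneck)` exceeds `discrete_mi(labels, noisy_input)`, mirroring the relationship the paper reports between the bottleneck layer and the noisy LMFB inputs.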