Multimedia Signal Processing Laboratory

P. Kabal

Paper Abstracts 2006


Y. Ould-Cheikh-Mouhamedou, S. Crozier, and P. Kabal

"Efficient Distance Measurement Method for Turbo Codes that use Structured Interleavers", IEEE Commun. Letters, vol. 10, no. 6, pp. 477-479, June 2006.

This letter presents an efficient and accurate distance measurement method for tail-biting turbo codes that use structured interleavers. This method takes advantage of the structure in the interleaver as well as the circular property of tail-biting. As such, it significantly reduces the computational complexity, which allows the accurate determination of high minimum distance (dmin) in reasonable time. The efficiency of this method is demonstrated by its ability to determine the true dmin of 51 and the corresponding true multiplicities for a rate-1/3 turbo code that uses the UMTS 8-state polynomial generators and an MPEG-sized interleaver (1504 information bits) in reasonable time.

H. Najaf-Zadeh and P. Kabal

"Perceptual Coding of Narrow-Band Audio Signals at Low Rates", IEEE Trans. Audio, Speech, Language Processing, vol. 14, no. 2, pp. 609-622, March 2006.

This paper describes a coding paradigm using coding tools based on the characteristics of the human hearing system so as to accommodate a wide range of narrow-band audio inputs without annoying artifacts at low rates (down to 8 kb/s). The narrow-band perceptual audio coder (NPAC) employs a variety of algorithms to account for the perceptually irrelevant parts of the input signal in addition to statistical redundancies. The new algorithms used in the NPAC coder include a perceptual error measure in training the codebooks and selecting the best codewords which takes into account the audible parts of the quantization noise, a perception-based bit-allocation algorithm and a new predictive scheme to vector quantize the scale factors. The NPAC coder delivers acceptable quality without annoying artifacts for most narrow-band audio signals at around 1 bit/sample. Informal subjective tests have shown that the NPAC coder outperforms a commercial low-rate music coder operating at 8 kb/s.

Conference papers

M. Ghanassi and P. Kabal

"Optimizing Voice-over-IP Speech Quality Using Path Diversity", Proc. IEEE Workshop Multimedia Signal Processing (Victoria, BC), pp. 155-160, Oct. 2006.

In the last few years, voice over Internet protocol (VoIP) has been gaining popularity as an alternative to traditional telephony by transmitting voice signals as packets over the Internet and private IP-based networks. However, voice packets experience loss, delay, and delay variation, which require buffering, playout scheduling and loss concealment at the receiver. In this paper, we give an overview of a VoIP application and show how playout scheduling and loss concealment are jointly used to optimize perceived speech quality. We use this optimization criterion to design a histogram-based playout scheduling algorithm. Then, we identify the limitations of the VoIP application for this scheme and propose an improvement using a path diversity approach that can be implemented via a Service Overlay Network (SON). We present simulation results that show significant improvement of VoIP quality by using this approach.
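
The idea of choosing a playout delay from a delay histogram can be illustrated with a minimal sketch. This is a generic percentile-style rule, not the quality-optimizing algorithm of the paper: it picks the smallest bin-aligned delay that keeps the fraction of late packets under a target. The function name, the 5 ms bin width, and the target late rate are all assumptions made for illustration.

```python
def playout_delay_from_histogram(delays_ms, target_late_rate=0.01, bin_ms=5.0):
    """Return the smallest bin-aligned playout delay (ms) whose late-packet
    fraction does not exceed target_late_rate.

    Hypothetical helper: a generic histogram rule, not the paper's method.
    """
    if not delays_ms:
        raise ValueError("need at least one delay sample")

    # Build the delay histogram: bin index -> packet count.
    counts = {}
    for d in delays_ms:
        b = int(d // bin_ms)
        counts[b] = counts.get(b, 0) + 1

    total = len(delays_ms)
    cumulative = 0
    for b in sorted(counts):
        cumulative += counts[b]
        # Packets in higher bins would arrive after this playout deadline.
        late_fraction = 1.0 - cumulative / total
        if late_fraction <= target_late_rate:
            return (b + 1) * bin_ms  # upper edge of the bin

    # Not reachable for target_late_rate >= 0, kept as a safe fallback.
    return (max(counts) + 1) * bin_ms


# Example: mostly 20 ms delays, some 60 ms, one 200 ms outlier.
samples = [20] * 90 + [60] * 9 + [200]
print(playout_delay_from_histogram(samples, target_late_rate=0.05))
```

A larger target late rate yields a shorter playout delay at the cost of more packets left to loss concealment, which is the trade-off the abstract describes.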

Y. Qian, W.-S. Hsu, and P. Kabal

"Classified Comfort Noise Generation for Efficient Voice Transmission", Proc. Interspeech 2006 (Pittsburgh, PA), pp. 225-228, Sept. 2006.

Comfort noise insertion during speech pauses has been applied to Voice-over-IP and wireless networks for increasing bandwidth efficiency. We present two classified comfort noise generation (CCNG) schemes using Gaussian Mixture classifiers (GMM-C). Our first scheme employs a classified prototype background noise codebook with the prototype noise waveform chosen using a GMM-C. The second scheme utilizes a classified enhanced excitation codebook. The new CCNG algorithms provide better comfort noise during speech pauses and a smaller misclassification rate. We have retrofitted the schemes into existing speech transmission systems, such as ITU-T G.711/Appendix II and G.723.1/Annex A. The perceived quality of a voice conversation with the new system is noticeably enhanced for car and babble noise. For the G.711 system, a large improvement is obtained for car noise, while the largest improvement for the G.723.1 system is for babble noise.

A. H. Nour-Eldin, T. Z. Shabestary, and P. Kabal

"The Effect of Memory Inclusion on Mutual Information Between Speech Bands", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (Toulouse, France), pp. III-53-III-56, May 2006.

In this paper, we investigate the effect of temporal correlation on the dependence between the narrowband and highband frequency ranges of speech, covering 0.3-3.4 kHz and 3.7-8 kHz, respectively. We follow the technique of using Gaussian mixture modelling of spectral envelopes represented by Mel-frequency cepstral coefficients. The correlation between the disjoint speech frequency bands is quantified through mutual information (MI) and its ratio to highband entropy. Speech exhibits considerable temporal correlation that is not explicitly accounted for by static parameterization of spectral envelopes. Including memory in speech parameterization (through delta features) incorporates such temporal information in the modelling; hence, MI gains are to be expected, resulting in bandwidth extension with better performance. Results show that exploiting delta features can increase certainty about the highband (ratio of MI to highband entropy) by as much as 216% in relative terms, corresponding to an absolute increase of 12%.
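
The relative and absolute figures quoted above can be related by simple arithmetic. The sketch below assumes that both numbers describe the same configuration and that the 216% is measured against the static-feature baseline; the abstract does not state the baseline ratio itself, so the values printed are implied by that assumption and are not reported results. The function name is hypothetical.

```python
def implied_ratios(relative_gain, absolute_gain):
    """Given a relative gain (2.16 for 216%) and the matching absolute gain
    (0.12 for 12 percentage points), return (baseline, improved) ratios.

    Assumes absolute_gain == relative_gain * baseline.
    """
    baseline = absolute_gain / relative_gain
    return baseline, baseline + absolute_gain


base, improved = implied_ratios(2.16, 0.12)
print(f"implied baseline MI/entropy ratio: {base:.3f}")
print(f"implied ratio with delta features: {improved:.3f}")
```

Under these assumptions the ratio of MI to highband entropy would rise from roughly 5.6% to roughly 17.6%.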

Y. Ould-Cheikh-Mouhamedou, S. Crozier, K. Gracie, P. Guinand, and P. Kabal

"A Method for Lowering Turbo Code Error Flare Using Correction Impulses and Repeated Decoding", Proc. Int. Symp. Turbo Codes (Munich, Germany), 6 pp., April 2006.

Turbo codes exhibit an "error floor" or "flare" making it difficult to further improve the error performance without a significant increase in the signal-to-noise ratio (SNR). The error flare is mainly characterized by the distance properties of the code. The conventional way to lower the error flare is to increase the minimum distance, which is mainly determined by the interleaver. Unfortunately, the design of interleavers that yield high minimum distances is not a simple task. In fact, the design of such interleavers for applications requiring very low frame error rates (FERs) can be a real challenge. This paper introduces a new method that significantly lowers the error flare while leaving the interleaver unchanged. It also improves the error performance in the waterfall region. The key element of this method is the insertion of correction impulses in the received codeword and the use of repeated decoding. The effectiveness of this method is demonstrated by its ability to reduce the error rate by one order of magnitude in the waterfall region and more than three orders of magnitude in the flare region for 8-state single- and double-binary turbo codes of code rate 1/3 and 1/2 that use high-spread random (HSR) interleavers and packets of 1504 bits.
