Multimedia Signal Processing Laboratory

P. Kabal


Paper Abstracts 1997


Conference papers

H. Najafzadeh-Azghandi and P. Kabal

"Perceptual Coding of Narrowband Audio Signals at 8 kbit/s", Proc. IEEE Workshop Speech Coding (Pocono Manor, PA), pp. 109-110, Sept. 1997.
See also: [Demonstration] [slides]

This paper proposes a VQ-based transform coding scheme for audio signals (sampled at 8 kHz) at very low bit rates. This coder uses a new perceptually based distortion measure, which takes into account the energy of audible noise, in both training the codebooks and selecting the best codewords. An adaptive bit allocation strategy based on the distribution of the energy of the transform coefficients above the masking threshold is employed to assign more bits to perceptually important critical bands. This coder delivers good quality for most audio signals at 1 bit/sample.
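The bit allocation idea can be illustrated with a minimal sketch. It is not the allocation rule from the paper: it assumes a generic greedy scheme in which each added bit lowers the quantization noise in a band by a factor of 4 (about 6 dB), and bits go to whichever band has the most noise above its masking threshold. The function name `allocate_bits` and the example energies are hypothetical.

```python
def allocate_bits(band_energy, mask_threshold, total_bits):
    """Greedy bit allocation over critical bands (illustrative assumption,
    not the paper's method). Thresholds are assumed to be positive."""
    # With zero bits, the coding noise in a band equals the band energy.
    noise = list(band_energy)
    bits = [0] * len(band_energy)
    for _ in range(total_bits):
        # Noise-to-mask ratio: values above 1 mean the noise is audible.
        nmr = [n / t for n, t in zip(noise, mask_threshold)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 1.0:
            break  # all noise is masked; remaining bits are left unused
        bits[worst] += 1
        noise[worst] /= 4.0  # one extra bit: roughly 6 dB less noise
    return bits


# Three bands with unit masking thresholds and a budget of 4 bits.
print(allocate_bits([16.0, 4.0, 0.5], [1.0, 1.0, 1.0], 4))  # prints [2, 1, 0]
```

The third band receives no bits because its energy is already below the masking threshold, which is the behaviour the abstract describes for perceptually unimportant bands.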

M. R. Zad-Issa and P. Kabal

"A New LPC Error Criterion for Improved Pitch Tracking", IEEE Workshop Speech Coding (Pocono Manor, PA), pp. 1-2, Sept. 1997.

In linear predictive coders, the output of the LP analysis filter is used to represent the glottal excitation signal. For high-pitched voices during nasal sounds or nasalized vowels, the speech signal takes on a sinusoidal shape. The corresponding residual signal has very low energy and the pitch pulses are weak or absent, resulting in poor pitch tracking. These segments of speech are also characterized by large frame-to-frame variations of the LP coefficients. In this paper, we propose a composite formant prediction error criterion that leads to a clear track of residual pulses even for sinusoid-like speech, while enhancing the smoothness of the filter parameter evolution.
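For readers unfamiliar with the LP residual, the following is a minimal sketch of conventional LP analysis: the autocorrelation method with the Levinson-Durbin recursion, followed by inverse filtering. It shows the standard least-squares criterion only, not the composite criterion proposed in the paper. The function names and the test signal are assumptions made for illustration.

```python
def autocorr(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations. Returns (a, err), where the predictor
    is x_hat[n] = sum(a[k-1] * x[n-k] for k in 1..order)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # signal is perfectly predictable (or all zero)
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        refl = acc / err  # reflection coefficient for stage m
        new_a = a[:]
        new_a[m] = refl
        for k in range(1, m):
            new_a[k] = a[k] - refl * a[m - k]
        a = new_a
        err *= 1.0 - refl * refl
    return a[1:], err


def lp_residual(x, a):
    """Output of the LP analysis (inverse formant) filter."""
    return [x[n] - sum(a[k] * x[n - 1 - k]
                       for k in range(len(a)) if n - 1 - k >= 0)
            for n in range(len(x))]


# A decaying exponential is the impulse response of a one-pole filter,
# so a first-order predictor should recover a coefficient close to 0.5.
x = [0.5 ** n for n in range(50)]
a, err = levinson_durbin(autocorr(x, 1), 1)
e = lp_residual(x, a)
print(a, e[:3])
```

In this toy case the residual is a single pulse at the first sample followed by values near zero, which is the kind of clear pulse structure that pitch tracking relies on.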

N. Sheikholeslami and P. Kabal

"Linear Time Varying Precoder Applied to the ISI Channel", Proc. IEEE Pacific Rim Conf. Commun., Computers, Signal Processing (Victoria, BC), pp. 36-39, Aug. 1997.
See also: [slides]

A linear time-varying precoding structure for data transmission over ISI channels is investigated. This precoding scheme is capable of stabilizing the inverse channel filter without increasing the dynamic range of the received information symbols. This is particularly useful if the receiver decision device is a simple slicer with a fixed number of levels.

S. Valaee, B. Champagne, and P. Kabal

"Sinusoidal Signal Detection Using the Minimum Description Length and the Predictive Stochastic Complexity", Proc. Int. Conf. Digital Signal Processing (Santorini, Greece), pp. 1023-1026, July 1997.

Techniques based on the minimum description length (MDL) and the predictive stochastic complexity (PSC) are proposed for sinusoidal signal detection. Both criteria measure the codelength of the observation and the model. The proposed techniques decompose the observation vector into its components in the signal and noise subspaces. The noise component is encoded for several model orders. The best model is selected by minimizing the codelength.

K. El-Maleh and P. Kabal

"Comparison of Voice Activity Detection Algorithms for Wireless Personal Communications Systems", Proc. IEEE Canadian Conf. Electrical, Computer Engineering (St. John's, NF), pp. 470-473, May 1997.
See also: [slides]

Voice activity detection (VAD) algorithms have become an integral part of many of the recently standardized wireless cellular and Personal Communications Systems (PCS). In this paper, we present a comparative study of the performance of three recently proposed VAD algorithms under various acoustical background noise conditions. We also propose new ideas to enhance the performance of a VAD in wireless PCS speech applications.
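To make the notion of a VAD decision concrete, the following is a minimal sketch of a frame-energy detector. It is a generic baseline and is not one of the three algorithms compared in the paper. The function names, the fixed noise-energy estimate, and the `margin` parameter are all assumptions.

```python
def frame_energies(samples, frame_len):
    """Mean energy of each complete, non-overlapping frame."""
    return [sum(s * s for s in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def energy_vad(samples, frame_len, noise_energy, margin=4.0):
    """Flag a frame as speech when its energy exceeds the assumed
    background noise energy by the given factor (illustrative only)."""
    return [e > margin * noise_energy
            for e in frame_energies(samples, frame_len)]


# Two low-level frames followed by two high-level frames.
signal = [0.01] * 160 + [0.5] * 160
print(energy_vad(signal, 80, noise_energy=1e-4))
# prints [False, False, True, True]
```

A fixed threshold of this kind degrades when the background noise level changes, which is one reason practical detectors adapt their noise estimate and use additional features.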

S. Shahbazpanahi, S. Valaee, B. Champagne, and P. Kabal

"Extended Source Localization Using the ESPRIT Algorithm", Int. Conf. Telecommun. (Melbourne), pp. 1033-1037, April 1997.

A new approach to parametric localization of a distributed source is proposed. The method is based on the ESPRIT algorithm and estimates the central angle and the angular extension of an incoherently distributed source. It has low computational complexity and does not require array calibration.

S. Valaee, B. Champagne, and P. Kabal

"Using Information Theoretic Techniques for Sinusoidal Signal Resolution", Int. Conf. Telecommun. (Melbourne), pp. 1067-1072, April 1997.

The objective is to develop information theoretic criteria for detection of sinusoidal signals. The minimum description length (MDL) and the predictive stochastic complexity (PSC) have been formulated for harmonic resolution. MDL and PSC are the codelength for data and model. The proposed techniques are based on decomposing the observation vector into its components in the signal and noise subspaces. Each component is encoded separately and the results are added to form the total codelength. The codelength is minimized over different models to select the best model.
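The model selection step can be illustrated with a minimal sketch of the textbook two-part MDL criterion for a Gaussian residual, in which the codelength is the data term (N/2) log(RSS/N) plus the model term (k/2) log N. This is not the subspace-based formulation of the paper. The function names, the choice of three parameters per sinusoid (amplitude, frequency, phase), and the residual values are assumptions.

```python
import math


def mdl_codelength(rss, n_samples, n_params):
    """Two-part codelength (in nats, up to a constant) for a Gaussian
    residual with sum of squares rss and n_params free parameters."""
    data_term = 0.5 * n_samples * math.log(rss / n_samples)
    model_term = 0.5 * n_params * math.log(n_samples)
    return data_term + model_term


def select_order(rss_by_order, n_samples, params_per_component=3):
    """Pick the number of sinusoids whose fit minimizes the codelength.
    rss_by_order[k] is the residual sum of squares with k sinusoids."""
    scores = [mdl_codelength(rss, n_samples, params_per_component * k)
              for k, rss in enumerate(rss_by_order)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical residuals: the fit improves sharply up to two sinusoids,
# after which extra components barely reduce the error.
best, scores = select_order([100.0, 40.0, 10.0, 9.8, 9.7], n_samples=100)
print(best)  # prints 2
```

Beyond two components the small reduction in residual energy no longer pays for the cost of encoding three more parameters, so the total codelength rises again.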

M. R. Zad-Issa and P. Kabal

"Smoothing the Evolution of the Spectral Parameters in Linear Prediction of Speech Using Target Matching", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (Munich), pp. 1699-1702, April 1997.

Linear prediction (LP) coefficients are used to describe the formant structure of a speech waveform. Many factors contribute to the frame-to-frame fluctuation of these parameters. These variations adversely affect the performance of the LP quantizer and the quality of the synthesized speech. For voiced speech, efficient coding of the pitch pulses at the output of the inverse formant filter relies on the similarity of successive pitch waveforms. The performance of this coding stage is also jeopardized by LP variations. In this paper, we propose a new method that smooths the evolution of the LP parameters. Our algorithm is based on matching the output of the formant predictor to a target signal constructed using smoothed pitch pulses. With this approach, we have successfully reduced the frame-to-frame variation of the LP coefficients while increasing the similarity of the pitch pulses.

