Multimedia Signal Processing Laboratory

P. Kabal

Paper Abstracts 2000

Conference papers

T. Islam and P. Kabal

"Partial-Energy Weighted Interpolation of Linear Prediction Coefficients", Proc. IEEE Workshop Speech Coding (Delavan, WI), pp. 105-107, Sept. 2000.

This paper discusses the interpolation of linear prediction (LP) coefficients. The performance of LP analysis using different numbers of subframes and the choice of representation for the LP coefficients are studied. Interpolation is done by converting the LP coefficients into one of the following representations: line spectral frequencies, reflection coefficients, log area ratios, or autocorrelations. It is shown that good performance is obtained for line spectral frequencies and five subframes per frame. A new interpolation technique that incorporates partial frame energy is introduced. This technique generalizes the concept of energy weighting to different LP coefficient representations.
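As a rough illustration of subframe interpolation in the line spectral frequency (LSF) domain, the sketch below interpolates between the LSF vectors of two consecutive frames, once for each of five subframes. The function names, the linear weighting schedule, and the energy-scaled weights are assumptions made for illustration only; the abstract does not specify the partial-energy weighting, and this is not the authors' method. Frame energies are assumed to be strictly positive.

```python
import numpy as np

def interpolate_lsf(lsf_prev, lsf_curr, n_subframes=5, weights=None):
    """Return one interpolated LSF vector per subframe (rows of the result).

    weights[k] is the share given to the current frame in subframe k.
    """
    lsf_prev = np.asarray(lsf_prev, dtype=float)
    lsf_curr = np.asarray(lsf_curr, dtype=float)
    if weights is None:
        # Plain linear schedule: the last subframe equals the current frame.
        weights = np.arange(1, n_subframes + 1) / n_subframes
    weights = np.asarray(weights, dtype=float)
    return (1.0 - weights)[:, None] * lsf_prev + weights[:, None] * lsf_curr


def energy_weights(e_prev, e_curr, n_subframes=5):
    """Hypothetical energy weighting: scale the linear weights by frame energy.

    Assumes e_prev > 0 and e_curr > 0. Equal energies give the linear schedule.
    """
    base = np.arange(1, n_subframes + 1) / n_subframes
    return base * e_curr / ((1.0 - base) * e_prev + base * e_curr)


prev_lsf = [0.2, 0.5, 1.0]   # made-up LSFs in radians, for illustration
curr_lsf = [0.4, 0.7, 1.4]
linear = interpolate_lsf(prev_lsf, curr_lsf)
weighted = interpolate_lsf(prev_lsf, curr_lsf, weights=energy_weights(1.0, 4.0))
```

Because every weight lies between 0 and 1, each interpolated vector is a convex combination of two ordered LSF vectors and therefore stays ordered, which is one reason LSFs are convenient to interpolate.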

N. Sheikholeslami Alagha and P. Kabal

"A Filterbank Structure for Voice-Band PCM Channel Pre-Equalization", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (Istanbul), pp. 2793-2796, June 2000.

A non-maximally decimated filterbank structure for pre-equalizing channels with Inter-Symbol Interference (ISI) is investigated. The impulse response of the channel is assumed to be known at the transmitter. Unlike the classical Tomlinson-Harashima Precoding technique, the proposed pre-equalizer compensates for the channel without increasing the number of received signal levels (channel alphabet). The proposed technique does not require the channel to be minimum-phase. The filterbank structure adds redundancy to the input signal to compensate for the channel ISI while keeping the transmitted power bounded.

The proposed pre-equalization is particularly useful for data transmission over voice-band PCM channels. The upstream PCM channel is bandlimited, causing severe ISI at the output of the front-end receiver filter. By using the pre-equalizer at the transmitter, channel ISI can be mitigated.
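For context, the sketch below shows the classical Tomlinson-Harashima precoder that the abstract uses as its point of comparison, not the proposed filterbank structure. It assumes a monic causal channel (h[0] = 1) and an M-level PAM alphabet; the function names and the example channel are hypothetical. The modulo operation keeps the transmitted samples bounded, but the received samples then equal the data symbols plus multiples of 2M, which is the expansion of the channel alphabet mentioned above.

```python
import numpy as np

def thp_precode(symbols, h, M):
    """Tomlinson-Harashima precoding for a monic causal channel h (h[0] == 1).

    symbols come from the M-level PAM alphabet {-(M-1), ..., -1, 1, ..., M-1}.
    """
    h = np.asarray(h, dtype=float)
    x = np.zeros(len(symbols))
    for n, a in enumerate(symbols):
        # ISI the channel will add from previously transmitted samples.
        isi = sum(h[k] * x[n - k] for k in range(1, min(n, len(h) - 1) + 1))
        # Subtract the ISI, then fold into [-M, M) to bound the transmit power.
        x[n] = (a - isi + M) % (2 * M) - M
    return x


def channel(x, h):
    """Noise-free channel: convolve with h and keep len(x) output samples."""
    return np.convolve(x, h)[: len(x)]


M = 4
h = [1.0, 0.5]                               # example channel, an assumption
data = np.array([1, -3, 3, -1, 3, 3, -3, 1], dtype=float)
tx = thp_precode(data, h, M)
rx = channel(tx, h)                          # equals data + 2*M*integer
recovered = (rx + M) % (2 * M) - M           # receiver folds back to the alphabet
```

In this example the third received sample is -5, which lies outside the original four-level alphabet even though the folded value recovers the symbol 3.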

K. El-Maleh, M. Klein, G. Petrucci, and P. Kabal

"Speech/Music Discrimination for Multimedia Applications", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (Istanbul), pp. 2445-2448, June 2000.

Automatic discrimination of speech and music is an important tool in many multimedia applications. Previous work has focused on using long-term features such as differential parameters, variances, and time-averages of spectral parameters. These classifiers use features estimated over windows of 0.5-5 seconds, and are relatively complex. In this paper, we present results from combining line spectral frequencies (LSFs) with zero-crossing-based features for frame-level narrowband speech/music discrimination. Our classification results for different types of music and speech show the good discriminating power of these features. Our classification algorithms operate with a frame delay of only 20 ms, making them suitable for real-time multimedia applications.
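To make the frame-level idea concrete, the sketch below counts zero crossings in non-overlapping 20 ms frames. The 8 kHz sampling rate (a common narrowband choice), the function name, and the use of a raw crossing count are assumptions; the abstract does not say which zero-crossing-based features the paper uses.

```python
import numpy as np

def zero_crossing_counts(signal, fs=8000, frame_ms=20):
    """Count sign changes in each non-overlapping frame of the signal."""
    frame_len = fs * frame_ms // 1000            # 160 samples at 8 kHz
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[: n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    signs = np.signbit(frames)                   # True for negative samples
    # A crossing occurs wherever adjacent samples within a frame differ in sign.
    return np.count_nonzero(signs[:, 1:] != signs[:, :-1], axis=1)


# Two frames of a 1 kHz tone; the phase offset keeps samples away from zero.
n = np.arange(400)
tone = np.sin(2 * np.pi * 1000 * n / 8000 + 0.1)
counts = zero_crossing_counts(tone)
```

Each frame is processed independently, so a feature of this kind is available as soon as the frame has been received, which is consistent with the 20 ms delay described above.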

H. Najafzadeh and P. Kabal

"Perceptual Bit Allocation for Low Rate Coding of Narrowband Audio", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (Istanbul), pp. 893-896, June 2000.

In this work, we consider adaptive bit allocation for perceptual coding of narrowband audio signals at low rates (down to 8 kb/s). Two different strategies are used to shape the audible noise spectrum. In one approach, the quantization noise spectrum is shaped in parallel with the masking threshold curve. This way the noise is equally audible in different frequency bands. The other approach generates a flat noise spectrum above the masking threshold. The noise power is not equally distributed over the frequency range; hence, it is audible to various extents at different frequencies.
