Multimedia Signal Processing Laboratory

P. Kabal

Paper Abstracts 1990


R. P. Ramachandran and P. Kabal

"Transmultiplexers: Perfect Reconstruction and Compensation of Channel Distortion", Signal Processing, vol. 21, no. 3, pp. 261-274, Nov. 1990.

This paper presents new results for a multi-input, multi-output transmultiplexer and describes the analogy with known results for a single-input, single-output subband system. First, the perfect reconstruction property in both systems is explored. Then, the complementary nature of the two systems is examined and is interpreted both in terms of network duality and by a series of block diagram manipulations which convert a subband system to a transmultiplexer. It is known that an interchange of the set of combining filters and separation filters preserves the crosstalk-free nature of transmultiplexers. The problem of channel distortion is alleviated by passing the received composite signal through a channel compensation filter or equalizer. Five methods for specifying the compensation filter are proposed, each of which reinstates the crosstalk-free nature of the transmultiplexer. However, residual intersymbol interference remains. Two of the five approaches attempt to suppress the intersymbol interference. The performance of the five methods is compared for a particular channel.

D. O'Shaughnessy, P. Kabal, D. Bernardi, L. Barbeau, C. C. Chu, and J.-L. Moncet

"Applying Speech Enhancement to Audio Surveillance", J. Forensic Sciences, vol. 35, no. 5, pp. 1163-1172, Sept. 1990.

Audio surveillance tapes are prime candidates for speech enhancement because of the many degradations and sources of interference that mask the speech signals on such tapes. In this paper, we describe ways to cancel interference when an available reference signal is not synchronized with the surveillance recording, for example, when the reference is obtained later from a phonograph record or an air check recording from a broadcast source. As a specific example, we discuss our experiences processing a wiretap recording used in an actual court case. We transformed the reference signal to reflect room and transmission effects and then subtracted the resulting secondary signal from the primary intercept signal, thus enhancing the speech of the desired talkers by removing interfering sounds. Before the secondary signal could be subtracted, the signals had to be aligned properly in time. The intercept signal was subjected to time-scale modifications made necessary by the varying phonograph and tape recorder speeds. While these speed differences are usually small enough not to affect the perceived quality, they adversely affect the ability to cancel interference automatically. In working with recording devices, we took into account four factors that affect the signal quality: the frequency response, nonlinear distortion, noise, and speed variations. The two methods that were most successful for enhancement were least-mean-squares (LMS) adaptive cancellation and spectral subtraction.
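As an illustration of the LMS adaptive cancellation mentioned in this abstract, the following is a minimal sketch in Python, not the authors' processing chain. It assumes the reference is already time-aligned with the primary signal, and the filter length, step size, synthetic "room" path, and function names (lms_cancel, demo) are all hypothetical choices made for the example.

```python
import math
import random

def lms_cancel(primary, reference, num_taps=8, step=0.01):
    """Subtract an adaptively filtered reference from the primary signal.

    Returns the error signal (the enhanced output) and the final weights.
    num_taps and step are illustrative values, not taken from the paper.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    output = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # primary minus estimated interference
        # Standard LMS weight update driven by the error signal.
        weights = [w + 2.0 * step * error * h for w, h in zip(weights, history)]
        output.append(error)
    return output, weights

def demo():
    """Build a synthetic test case: weak tone plus filtered noise."""
    rng = random.Random(0)
    n = 4000
    reference = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Hypothetical room/transmission path: a short FIR filter.
    path = [0.6, -0.3, 0.1]
    interference = [
        sum(path[k] * reference[i - k] for k in range(len(path)) if i - k >= 0)
        for i in range(n)
    ]
    # A low-level sinusoid stands in for the desired speech.
    speech = [0.1 * math.sin(2.0 * math.pi * 0.01 * i) for i in range(n)]
    primary = [s + v for s, v in zip(speech, interference)]
    enhanced, weights = lms_cancel(primary, reference)
    return primary, speech, enhanced, weights
```

In this sketch the error signal is the enhanced output: once the weights approximate the path filter, what remains after subtraction is mostly the desired signal. Speed variations of the kind described in the abstract would have to be corrected before such a canceller could converge.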

Conference papers

D. Bees, P. Kabal, and M. L. Blostein

"Application of Complex Cepstrum to Acoustic Dereverberation", Proc. Biennial Symp. Commun. (Kingston, ON), pp. 324-327, June 1990.

Speech in rooms is subject to degradation caused by acoustic reverberation. Signal processing techniques to remove reverberation have required multiple microphones or knowledge of the room impulse response. In this paper, complex cepstral deconvolution is applied to acoustic dereverberation. A new approach to the segmentation and windowing procedure for speech improves the complex cepstral identification of the reverberant impulse response, and least squares inverse filters are used to remove the estimated impulse response from the reverberant speech. Although complete removal of the impulse response is not possible, reduction of reverberation with this technique is demonstrated.

M. Foodeei and P. Kabal

"Low-Delay Speech Coders at 16 kb/s: A CELP and a Tree Coder", Proc. Biennial Symp. Commun. (Kingston, ON), pp. 320-323, June 1990.

For speech coders to be used in network applications, a transparent or near transparent quality (Mean Opinion Score rating of 4.0) is required. Though this is a necessary criterion, other desired properties include low delay, robustness to channel errors, moderate complexity, capability to handle non-speech signals in the telephone band, and good tandeming performance. The CCITT's current consideration for standardization of 16 kb/s network-quality speech coders (to be finalized in 1991) requires a maximum delay of 5 ms and has set 2 ms as the objective. A better understanding of the trade-offs resulting from the use of different schemes is required. In this work, a delayed-decision tree coder based on the (M,L) algorithm (ML-TREE) [1,2] and a Low-Delay Code Excited Linear Predictive (LD-CELP) coder proposed by AT&T Laboratories for the CCITT's 16 kb/s standardization [3] are studied and compared. For the comparison, the LD-CELP and the ML-TREE coders are simulated. Results obtained to date show that the segSNR values of the coded speech using ML-TREE and LD-CELP are comparable. The design of LD-CELP emphasizes robustness to channel errors, while the ML-TREE coder design has not yet considered this issue closely. The performance of the two coders under noisy channel conditions reflects this.
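The segSNR figure used in this comparison is a segmental signal-to-noise ratio. A minimal sketch of one common way to compute it follows. The frame length, the clamping limits, and the function name segmental_snr are assumptions made for illustration; the abstract does not state the exact definition used in the paper.

```python
import math

def segmental_snr(original, coded, frame_len=160, floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [floor_db, ceil_db].

    frame_len and the clamping limits are assumed values for this sketch.
    """
    snrs = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        ref = original[start:start + frame_len]
        err = [a - b for a, b in zip(ref, coded[start:start + frame_len])]
        sig_energy = sum(a * a for a in ref)
        err_energy = sum(e * e for e in err)
        if sig_energy == 0.0:
            continue  # skip silent frames
        if err_energy == 0.0:
            snr = ceil_db  # error-free frame: use the upper limit
        else:
            snr = 10.0 * math.log10(sig_energy / err_energy)
        snrs.append(min(max(snr, floor_db), ceil_db))
    if not snrs:
        raise ValueError("no non-silent frames")
    return sum(snrs) / len(snrs)
```

Averaging in the dB domain gives low-energy frames the same weight as loud ones, which is why segSNR is often preferred to a single overall SNR when comparing waveform coders.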

J. Grass and P. Kabal

"Comparison of LPC Coefficient Quantizers", Proc. Biennial Symp. Commun. (Kingston, ON), pp. 312-315, June 1990.

Experimental results are presented for the quantization of Linear Predictive Coding (LPC) coefficients using two general approaches: scalar coefficient quantization and vector quantization. The LPC coefficients were quantized in several domains: Line Spectral Frequency (LSF), cepstral, predictor, reflection, and autocorrelation. Two distortion measures were used to evaluate the quantizers: the Itakura-Saito measure and the RMS log spectral distortion measure. The vector quantizers showed good results for only 9 bits per frame of 150 speech samples.
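The RMS log spectral distortion named above compares the log power spectra of the unquantized and quantized LPC filters. The sketch below shows one way it might be computed. It assumes the sign convention A(z) = 1 - sum_k a_k z^-k and a uniform frequency grid on [0, pi); the function names and the grid size are hypothetical and are not taken from the paper.

```python
import cmath
import math

def lpc_log_spectrum(predictor, num_points=256):
    """Log power spectrum in dB of 1/A(z), with A(z) = 1 - sum_k a_k z^-k.

    The sign convention for A(z) is an assumption of this sketch.
    """
    spectrum = []
    for n in range(num_points):
        omega = math.pi * n / num_points
        a_of_z = 1.0 - sum(
            a * cmath.exp(-1j * omega * (k + 1)) for k, a in enumerate(predictor)
        )
        # 20*log10|1/A| equals -20*log10|A|.
        spectrum.append(-20.0 * math.log10(abs(a_of_z)))
    return spectrum

def rms_log_spectral_distortion(pred_a, pred_b, num_points=256):
    """Root-mean-square difference in dB between two LPC log spectra."""
    spec_a = lpc_log_spectrum(pred_a, num_points)
    spec_b = lpc_log_spectrum(pred_b, num_points)
    mean_sq = sum((x - y) ** 2 for x, y in zip(spec_a, spec_b)) / num_points
    return math.sqrt(mean_sq)
```

For example, rms_log_spectral_distortion([0.9], [0.9]) is zero, and the value grows as the quantized predictor moves away from the original.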

G. Roy and P. Kabal

"Low-Rate Analysis-by-Synthesis Wideband Speech Coding", Proc. Biennial Symp. Commun. (Kingston, ON), pp. 308-311, June 1990.

This paper presents possible implementations for a low-rate wideband analysis-by-synthesis speech coder. The wideband speech signals have a bandwidth of 8 kHz, and the target operating bit rate is 16 kbits/sec. The basic Residual Excited Linear Predictive (RELP) coder is used as a starting point to develop and test flexible pitch parameter optimization procedures, which can operate in either full-band or split-band mode. These procedures are then applied to an analysis-by-synthesis CELP (Code Excited Linear Prediction) model. The performance of the full-band and split-band CELP structures is compared.

D. Boudreau and P. Kabal

"Joint Time Delay Estimation and Adaptive Recursive Least Squares Filtering", Proc. Biennial Symp. Commun. (Kingston, ON), pp. 164-167, June 1990.

A general estimation model is defined in which two observations are available: one is a noisy version of the transmitted signal, while the other is a noisy filtered and delayed version of the same transmitted signal. Examples of such systems occur in noise or echo cancellation, digital communication, and geophysical exploration. The delay and the filter are unknown quantities that must be estimated. An adaptive system, based on the least squares (LS) estimation criterion, is proposed in order to perform a joint estimation of the two unknowns. The joint estimator is conceptually composed of an adaptive delay element operating in conjunction with a transversal adaptive filter. The weighted sum of squared errors is minimized with respect to the delay and the adaptive filter weight vector. The latter is adapted using a fast version of the recursive least squares (RLS) algorithm, while the former is updated using a form of derivative, with respect to the delay, of the sum of squared errors. In order to perform this task efficiently, the adaptive delay is limited to integer values and is corrected one sample at a time. The integer delay value is defined as the lag. A series of relations is presented in order to compute and update the lag value such that the optimum least squares solution is attained. The joint delay estimation and RLS filtering algorithm is obtained by combining the lag update relations with a version of the fast transversal filter RLS algorithm. Simulations of the resulting algorithm show that both stationary and time-varying delays are effectively tracked and that the adaptive filter properly estimates the reference filter impulse response.

D. Boudreau and P. Kabal

"Joint Gradient-Based Time Delay Estimation and Adaptive Filtering", Proc. IEEE Int. Symp. Circuits, Systems (New Orleans, LA), pp. 3165-3169, June 1990.

A general estimation model is defined in which two observations are available: one is a noisy version of the transmitted signal, while the other is a noisy filtered and delayed version of the same transmitted signal. The time-varying delay and the filter are unknown quantities that must be estimated. A joint estimator is proposed. It is composed of an adaptive delay element in conjunction with a transversal adaptive filter. The same error signal is used by the two adaptive algorithms to adjust the delay element and the filter such that the minimum mean square error is attained. Two joint gradient-based adaptation algorithms are studied. The joint steepest-descent (SD) algorithm is first investigated. The possibility of convergence to a multitude of solutions is established, and a condition of convergence is presented. A stochastic implementation of the joint SD algorithm, in the form of a joint least mean square (LMS) algorithm, is then investigated. It is analyzed in terms of convergence in the mean and in the mean square of both the delay estimate and the adaptive filter weight vector estimate. The conditions of convergence of the joint LMS algorithm are established as functions of the power spectral densities of the observed signals and the minimum mean squared error.

R. P. Ramachandran and P. Kabal

"Configuration and Performance of Modulated Filter Banks", Proc. IEEE Int. Symp. Circuits, Systems (New Orleans, LA), pp. 1809-1812, June 1990.

This paper examines the performance issues relating to the transmultiplexers previously synthesized in [1]. These transmultiplexers consist of modulated filter banks based on one or two low-pass prototypes. First, the limitations of the configured systems regarding intersymbol interference and crosstalk suppression arising from the use of practical filters are analyzed. Based on these observations, a design technique for FIR (finite-impulse response) prototypes that takes the practical degradations into account is formulated. The procedure involves the unconstrained optimization of an error function. The resulting performance is compared with that of minimax filters. For the one-prototype systems, the method is superior because it leads to better intersymbol interference and crosstalk suppression with a smaller number of filter taps. In the case of the transmultiplexer with two prototypes, the main advantage of the design method is the inclusion of crosstalk terms in the error function. Finally, we note that the five transmultiplexers can be converted into subband systems and show how the design approach formulated for transmultiplexers carries over to the new subband systems.

R. P. Ramachandran and P. Kabal

"Synthesis of Bandwidth Efficient OQAM and VSB Transmultiplexers", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (Albuquerque, NM), pp. 1639-1642, April 1990.

This paper develops a set of conditions which allow bandwidth efficient transmultiplexers to be synthesized. The synthesis procedure is based upon a generalized impulse response for the combining (modulating) and separation (demodulating) filters. In particular, the combining and separation filters are bandpass versions of one or two lowpass prototypes and are configured to cancel crosstalk by exploiting relationships between the center frequencies, delays, and phases in their impulse response. Based on the derived conditions, five different transmultiplexers are synthesized. Three of them implement orthogonal quadrature amplitude modulation (OQAM) with repeated center frequencies. The other two accomplish vestigial sideband modulation (VSB) with distinct frequencies. The transmultiplexers can be converted into new complementary subband systems. Intersymbol interference is eliminated in both the transmultiplexers and the subband systems by appropriately designing the prototypes.
