Multimedia Signal Processing Laboratory

P. Kabal

Paper Abstracts 1993

Papers

A. Khandani and P. Kabal

"Shaping Multidimensional Signal Spaces - Part II: Shell Addressed Constellations", IEEE Trans. Information Theory, vol. 39, no. 6, pp. 1809-1819, Nov. 1993.

By appropriately selecting the boundary of a multidimensional signal constellation used for data transmission, the average energy of the constellation can be reduced. Reduction in the average energy (shaping gain) is obtained at the price of increasing the constellation-expansion ratio (CERs) and the peak-to-average-power ratio (PAR). In this paper, we describe some practical means to select the boundary so as to achieve a point with low addressing complexity near the knee of the corresponding tradeoff curves (shaping gain versus CERs or PAR). One class of addressing schemes is based on using a lookup table. We introduce a method to facilitate the realization of the addressing lookup table. This method is based on the decomposition of the addressing into a hierarchy of addressing steps, each of a low complexity. This avoids the exponential growth of the complexity. Based on this addressing decomposition, by using a memory of a practical size, we can move along a tradeoff curve which has negligible suboptimality. Another class of addressing schemes is based on using a Voronoi constellation in a space of half the original dimensionality. We also introduce hybrid multilevel addressing schemes which combine the two classes. These schemes provide single points with moderate addressing complexity near the knee of the optimum tradeoff curves.

A. Khandani and P. Kabal

"Shaping Multidimensional Signal Spaces - Part I: Optimum Shaping, Shell Mapping", IEEE Trans. Information Theory, vol. 39, no. 6, pp. 1799-1808, Nov. 1993.

In selecting the boundary of a signal constellation used for data transmission, one tries to minimize the average energy of the constellation for a given number of points from a given packaging. The reduction in the average energy per two dimensions due to the use of a region C as the boundary instead of a hypercube is called the shaping gain gamma of C. The price to be paid for shaping involves: 1) an increase in the factor constellation-expansion ratio (CERs), 2) an increase in the factor peak-to-average-power ratio (PAR), and 3) an increase in the addressing complexity. In general, there exists a tradeoff between gamma and CERs, PAR. In this work, the structure of the region which simultaneously optimizes both of these tradeoffs is introduced. In an N-D (N-dimensional) optimum shaping region (N even), a 2-D sphere is the boundary of the 2-D subspaces and an N-D sphere is the boundary of the whole space. Analytical expressions are derived for the optimum tradeoff curves. By applying a change of variable denoted as shell mapping, the optimum shaping region is mapped to a hypercube truncated within a simplex. This mapping facilitates the performance computation, and also the addressing of the optimum shaping region. Using shell mapping, we introduce an addressing scheme with low complexity to achieve a point on the optimum tradeoff curves. To obtain more flexibility in selecting the tradeoff point, a second shaping method with more degrees of freedom is used. In this method, a 2-D sphere is the boundary of the 2-D subspaces, an N'-D sphere, N'>=2, is the boundary of the N'-D subspaces, and an N-D sphere is the boundary of the whole space.

F.-M. Wang, P. Kabal, R. P. Ramachandran, and D. O'Shaughnessy

"Frequency Domain Adaptive Postfiltering for Enhancement of Noisy Speech", Speech Communication, vol. 12, no. 1, pp. 41-56, March 1993.

This paper presents a new frequency-domain approach to implement an adaptive postfilter for enhancement of noisy speech. The postfilter is described by a set of DFT coefficients which suppress noise in the spectral valleys and allow for more noise in format regions which is masked by the speech signal. First, we perform an LPC analysis of the noisy speech and calculate the log magnitude spectrum of the input speech. After identifying the formats and valleys (by a new method), the log magnitude spectrum is modified to obtain the postfilter coefficients. The filtering operation is also done in the frequency domain through an FFT and an overlap-add strategy to get the postfiltered speech. Experimental results on 8-kHz-sampled speech show that this new frequency-domain approach results in enhanced speech of better perceptual quality than obtained by a time-domain method. This new method is especially efficient in eliminating high frequency noise and in preserving the weaker, high frequency formants in sonorant sounds.

D. Boudreau and P. Kabal

"Joint Time-Delay Estimation and Adaptive Recursive Least Squares Filtering", IEEE Trans. Signal Processing, vol. 41, no. 2, pp. 592-601, Feb. 1993.

A general estimation model is defined in which two observations are available: one being a noisy version of the transmitted signal, while the other is a noisy-filtered and delayed version of the same transmitted signal. The delay the the filter are unknown quantities that must be estimated. An adaptive system, based on the least squares (LS) estimation criterion, is proposed in order to perform a joint estimation of the two unknowns. The joint estimator is conceptually composed of an adaptive delay element operating in conjunction with an adaptive transversal filter. The weighted sum of squared errors is minimized with respect to both the delay and the adaptive filter weight vector. The filter is adapted using a fast version of the recursive least squares (RLS) algorithm, while the delay is updated using a form of derivative, with respect to the delay, of the sum of squared errors. In order to perform this task efficiently, the adaptive delay is limited to integer values and is corrected one sample at a time. The integer delay value is defined as the lag. A series of relations is presented, in order to compute and update the lag value such that the optimum least squares solution is attained. The joint delay estimation and RLS filtering algorithm is obtained by combining the lag update relations with a version of the fast transversal filter RLS algorithm. The simulations of the resulting algorithm show that both stationary and time-varying delays are effectively tracked and that the adaptive filter properly estimates the reference filter impulse response.

D. Boudreau and P. Kabal

"Joint Gradient-Based Time-Delay Estimation and Adaptive Minimum Mean-Squared-Error Filtering", Canadian J. Electrical, Computer Engineering, vol. 18, no. 1, pp. 27-35, Jan. 1993.

A general estimation model is defined in which two observations are available; one is a noisy version of the transmitted signal, while the other is a noisy filtered and delayed version of the same transmitted signal. The time-varying delay and the filter are unknown quantities that must be estimated. A joint estimator is proposed. It is composed of an adaptive delay element in conjunction with a transversal adaptive filter. The same error signal is used to adjust the delay element and the filter such that the minimum mean squared error is attained. Two joint gradient-based adaptation algorithms are studied. The joint steepest-descent (SD) algorithm is first investigated. The possibility of a multitude of stable solutions is established and a condition of convergence is presented. A stochastic implementation of the joint SD algorithm, under the form of a joint least-mean-square (LMS) algorithm, is then presented. It is analyzed in terms of convergence in the mean and in the mean square of both the delay estimate and the adaptive filter weight vector estimate. The conditions of convergence of the joint LMS algorithm are established as a function of the power spectral densities of the observed signals and the minimum mean squared error. The joint LMS algorithm is simulated under various conditions and it is shown that the adaptive delay element is very effective in reducing the mean squared error at the output of a long adaptive filter coping with two asynchronous inputs.

Chapter in book

J. Grass, P. Kabal, M. Foodeei, and P. Mermelstein

"High Quality Low-Delay Speech Coding at 12 kb/s", in Speech and Audio Coding for Wireless and Network Applications, pp. 3-9, Kluwer Academic, 1993.

Conference papers

G. Chahine and P. Kabal

"Pitch Filtering in CELP Using a Time Scaling Approach", Proc. IEEE Workshop Speech Coding (Ste. Adèle, QC), pp. 55-56, Oct. 1993.

In this paper, a new approach to pitch filtering in the synthesis stage of a Code-Excited Linear Prediction (CELP) coder is described. With this method, time scaling/shifting of the original speech is performed on a block by block basis in order to provide a better match between the pitch pulses in the original and reconstructed speech. This technique allows for a reduction of the pitch lag resolution, with the attendant bit rate reduction, of the CELP coder while maintaining the same speech quality.

M. Leong and P. Kabal

"Smooth Speech Reconstruction Using Prototype Waveform Interpolation", Proc. IEEE Workshop Speech Coding (Ste. Adèle, QC), pp. 39-40, Oct. 1993.

The Prototype Waveform Interpolation (PWI) speech coding technique aims to achieve natural sounding speech by producing smoothly evolving waveforms for voiced speech. The technique is to control the level of periodicity in reconstructing the voiced speech segments. It was found, however, that PWI does not always produce the smoothly evolving waveforms desired because the method does not truly control the level of periodicity. A revised PWI scheme is proposed which provides control over the level of periodicity and achieves smooth speech reconstruction.

S. Valaee, P. Kabal and B. Champagne

"Localization of Distributed Sources", Proc. GRETSI Symp. Signal, Image Processing (Juan les Pins, France), pp. 289-292, Sept. 1993.

Most array processing algorithms are based on the assumption that the signals are generated by point sources. This is a mathematical constraint which is not satisfied in many applications. In this paper, we consider situations where the sources are distributed in space with a parametric angular cross-correlation kernel. We propose an algorithm that localizes the distributed sources by estimating their parameter vector. The method is based on minimizing a scalar product between the array manifold and the noise eigenvectors of the correlation matrix. We study two cases corresponding to completely correlated and totally uncorrelated signal distributions. We compare our method to the conventional MUSIC algorithm. The simulation studies show that the new method outperforms the MUSIC algorithm by reducing the estimation bias and the standard deviation.

A. K. Khandani, P. Kabal and E. Dubois

"Fixed-Rate Entropy Coded Vector Quantization", Canadian Workshop Information Theory (Rockland, ON), p. 24, May 1993 (abstract).

A. K. Khandani and P. Kabal

"Efficient, Nearly Optimum Addressing Schemes Based on Partitioning the Constellation into the Union of Blocks", Proc. IEEE Int. Conf. Commun. (Geneva), pp. 1076-1080, May 1993.

In this paper, we introduce two efficient addressing schemes for the nearly optimum shaping of multi-dimensional signal spaces. Using K concentric circles, the 2-D (two-dimensional) subspaces are partitioned into K shells of equal volume. The 2-D shells are indexed in the radial direction from zero to K-1. The average energy of a 2-D shell is proportional to its index plus a fixed offset. In an N=2n-D space, we obtain Kⁿ shaping clusters of equal volume. Shaping is achieved by selecting T <= Kⁿ of the N-D clusters with the least average energy (least sum of the 2-D indices). This results in a set of T integer n-tuples with components in the range [0,K-1] and the sum of the components being at most a given number L_max. The problem of addressing is to find a one-to-one mapping between the set of these n-tuples and the set of the integers [0,T-1] such that the mapping and its inverse can be easily implemented. In the proposed schemes, the N-D clusters are grouped into blocks such that the addressing within the blocks, which is achieved using a common algorithm for all the blocks, has a low complexity. The addressing of the blocks is based on some recursive relationship which allows us to decompose the problem into smaller parts each of a low complexity. The overall scheme requires a moderate amount of memory and has a small computational complexity. The introduced methods are compared with the previously known schemes. The reduction in the complexity is substantial.

Y. Qian and P. Kabal

"Pseudo-Three-Tap Pitch Prediction Filters", Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (Minneapolis, MN), pp. II-523-II-526, April 1993.

Pitch filters play an important role in high quality medium and low rate speech coders. We propose a pseudo-three-tap pitch filter with one or two degrees of freedom of the prediction coefficients, which gives a higher pitch prediction gain and also a more desirable frequency response than a one-tap pitch prediction filter. First, we describe an analysis model for the pseudo-three-tap pitch filter. Then, we apply the pseudo-three-tap concept together with a fractional pitch lag. The pitch prediction gain and frequency response of the pseudo-three-tap pitch filters are compared to a one-tap and three-tap pitch predictors with an integer or a non-integer pitch lag. The pseudo-three-tap pitch filter with one degree of freedom outperforms a conventional one-tap pitch filter. Even better is a pseudo-three-tap filter which switches between two parameter values.

A. K. Khandani and P. Kabal

"Efficient, Nearly Optimum Addressing Schemes Based on Partitioning the Constellation into the Union of Blocks", Conf. Information Sciences and Systems (Johns Hopkins University, Baltimore, MD), p. 555, March 1993.

A. K. Khandani, P. Kabal and E. Dubois

"Efficient Decomposition Algorithms for the Fixed Rate, Entropy-Coded Vector Quantization", Conf. Information Sciences and Systems (Johns Hopkins University, Baltimore, MD), pp. 348-354, March 1993.

In a vector quantizer of moderate dimensionality, the entropy coding of the quantizer output can result in a substantial decrease in bit rate. A straight-forward entropy coding scheme faces us with the problem of the variable data rate. A solution in a space of dimensionality N is to select a subset of elements in the N-fold cartesian product of a scalar quantizer and present them with code-words of the same length. For a memoryless source, a reasonable rule it to select the N-fold symbols with the lowest additive self-information information. The search/addressing of this scheme can no longer be achieved independently along the one-D subspace. Fortunately, the selected subset has a high degree of structure which can be used to substantially decrease the complexity. We discuss three classes of schemes to exploit this structure. First class is based on a combinatorial approach and uses the permutative structure of the code-words to facilitate the search/addressing. Second class is based on a dynamic programming approach. We build our recursive structure required for the dynamic programming in a hierarchy of steps. This results in several benefits over the conventional trellis-based approaches. Using this structure, we develop efficient rules (based on merging the states) to substantially reduce the search/addressing complexities while keeping the degradation negligible. Third class is based on a linear programming approach. Using the simplex method in conjunction with some auxiliary theorems, we develop a simple search procedure which gradually reduces the distortion associated with a given input.

K. M. Aleong, H. Leib and P. Kabal

"A Technique for Combining Equalization with Generalized Differential Detection", Proc. IEEE Int. Phoenix Conf. Computer, Commun. (Scottsdale, AZ), pp. 416-422, March 1993.

This work considers a generalized differential coherent detection technique combined with equalization for data transmission over intersymbol interference (ISI) channels when carrier phase tracking is difficult. In the absence of ISI, the SNR performance of this generalized differential coherent structure is very close to the coherent detection bound. This paper presents the integration of this generalized differential detection method with channel equalization, and therefore it extends the applications of this detection technique to channels with ISI. It is shown that for high SNR, the performance degradation of this structure with respect to coherent detection and equalization is negligible. Analysis and computer simulations show that this combined detection/equalization technique is an effective solution to the ISI problem when carrier phase tracking is difficult. One possible application of this work is in indoor wireless communications.

Paper titles.