"Two-dimensional Statistics of the Voronoi
Constellations Based on the Shaping Lattices *D*_{N} and

Shaping concerns the selection of the boundary of a signal
constellation. The major purpose of this selection is to reduce the average
energy of the set. In the Voronoi constellation, the Voronoi region of a
lattice, denoted as the shaping lattice, is used as the boundary of the
constellation. In this case, the set of the constellation points forms a group
under vector addition modulo the shaping lattice. This property is used to
achieve the addressing. In this work, some properties of the Voronoi
constellations based on the shaping lattices *D*_{N} and

"Asymptotic Receiver Structures for Joint
Maximum Likelihood Time Delay Estimation and Channel Identification Using
Gaussian Signals", *IEEE Trans. Signal Processing*, vol. 40, no. 5,
pp. 1258-1261, May 1992.

This correspondence addresses the problem of jointly estimating the relative time delay and the impulse response linking two received discrete-time Gaussian signals. Using two different methods, possible structures for the joint maximum-likelihood (ML) estimator are proposed, when the observation interval is long compared to both the delay to estimate and the correlation time of the various random processes involved. These structures generalize the cross-correlation method with prefiltering that implements the ML estimation of pure time delays.
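The cross-correlation method that these structures generalize can be illustrated with a minimal sketch. It covers only the pure-delay baseline without prefiltering, not the joint estimator proposed in the correspondence; the test signal, the noise level, and the function name `cross_correlation_delay` are assumptions made for illustration.

```python
import random


def cross_correlation_delay(x, y, max_lag):
    """Return the lag d in [-max_lag, max_lag] that maximizes sum_n x[n] * y[n + d]."""
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        value = 0.0
        for i in range(len(x)):
            j = i + lag
            if 0 <= j < len(y):
                value += x[i] * y[j]
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


# Hypothetical test case: white Gaussian noise and a noisy copy delayed by 7 samples.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_delay = 7
y = [0.0] * true_delay + x[:-true_delay]
y = [v + 0.1 * rng.gauss(0.0, 1.0) for v in y]

estimated_delay = cross_correlation_delay(x, y, max_lag=20)
print("estimated delay:", estimated_delay)
```

The search is a brute-force scan over candidate lags, which is adequate for short records; an FFT-based correlation would be the usual choice for long observation intervals.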

R. P. Ramachandran and P. Kabal

"Bandwidth Efficient Transmultiplexers,
Part 2: Subband Complements and Performance Aspects", *IEEE Trans. Signal
Processing*, vol. 40, no. 5, pp. 1108-1121, May 1992.

This paper examines the performance issues related to the quadrature amplitude modulation (QAM) and vestigial sideband (VSB) transmultiplexers synthesized in [1]. First, an analysis of the limitations of the configured systems regarding intersymbol interference and crosstalk suppression arising from the use of practical filters is made. Based on these observations, a new design technique for an FIR low-pass prototype that takes the practical degradations into account is formulated. The procedure involves the unconstrained optimization of an error function. A performance evaluation reveals that for four of the five systems, the new method is superior to a minimax approach in that lower intersymbol interference and crosstalk distortions are achieved with a smaller number of filter taps. For the other transmultiplexer, the advantage of the optimized design over the minimax design is in the added flexibility of taking crosstalk into account, thereby diminishing the crosstalk distortion. The five transmultiplexers can be converted into new subband systems. The authors show how the optimized design approach formulated for the transmultiplexers carries over to the new subband systems.

R. P. Ramachandran and P. Kabal

"Bandwidth Efficient Transmultiplexers,
Part 1: Synthesis", *IEEE Trans. Signal Processing*, vol. 40, no. 1,
pp. 70-84, Jan. 1992.

This paper develops a set of conditions which allow bandwidth efficient transmultiplexers to be synthesized. The synthesis procedure is based upon a generalized impulse response for the combining (modulating) and separation (demodulating) filters. In particular, the combining and separation filters are bandpass versions of one of two low-pass prototypes and are configured to cancel crosstalk by exploiting relationships between the center frequencies, delays, and phases in their impulse responses. Based on the derived conditions, five different transmultiplexers are synthesized. Three of them implement multicarrier quadrature amplitude modulation (QAM). The other two accomplish multicarrier vestigial sideband modulation (VSB). Intersymbol interference is eliminated by appropriately designing the prototypes. The two-band case is treated as a special case. For this case, the extra flexibility in choosing the center frequencies leads to the synthesis of additional transmultiplexers.

"Address Decomposition for the Shaping of
Multi-Dimensional Signal Constellations", *Proc. IEEE Globecom Conf.*
(Orlando, FL), pp. 1774-1778, Dec. 1992.

In this work, we introduce an efficient addressing scheme to
realize points near the knee of the tradeoff curves of an optimally shaped
constellation. This scheme, called the address decomposition, is based on
decomposing the addressing into a hierarchy of addressing steps, each of a low
dimensionality. As the memory size associated with a direct addressing scheme
has an exponential growth with the dimensionality, this decomposition of
addressing results in a substantial decrease in the complexity. In this case, by
using a memory of a practical size, one can move along a tradeoff curve which is
nearly optimum. For example, in a space of dimensionality *N*=32, a block
of memory of 2.5 kilobytes per *N* dimensions is used to achieve a shape
gain of 0.92 dB with a constellation expansion ratio of 1.25 and a peak to
average power ratio of 2.95. This scheme has no associated computation, is
straightforward to implement, and is adaptable to the structure of a general
coset coding scheme.

A. De, I. Sasase, and P. Kabal

"Trellis-Coded Phase/Frequency Modulation with Equal
Usage of Signal Dimensions", *Proc. IEEE Globecom Conf.* (Orlando, FL),
pp. 1769-1773, Dec. 1992.

A trellis-coded modulation scheme, using a coded 2-FSK
(frequency shift keyed)/2^{*m*}-PSK (phase shift keyed) modulation
format instead of a conventional coded 2

"Rate-Distortion Function for Speech Coding Based on
Perceptual Distortion Measure", *Proc. IEEE Globecom Conf.* (Orlando,
FL), pp. 452-456, Dec. 1992.

In [1], we have proposed a perceptual distortion measure for speech coders using an auditory (cochlear) model. This measure evaluates the neural-firing cross-entropy of the coded speech with respect to that of the original one. In this paper, the output space of the cochlear model is explored using this measure so as to verify the existence of the pitch and formant information. However, the prime objective of this article is to provide a rate-distortion analysis for speech coding. We evaluate a lower bound to the rate-distortion function based on this distortion measure and also compute the exact rate-distortion function using the Blahut algorithm. Four state-of-the-art speech coders with rates ranging from 4.8 kb/s (CELP) to 32 kb/s (ADPCM) are studied from the viewpoint of their performance with respect to the rate-distortion limits.
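As a rough illustration of the Blahut algorithm named above, the sketch below computes one point of a rate-distortion curve for a discrete source. It assumes a toy binary source with Hamming distortion rather than the perceptual measure of the paper; the function name `blahut_rate_distortion` and the slope parameter `beta` are illustrative choices.

```python
import math


def blahut_rate_distortion(p, d, beta, iterations=200):
    """Return (rate in bits, distortion) for source pmf p, distortion matrix d, slope beta."""
    n_src, n_rep = len(p), len(d[0])
    q = [1.0 / n_rep] * n_rep  # output marginal, initialised uniformly
    cond = []
    for _ in range(iterations):
        # Conditional Q(j | i) proportional to q[j] * exp(-beta * d[i][j]).
        cond = []
        for i in range(n_src):
            row = [q[j] * math.exp(-beta * d[i][j]) for j in range(n_rep)]
            total = sum(row)
            cond.append([v / total for v in row])
        # Re-estimate the output marginal from the conditional.
        q = [sum(p[i] * cond[i][j] for i in range(n_src)) for j in range(n_rep)]
    distortion = sum(
        p[i] * cond[i][j] * d[i][j] for i in range(n_src) for j in range(n_rep)
    )
    rate = sum(
        p[i] * cond[i][j] * math.log2(cond[i][j] / q[j])
        for i in range(n_src)
        for j in range(n_rep)
        if cond[i][j] > 0.0
    )
    return rate, distortion


# Toy case (an assumption for illustration): uniform binary source, Hamming distortion.
p = [0.5, 0.5]
d = [[0.0, 1.0], [1.0, 0.0]]
rate, distortion = blahut_rate_distortion(p, d, beta=2.0)
print(rate, distortion)
```

For this toy source the result can be checked against the known closed form R(D) = 1 - H(D), where H is the binary entropy function. Sweeping `beta` traces out the rest of the curve.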

Y. M. Cheng, D. O'Shaughnessy, and P. Kabal

"Speech Enhancement Using a Statistically
Derived Filter Mapping", *Proc. Int. Conf. Spoken Language Processing*
(Banff, AB), pp. 515-518, Oct. 1992.

We view the speech enhancement task in two aspects: reduction of the perceptual noise level in degraded speech and reconstruction of the degraded information, which may result in improvement of speech intelligibility. We are also very interested in noise-independent speech enhancement where test noise environments could differ in intensity from those of algorithm development. To this end, we have developed in this paper an algorithm called Noise-Independent Statistical Spectral Mapping (NISSM) to estimate a speech enhancement Wiener filter. NISSM consists of a noise-resistant transformation, which converts noisy speech to a set of noise-resistant features, and a spectral mapping function, which maps the features to autoregressive spectra of clean speech. We will show that the proposed algorithm effectively reduces noise intensity. When the noise intensity of training differs from that of testing, NISSM significantly outperforms a conventional spectral mapping. The algorithm operates frame-by-frame and is designed for real-time applications. The noise interference could be stationary or non-stationary white noise with variable intensity.

"A Unitary Transformation Algorithm for Wideband
Array Processing", *Proc. IEEE SP Workshop Statistical Signal,
Array Processing* (Victoria, BC), pp. 300-303, Oct. 1992.

A new method for broadband array processing is proposed. The method is based on a unitary transformation on the cross-correlation matrices of the array. It is shown that the Two-sided Correlation Transformation (TCT) generates unbiased estimates of the directions of arrival regardless of the bandwidth of the signals. The capability of the method for resolving two closely spaced sources is compared with that of the Coherent Signal-subspace Method (CSM). The resolution threshold for the new technique is smaller than the threshold for CSM.

"Backward Adaptive Prediction Cascaded
with Forward Formant and Pitch Configurations", *Proc. Canadian
Conf. Electrical, Computer Engineering* (Toronto, ON), pp. WM9.24.1-WM9.24.4,
Sept. 1992.

Two kinds of cascaded backward-adaptive predictors (forward formant-backward formant-forward pitch and backward formant-forward pitch) are investigated in this paper. We have analyzed and tested two important parameters for the backward-adaptive formant predictor in these configurations: the update rate of the linear prediction coefficients and the analysis frame length. We have found that if the analysis frame length of the backward-adaptive formant predictor is shorter than the pitch period, the backward prediction gain degrades rapidly. We have found that the average prediction gain for the slower update rates is close to that for the fast update rate. The slower the update rate, the fewer the computations. In particular, the backward predictor with a slower update rate behaves more like a linear filter. These new results provide a useful platform to explore the applications of backward adaptive prediction to low bit-rate speech coders, in which the backward-adaptive formant predictor is cascaded with a forward pitch predictor or the forward formant-backward formant-forward adaptive pitch predictor is used [1],[2].

"ISI-Reduced Modulation over a Fading
Multipath Channel", *Proc. Int. Conf. Universal Personal
Commun.* (Dallas, TX), pp. 11.02.1-11.02.5, Sept. 1992.

In this work, the idea of using the channel eigenvectors as
the basis for a block based signaling scheme over a fading multipath channel is
introduced. This basis minimizes the product of the average fading attenuations
along different dimensions. The ISI from the preceding blocks (inter-block ISI)
is modeled by an additive Gaussian noise. To reduce the effect of the
inter-block ISI, a number of zeros are transmitted between successive blocks.
The number of zeros is optimized to minimize the average probability of error.
As the transmission of zeros reduces the bandwidth efficiency, this optimization
procedure is more useful for lower bit rates. By applying quadrature amplitude
modulation (QAM) to each dimension, we obtain a set of two-dimensional
subchannels with unequal fadings. A coherent *M*-PSK constellation is employed
over each QAM subchannel. We propose two methods to distribute the rate and
energy between the subchannels. In both methods, we impose the restriction that
the average error probability for all the subchannels is the same. In the
optimum method, the energy is distributed equally between the nonempty
subchannels and the rate is distributed to obtain equal average error
probabilities. In a second method, the rate is distributed equally and the
energy is distributed to obtain equal average error probabilities. The second
method allows us to use the same modulator/demodulator for all the subchannels
and thereby reduces the complexity. Numerical results are presented for the
second method. The results over a space of moderate dimensionality show
substantial performance improvement with a small increase in the complexity.

Y. Qian, Y. M. Cheng, and P. Kabal

"Backward Adaptation for Single-Pulse
Excitation Coder", *Proc. Int. Conf. Commun. Technol.*
(Beijing, China), pp. 26.03.1-26.03.4, Sept. 1992.

Backward-adaptive linear prediction has been successfully used in medium rate speech coders with high quality and low delay (less than 2 ms) at 16 kb/s. The prediction gain of a forward-adaptive formant predictor cascaded with a backward-adaptive formant predictor is studied first. We have found that if the analysis frame length of the backward predictor is larger than the pitch period, the backward prediction gain can reach that of a non-linear predictor or a cascaded forward formant predictor. Results for several speech segments of male and female speakers, with different analysis window lengths, are given and compared. The proposed cascaded adaptive filter configuration, a forward-adaptive synthesizer followed by a backward-adaptive synthesizer, has been incorporated into a 3 kb/s Single-Pulse Excitation/Code-Excited Linear Prediction (SPE/CELP) coder to improve the speech quality while maintaining almost the same bit rate. Experimental results for the proposed SPE/CELP coder with backward adaptation show that the improvement in segmental SNR for voiced speech segments of several test sentences can reach 1.02-2.06 dB.

"Using a Prefix Code for Addressing the Voronoi
Constellations Based on Lattices *D*_{N} and

Signal constellations for representing data values for
transmission benefit from shaping of the constellation boundary. In the Voronoi
constellations, the Voronoi region of a lattice, denoted as the shaping lattice,
is used as the boundary of the signal constellation. In this work, some
properties of Voronoi constellations based on the shaping lattices *D*_{N}
and

"Shaping of Multi-Dimensional Signal
Constellations Using a Lookup Table", *Proc. IEEE Int. Conf.
Commun.* (Chicago, IL), pp. 927-931, June 1992.

Shaping concerns the selection of the boundary of a
signal constellation to reduce its average energy. Addressing is the assignment
of the data bits to the constellation points. A major concern of the shaping
regions is their addressing complexity. In this work, we use a lookup table for
addressing. The method is based on partitioning the two-dimensional
subconstellations into shaping shells of equal size and increasing average
energy. A lookup table is used to select a subset of the Cartesian product of
the partitions. This partitioning is compatible with a multidimensional trellis
coded modulation (TCM) scheme. As part of the calculations, we have found a
closed-form expression for the weight distribution of the half integer lattice *Z*^{N}+(1/2)

"Cochlear Discrimination: An Auditory
Information-Theoretic Distortion Measure for Speech Coders", *Proc.
Biennial Symp. Commun.* (Kingston, ON), pp. 419-423, May 1992.

In this paper, our objective is to devise a fidelity
criterion for quantifying the degree of distortion introduced by a speech coder.
Towards this end, both original speech and its coded versions are transformed
from the time-domain to a *perceptual-domain* using a cochlear model. This
perceptual-domain representation provides information pertaining to the
probability-of-firings in the neural channels. We introduce a cochlear
discrimination measure which compares these firing probabilities in an
information-theoretic sense. This measure, in essence, evaluates the
neural-firing cross-entropy of the coded speech with respect to that of the
original one. The performance of this objective measure is compared with
subjective evaluation results.

"Selection of the Focusing
Frequency in Wideband Array Processing - MUSIC and ESPRIT", *Proc.
Biennial Symp. Commun.* (Kingston, ON), pp. 410-414, May 1992.

Wide-band array processing using the Coherent Signal-subspace Method (CSM) is discussed. It is shown that an optimal focusing subspace exists that improves the performance of the estimation. An error based on the subspace fitting is introduced. This error criterion gives the closest focused signal subspaces. Direct maximization of the criterion is very involved and the computational complexity increases with the number of frequency samples. A sub-optimal method is introduced that operates very close to the optimal case. This method is based on deriving tight bounds on the error. The computational complexity of the sub-optimal method is independent of the number of frequency samples. The sub-optimal method approaches the optimal case as the number of frequency samples increases. It is shown that the bias of the estimation is reduced by proper selection of the focusing subspace.

"Signaling in Multi-Dimensional Signal
Spaces", *Proc. Biennial Symp. Commun.* (Kingston,
ON), pp. 296-299, May 1992.

In selecting the boundary of a signal constellation used for
data transmission, the objective is to minimize the average energy of the set
for a given number of points from a given packing. Reduction in the average
energy because of using a region *C* as the boundary instead of a hypercube
is called the shape gain of *C*. The price to be paid for shaping is: (i)
an increase in the factor CER (Constellation-Expansion-Ratio), (ii) an increase
in the factor PAR (Peak-to-Average-power-Ratio), and (iii) an increase in the
addressing complexity. The structure of a region which optimizes the tradeoff
between the shape gain and the CER, and also between the shape gain and the PAR
in a finite dimensional space is discussed. Examples of the optimum tradeoff
curves are given. The optimum shaping region is mapped to a hypercube truncated
with a simplex. This mapping has properties which facilitate the addressing of
the signal points. We discuss two addressing schemes with low complexity and
good performance. In spectral shaping, the rate of the constellation is
maximized subject to some constraints on its power spectrum. This results in a
shaping region which has different values of power along different dimensions
(unsymmetrical shaping). This spectral shaping also involves the selection of an
appropriate basis (modulating waveform) for the space. Finally, we discuss the
selection of a signal constellation for signaling over a partial-response
channel using both continuous approximation and discrete analysis. We also
present a closed-form formula for the weight distribution of the scaled *D*_{4}
and *E*_{8} lattices.
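To make the notion of shape gain concrete, the sketch below evaluates the standard continuous-approximation result for an *N*-dimensional sphere, which minimizes average energy for a given volume and has gain π(*N*+2) / (12 Γ(*N*/2+1)^{2/*N*}) over a hypercube of equal volume. This is a textbook reference case rather than the truncated-hypercube regions discussed in the paper, and the function name is illustrative.

```python
import math


def sphere_shape_gain_db(n):
    """Shape gain (dB) of an n-dimensional sphere over an n-cube of equal volume."""
    # Gamma(n/2 + 1) ** (2/n), computed through lgamma to avoid overflow for large n.
    gamma_term = math.exp((2.0 / n) * math.lgamma(n / 2.0 + 1.0))
    gain = math.pi * (n + 2) / (12.0 * gamma_term)
    return 10.0 * math.log10(gain)


# Limiting value as the dimensionality grows: pi * e / 6, about 1.53 dB.
ultimate_db = 10.0 * math.log10(math.pi * math.e / 6.0)

for n in (2, 4, 16, 32, 1024):
    print(n, round(sphere_shape_gain_db(n), 3))
print("limit", round(ultimate_db, 3))
```

The gain grows with dimensionality and approaches the limit slowly, which is why the tradeoff against CER, PAR, and addressing complexity matters at practical dimensionalities.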

"Optimized PAM Transmission over a Fading Multipath Channel",
*Proc. Biennial Symp. Commun.* (Kingston, ON), pp. 293-295, May
1992.

In a mobile communication system, the intersymbol interference (ISI) due to the multipath nature of the wave propagation has a serious effect on the performance. In this work, we study the structure of an optimized time-multiplexed Pulse Amplitude Modulation (PAM) system for transmission over such a channel. The ISI is modeled by an additive Gaussian noise. To reduce the effect of the ISI, a number of zeros are transmitted between successive time multiplexed impulses. By applying Quadrature Amplitude Modulation (QAM), we obtain two dimensions with identical statistics from each baseband time impulse. A coherent *M*-PSK signal constellation is employed over this two-dimensional space. The duration of the time impulses and the number of zeros transmitted between them are selected to minimize the probability of error between the constellation points. As the transmission of zeros reduces the bandwidth efficiency, this optimization procedure is more useful for lower bit rates. The performance of this scheme is compared with a PAM system without zero transmissions. The numerical results show substantial performance improvement without any increase in complexity.

Y. Qian, J. Liu, C. Feng, and P. Kabal

"Speech Coding Using an Enhanced
Sinusoidal Model at Low Bit-Rates", *Proc. Biennial Symp.
Commun.* (Kingston, ON), pp. 29-32, May 1992.

An enhanced sinusoidal model is presented. It employs the time-varying amplitudes of three components to track the fast dynamical variations during transition speech segments, and exploits the redundancies between near-neighborhood components to reduce the number of sinusoidal components to a maximum of 20 while retaining high synthesized quality. Many components can be determined by linear prediction of the dominant and fundamental components, thereby reducing the number of parameters to be transmitted and the corresponding bit rate. This approach improves the synthesized quality of the unvoiced and transition speech segments.

An optimal algorithm for extracting dominant frequencies by formants and pitches is compared with a DFT method. The effects on the synthesis quality of the number of the time-varying amplitudes and the different base functions are compared.

Two vector quantization codebooks with group classifications are developed to reduce the storage and computation load for a 4.8 kbits/s coder. Objective measurements give a cepstrum distance of 2.62 dB for several phonetically balanced sentences. Informal listening tests have shown that the proposed speech coder with an enhanced sinusoidal model can obtain good quality speech at 4.8 kbits/s.

"Wideband CELP Speech Coding at 12 kb/s",
*Proc. Biennial Symp. Commun.* (Kingston, ON), pp. 25-28, May
1992.

This paper investigates the use of CELP (Code Excited Linear Prediction) as a coding scheme for wideband speech at an operating bit rate of 12 kbits/sec. With the help of different parameter coding techniques, the bit rate was lowered from 16 kbits/sec [2] to 12 kbits/sec while maintaining a similar speech quality. Three encoding schemes were used to improve the performance of the wideband CELP coder. The first approach used a combination of a three-way split vector quantization and a new weighted distance measure for a set of line spectral frequencies (LSFs). The second approach used fractional pitch delays to improve the coder's performance for high pitched sounds. The third approach used perceptual noise weighting to improve coding in the high frequency region. The combination of all three schemes resulted in a substantial increase in speech quality at a lower bit rate (12 kbits/sec).

"Detection of the Number of Signals Using
Predictive Stochastic Complexity", *Proc. IEEE Int. Conf. Acoustics,
Speech, Signal Processing* (San Francisco, CA), pp. V-345-V-348, March 1992.

In this paper, we propose a new algorithm for the processing of signals by an array of sensors. The objective is to find the number and the Directions Of Arrival (DOA) of signals impinging on a linear array. The Predictive Stochastic Complexity (PSC) criterion of Rissanen is used to select the best model order. To reduce the computational load, the algorithm operates with a suboptimal estimator while maintaining the consistency of the estimator. The proposed method is on-line and can be utilized in time-varying systems for target tracking. The method can be used for both correlated and uncorrelated signals.

"Time-Scale Modification of Speech Using an
Incremental Time-Frequency Approach with Waveform Structure Compensation",
*Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing* (San
Francisco, CA), pp. I-81-I-84, March 1992.

This paper first tries to identify the primary sources of
distortion in a non-recursive *time-scale modification* (TSM) algorithm
which is based on the short-time Fourier transform (STFT) (Portnoff, [1]). A
simpler version of this TSM algorithm is then proposed for processing speech,
where *incremental* estimators eliminate the need for explicit linear
time-scaling operations. Also featured in the design is a *waveform structure
compensation* stage to prevent excessive deterioration of the rate-changed
output. A *polar* (i.e., magnitude-phase) synthesis equation is used for
increased efficiency. The new TSM method is capable of generating high-quality
rate-changed speech at a reasonable computational cost.

"Lattice-Based Nonuniform Vector
Quantization", *Proc. Conf. Information Sciences, Systems*
(Princeton, NJ), pp. 677-682, March 1992.

We propose some practical methods for applying a
lattice-based uniform vector quantizer to a nonuniform source. The first method,
denoted as cluster quantization, is based on the *k*-fold Cartesian
product of a one-dimensional compander in conjunction with a lattice quantizer.
This scheme has an asymptotic gain of 1.53 dB with respect to the optimum
one-dimensional quantizer. The complexity is essentially the complexity of
decoding of a lattice. The second method, denoted as quantizer shaping, is based
on selecting an appropriate boundary for a lattice quantizer. By increasing the
space dimensionality, this scheme becomes asymptotically optimum. As a practical
shaping method, we use the Voronoi region around the origin of a lattice to
shape the quantizer. By using binary lattices, we can construct quantizers with
an integral bit rate. In an extension of this scheme, we a lattice partition
chain *L*^{0}/.../*L*^{m}/

"Selection of the Focusing
Frequency in Wideband Array Processing", *Proc. Conf. Information Sciences,
Systems* (Princeton, NJ), p. 431, March 1992.

Wideband array systems can be decomposed into several narrowband systems by sampling in the frequency domain. Focusing is the combination of these narrowbands by transforming them into a focusing subspace. Corresponding to each focusing subspace there is a focusing frequency. So far, there has been no optimal way for choosing the focusing frequency - usually it is chosen to be the mid-band frequency. In this work we propose a technique to choose the focusing frequency. Our method is based on minimizing the subspace fitting error. The simulation results show that using the selected frequency for focusing improves the performance of the estimation by decreasing the resolution threshold and reducing the bias.

"Spectral Shaping with Unequal Power
Distribution", *Proc. Conf. Information Sciences, Systems*
(Princeton, NJ), pp. 294-299, March 1992.

We are going to maximize the entropy of a line code subject
to some constraints on the power spectrum. The general tools are the selection
of the constellation basis (modulating waveforms) and the power allocated to
each constellation dimension. In our analysis, the basis is fixed and is
selected to reduce the computational complexity of the modulation. The following
constraints on the power spectrum are considered in detail: (i) A fraction of
the total power equal to *F*_{p} is located in the frequency band
[0,
