Hindawi Publishing Corporation
EURASIP Journal on Applied Signal Processing
Volume 2006, Article ID 57134, Pages 1-18
DOI 10.1155/ASP/2006/57134
An Efficient Circulant MIMO Equalizer for CDMA Downlink:
Algorithm and VLSI Architecture
Yuanbin Guo,1 Jianzhong (Charlie) Zhang,1 Dennis McCain,1 and Joseph R. Cavallaro2
1 Nokia Research Center, 6000 Connections Drive, Irving, TX 75039, USA
2 Department of Electrical and Computer Engineering, George R. Brown School of Engineering,
Rice University, 6100 Main Street, Houston, TX 77005, USA
Received 29 November 2004; Revised 5 June 2005; Accepted 14 June 2005
We present an efficient circulant approximation-based MIMO equalizer architecture for the CDMA downlink. The approximation
reduces the direct matrix inverse (DMI) of size (NF × NF), with $O((NF)^3)$ complexity, to FFT operations with $O(NF\log_2(F))$
complexity plus the inverse of (N × N) submatrices. We then propose parallel and pipelined VLSI architectures with Hermitian
optimization and a reduced-state FFT to lower the complexity further. Generic VLSI architectures for the (4 × 4) high-order
receiver are derived from partitioned (2 × 2) submatrices, which leads to a more parallel VLSI design with a further 3× complexity
reduction. A comparative study with both the conjugate-gradient and DMI algorithms shows a very promising
performance/complexity tradeoff. The VLSI design space is explored extensively in terms of area/time efficiency for layered
parallelism and pipelining with a Catapult C high-level-synthesis methodology.
Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.
1. INTRODUCTION
Wireless communication is advancing rapidly to support
broadband multimedia services and ubiquitous networking
via mobile devices. MIMO (multiple-input multiple-output)
technology [1-3], which uses multiple antennas at both the
transmitter and the receiver, has emerged as one of the most
significant technical breakthroughs for throughput
enhancement. At the same time, UMTS [4] and CDMA2000
extensions optimized for data services have led to the
standardization of multicode CDMA systems such as
high-speed downlink packet access (HSDPA) and its
equivalent, the 1X evolution data and voice/data optimized
(EV-DV/DO) standards [5]. The resulting capacity
requirement is asymmetric: the downlink plays an even more
essential role than the uplink because of download-oriented
services. The application of MIMO technology to the CDMA
downlink is therefore receiving increasing interest as a
strong candidate for 3G and beyond wireless communication
systems.
The original MIMO spatial multiplexing, known as D-BLAST
[3], and V-BLAST [2], a more realistic strategy for
real-time implementation, were proposed for narrowband,
flat-fading channels. In a multipath fading channel, the
orthogonality of the spreading codes is destroyed. This
introduces both multiple-access interference (MAI) and
intersymbol interference (ISI). The conventional Rake
receiver [6] cannot provide acceptable performance because
of the very short spreading gain used to support high-rate
data services in the multicode CDMA downlink. The LMMSE
(linear minimum mean-squared error)-based chip equalizer is
a promising way to restore the orthogonality of the
spreading codes and suppress both the ISI and MAI [6] in
single-antenna systems. For MIMO systems, however, it
involves the inverse of a large covariance matrix with
$O((NF)^3)$ complexity, where N is the number of receive
antennas and F is the channel length. Hardware
implementation of the equalizer has traditionally been one
of the most complex tasks in receiver design. The MIMO
extension poses even greater challenges for real-time
hardware implementation [7], especially in the mobile
receiver.
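The circulant approximation named in the abstract relies on a standard fact: a circulant matrix is diagonalized by the DFT, so a circulant system can be solved with transforms and element-wise divisions instead of a general matrix inverse. The sketch below illustrates this for a scalar circulant system only; it is not the block algorithm with (N × N) submatrices developed in this paper. The function names and the numeric values are hypothetical, and a naive DFT stands in for the FFT so that the example needs only the standard library.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) DFT, used here only for self-containment.

    A practical design would use an FFT to reach O(n log n).
    """
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def circulant_solve(first_col, b):
    """Solve C w = b, where C is circulant with the given first column.

    Assumption: every DFT coefficient of first_col is nonzero,
    i.e. C is invertible.
    """
    eig = dft(first_col)  # eigenvalues of C
    b_hat = dft(b)
    w_hat = [bk / ek for bk, ek in zip(b_hat, eig)]
    return dft(w_hat, inverse=True)


def circulant_matvec(first_col, w):
    """Multiply the circulant matrix C[i][j] = first_col[(i - j) % n] by w."""
    n = len(first_col)
    return [
        sum(first_col[(i - j) % n] * w[j] for j in range(n))
        for i in range(n)
    ]


if __name__ == "__main__":
    c = [4, 1, 0, 1]  # illustrative first column, not channel data
    b = [1, 2, 3, 4]
    w = circulant_solve(c, b)
    # Multiplying back should reproduce b up to rounding error.
    print(circulant_matvec(c, w))
```

The only per-frequency work is a scalar division. In a block-circulant MIMO setting each division would become a small matrix solve, which is consistent with the abstract's mention of (N × N) submatrix inverses.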
To avoid the direct matrix inverse (DMI), adaptive
algorithms such as the least-mean-square (LMS) algorithm
have been studied. However, they suffer from stability
problems because convergence depends on the choice of a
good step size [8]. Nonadaptive block-based algorithms such
as the Levinson and Schur algorithms [9, 10], on the other
hand, reduce the complexity to the order of $O((NF)^2)$. An
iterative conjugate gradient (CG) tap solver with similar
complexity was proposed in [11, 12]. However, this
quadratic complexity is still very high for efficient
real-time implementation. Because the downlink receiver
must be embedded in a low-cost portable device, designing a
low-complexity equalizer is challenging but essential for
widespread commercial deployment.
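As a reference point for the iterative alternative mentioned above, the following is a minimal sketch of the textbook conjugate-gradient iteration for a Hermitian positive-definite system A w = b. It is not the specific tap solver of [11, 12]; the function names, the dense list-of-lists matrix format, and the small test system are assumptions made for illustration. Each iteration is dominated by one matrix-vector product, which is quadratic in the matrix dimension for a dense matrix.

```python
def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def inner(x, y):
    """Hermitian inner product x^H y."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


def conjugate_gradient(A, b, iterations=None, tol=1e-12):
    """Textbook CG for Hermitian positive-definite A (assumed, not checked)."""
    n = len(b)
    iterations = n if iterations is None else iterations
    w = [0j] * n
    r = [complex(bi) for bi in b]  # residual b - A w for the start w = 0
    p = list(r)                    # initial search direction
    rs_old = inner(r, r).real
    for _ in range(iterations):
        if rs_old < tol:           # squared residual norm small enough
            break
        Ap = matvec(A, p)
        alpha = rs_old / inner(p, Ap).real
        w = [wi + alpha * pi for wi, pi in zip(w, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = inner(r, r).real
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return w


if __name__ == "__main__":
    A = [[4, 1], [1, 3]]  # illustrative Hermitian positive-definite matrix
    b = [1, 2]
    print(conjugate_gradient(A, b))
```

In exact arithmetic CG terminates in at most n iterations for an n × n system, so the default iteration count is set to n. A fixed, smaller iteration count trades accuracy for a predictable workload, which is one reason iterative solvers are attractive for hardware.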