An Efficient Circulant MIMO Equalizer for CDMA Downlink: Algorithm and VLSI Architecture



EURASIP Journal on Applied Signal Processing


The direct inverse of the matrix is very expensive for hard-
ware implementation.

To reduce the computation complexity, an FFT-based fast
algorithm is presented in this section. It is known that a cir-
culant matrix
S can be diagonalized by the FFT operation
as
S = DHΛD, where D is the FFT phase coefficient ma-
trix and
Λ is a diagonal matrix whose diagonal elements are
the FFT result of the first column of the circulant matrix
S.
This known lemma is applied to simplify the MIMO equal-
izer computation dramatically. It is shown that the covari-
ance matrix
Rrr can be approximated by a block-circulant
matrix after we add two corner matrices as


0

'rr    R rr +    .

CL


0CLH
.

.. 0

00


(7)


where


/Eh [ L ] 0     0

CL =      .     ...     0

Eh [1] ■■■ Eh [ L ])


(8)


Using the extension of the diagonalization lemma and the
features of
Kronecker product, the block-circulant matrix can
be decomposed as


Crr = (Dh 0 I)( ɪ Wi 0 E[iɔ (D ® I),


(9)


where W = diag(1, W-1 ■ ■ ■ W- LF 1)) and WLF = e(2n/LF)
is the phase factor coe
fficient for the DFT computation.
denotes the
Kronecker product. By denoting


F=


( ɪ Wi 0 E[if),


(10)


it can be shown that the final MIMO equalizer taps are com-
puted as the following equation:

wopt (Dh 0 I) F-1 (D 0 I)hm.        (11)


F = diag(F[0], F[1], ..., F[LF]) is a block-diagonal matrix
with elements taken from the element-wise FFT of the first
column of the block-circular matrix
Crr. For an (M × N)
MIMO system, this reduces the inverse of an (
NLF × NLF)
matrix to the inverse of subblock matrices with size (
N × N).


3.3. System-on-chip (SoC) architecture partitioning

To achieve the real-time implementation, either DSP pro-
cessors or VLSI architectures could be applied. For exam-
ple, a multiple-processor architecture using TI’s DSP proces-
sors has been reported in [
21] for the 3G base station im-
plementation. However, the requirement for low power con-
sumption and compact size makes it di
fficult to use multiple
DSPs in a mobile handset to achieve the real-time processing


power for the chip-level physical layer design, especially for
the MIMO systems. SoC architecture is a major revolution
for integrated circuits due to the unprecedented levels of
integration and many advantages on the power consump-
tion and compact size. However, the straightforward imple-
mentation of the proposed equalizer has many redundancies
in computation. Many optimizations are needed to make it
more suitable for real-time implementation. We emphasize
the interaction between architecture, system partitioning,
and pipelining in this section with these objectives: (1) pro-
pose VLSI-oriented optimizations to further reduce the com-
putation complexity; (2) implement the equalizer with the
minimum hardware resource to meet the real-time require-
ment; (3) obtain an e
fficient architecture with optimal par-
allelism and pipelining for the critical computation parts. To
explore the e
fficient architectures, we elaborate the tasks as
the following procedure.

(1) Compute the independent correlation elements [E[0],
... , E[L]], and form the first block column of circu-
lant
Cr(1r) by adding the corner elements as Cr(1r) =
[E[0],...,E[L],0,...,0,EH [L],...,EH [1]]T. Each ele-
ment is an (
N × N)subblockmatrix.

(2) Take the element-wise FFT of C(r1r), where the element
vectors
Fn1,n2 = FFT{E(nc1),n2} and E(nc1),n2 (i) = Cr(1r)[(n1 -
i - 1) * N + n2 - 1] for i [0,LF], n 1,n2 [1,N].

(3) For m [1, M ]andn [1, N ], compute the
dimension-wise FFT of the channel estimation as
Φm = (D 0 I)hm = FFT([0, ... ,0, hm,n(L),... , hm,n(0),
0,
...,0]).

(4) Compute the inverse of the block-diagonal matrix F,
where
F-1 = diag(F[0]-1, ..., F[LF - 1]-1) and F[i] is
an (
N × N) submatrix formed from the ith subcarrier
of
Fn1,n2.

(5) Compute the matrix multiplication of the submatrices
inverse with the FFT output of channel estimation co-
e
fficients Ψm = F-1 Φm .

(6) Compute the dimension-wise IFFT of the multiplica-
tion results
wopt (Dh 0 I)Ψm.

With a timing- and data-dependency analysis, the top-
level block diagram for the MIMO equalizer is shown in
Figure 2. The system-level pipeline is designed for bet-
ter modularity. In the front end, a correlation estimation
block takes the multiple-input samples for each chip to
compute the covariance coe
fficients of the first column of
Rrr. It is made circulant by adding corner to form the
matrix [
E[0], ... , E[L], 0, ... , 0, E[L]H , ... , E[1]H ]. The com-
plete coe
fficients are written to DPRAMs and the (N × N)
element-wise FFT module computes [
F[0], ..., F[LF ]] =
FFT{E[0],...,E[L],0,...,0,E[L]H ,...,E[1]H }.

Another parallel data path is for the channel estimation
and the (
M × N) dimension-wise FFTs on the channel coef-
ficient vectors as in (
D 0 I)hm. A submatrix inverse and mul-
tiplication block takes the FFT coe
fficients of both the chan-
nel estimation and correlation estimation coe
fficients from
DPRAMs and carries out the computation as in
F-1Φm. Fi-
nally, an (
M × N) dimension-wise IFFT module generates
the results for the equalizer taps
wopt and sends them to the




More intriguing information

1. The Impact of Individual Investment Behavior for Retirement Welfare: Evidence from the United States and Germany
2. The effect of classroom diversity on tolerance and participation in England, Sweden and Germany
3. Global Excess Liquidity and House Prices - A VAR Analysis for OECD Countries
4. The name is absent
5. The name is absent
6. Examining the Regional Aspect of Foreign Direct Investment to Developing Countries
7. Behavior-Based Early Language Development on a Humanoid Robot
8. CONSUMER PERCEPTION ON ALTERNATIVE POULTRY
9. Valuing Farm Financial Information
10. ENERGY-RELATED INPUT DEMAND BY CROP PRODUCERS