FASTER TRAINING IN NONLINEAR ICA USING MISEP



Submitted to ICA’2003

Luís B. Almeida

INESC-ID, R. Alves Redol, 9, 1000-029 Lisboa, Portugal

ABSTRACT

MISEP has been proposed as a generalization of the INFOMAX
method in two directions: (1) handling of nonlinear mixtures,
and (2) learning the nonlinearities to be used at the outputs,
making the method suitable for the separation of components
with a wide range of statistical distributions. In all
implementations up to now, MISEP has used multilayer
perceptrons (MLPs) to perform the nonlinear ICA operation.
Use of MLPs sometimes leads to relatively slow training. This
has been attributed, at least in part, to the non-local
character of the MLP’s units. This paper investigates the
possibility of using a network of radial basis function (RBF)
units for performing the nonlinear ICA operation. It shows
that the local character of the RBF network’s units allows a
significant speedup in the training of the system. The paper
gives a brief introduction to the basics of the MISEP method,
and presents experimental results showing the speed advantage
of using an RBF-based network to perform the ICA operation.

1. INTRODUCTION

Linear independent components analysis (ICA) is becoming a
well-researched area. Its nonlinear counterpart (nonlinear
ICA) is much less researched, but interest in this area has
been increasing, e.g. [1, 2, 3, 4, 5, 6]. In this paper we
deal with MISEP [7, 6, 8], a method for performing nonlinear
ICA which is an extension of INFOMAX.

MISEP extends the well-known INFOMAX method in two ways: (1)
it is able to perform nonlinear ICA, and (2) it uses adaptive
nonlinearities at the outputs. These nonlinearities are
intimately related to the statistical distributions of the
components, and the adaptivity allows the method to deal with
components with a wide range of distributions.

As originally proposed, MISEP could use any parameterized
network, linear or nonlinear, to perform the ICA operation.
However, all previous implementations have used multilayer
perceptrons (MLPs) to perform that operation. This has
sometimes resulted in relatively slow learning.

This work was partially supported by Praxis project P/EEI/14091/1998
and by the European IST project BLISS.

In this paper, after a brief introduction to MISEP, we discuss
the possible causes of this slowness, conjecturing that it is
due, at least in part, to the non-local character of the MLP’s
units. We test this conjecture by comparing systems based on
MLPs with systems based on radial basis function (RBF) units,
which have a local character. The experimental results confirm
the validity of this conjecture. They also show that, while
the MLP-based systems could usually perform a good separation
without the use of any explicit form of regularization, the
RBF-based ones do need an explicit regularization.
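
The contrast between non-local and local units can be pictured
with a small sketch. It is only an illustration under assumed
settings, not the experiment reported in this paper: the unit
definitions, the one-dimensional grid, and the 0.01 threshold
are all hypothetical choices. It measures the fraction of an
input range over which a sigmoidal unit and a Gaussian RBF unit
have a non-negligible output.

```python
import math

def sigmoid_unit(x, w=1.0, b=0.0):
    """Sigmoidal (MLP-style) unit; w and b are illustrative values."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def rbf_unit(x, center=0.0, width=1.0):
    """Gaussian radial basis function unit; center and width are illustrative."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def active_fraction(unit, grid, threshold=0.01):
    """Fraction of grid points where the unit's output exceeds the threshold."""
    return sum(1 for x in grid if unit(x) > threshold) / len(grid)

# 401 evenly spaced points covering the interval [-20, 20].
grid = [-20.0 + 0.1 * k for k in range(401)]

sigmoid_fraction = active_fraction(sigmoid_unit, grid)
rbf_fraction = active_fraction(rbf_unit, grid)

print(f"sigmoid unit active on {sigmoid_fraction:.2f} of the range")
print(f"RBF unit active on {rbf_fraction:.2f} of the range")
```

In this setting the sigmoidal unit is active over more than half
of the range, and that share grows as the range is widened,
whereas the RBF unit is active only in a bounded neighbourhood
of its center. Adjusting the outgoing weight of a sigmoidal unit
therefore changes the network output over a large region, while
the same adjustment for an RBF unit has only a local effect.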

The paper is organized as follows. Section 2 gives a brief
introduction to the MISEP method. Section 3 discusses the
causes of the slow learning that is sometimes observed.
Section 4 describes the alternative implementation based on
RBF units and presents experimental results, and Section 5
concludes.

2. THE MISEP METHOD

In this section we briefly summarize the MISEP method for
linear and nonlinear ICA. Given observation vectors o, drawn
from an unknown distribution, MISEP tries to find a
transformation y = F(o) (where o and y have the same dimension
n), such that the components of y are as independent as
possible, according to a mutual information criterion. The
mutual information of the components of y is defined as

$$I(y) = \sum_i H(y_i) - H(y), \qquad (1)$$

where H denotes Shannon’s entropy, for discrete variables, or
Shannon’s differential entropy,

$$H(y) = -\int p(y) \log p(y) \, dy, \qquad (2)$$

for continuous variables, p(.) denoting the probability
density of the random variable y. The mutual information I(y)
is non-negative, and is zero only if the components of y are
mutually statistically independent. It is known to be a good
independence criterion for ICA.
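
As a concrete check of the definition, the following minimal
sketch evaluates the mutual information for the discrete case,
using Shannon’s entropy with natural logarithms. The two joint
probability tables and the function names are hypothetical
examples chosen for illustration; the sketch does not show how
MISEP itself estimates or optimizes the criterion.

```python
import math

def entropy(probs):
    """Shannon entropy in nats of a discrete distribution (zero terms skipped)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mutual_information(joint):
    """I(y) = H(y_1) + H(y_2) - H(y) for a joint table joint[a][b] = P(y_1=a, y_2=b)."""
    marginal_1 = [sum(row) for row in joint]        # distribution of y_1
    marginal_2 = [sum(col) for col in zip(*joint)]  # distribution of y_2
    joint_flat = [p for row in joint for p in row]  # distribution of y = (y_1, y_2)
    return entropy(marginal_1) + entropy(marginal_2) - entropy(joint_flat)

# Two independent fair binary components: the joint is the product of the marginals.
independent = [[0.25, 0.25],
               [0.25, 0.25]]

# Two fully dependent binary components: y_2 always equals y_1.
dependent = [[0.5, 0.0],
             [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)

print(mi_independent, mi_dependent)
```

For the independent table the result is zero up to rounding
error, and for the fully dependent table it equals log 2, in
agreement with the statement that I(y) is non-negative and
vanishes only for independent components.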


