FASTER TRAINING IN NONLINEAR ICA USING MISEP



Submitted to ICA’2003

Luís B. Almeida

INESC-ID, R. Alves Redol, 9, 1000-029 Lisboa, Portugal

ABSTRACT

MISEP has been proposed as a generalization of the INFOMAX
method in two directions: (1) handling of nonlinear mixtures,
and (2) learning the nonlinearities to be used at the outputs,
which makes the method suitable for separating components with
a wide range of statistical distributions. In all implementations
up to now, MISEP has used multilayer perceptrons (MLPs) to
perform the nonlinear ICA operation. The use of MLPs sometimes
leads to relatively slow training, which has been attributed,
at least in part, to the non-local character of the MLP’s units.
This paper investigates the possibility of using a network of
radial basis function (RBF) units to perform the nonlinear ICA
operation. It shows that the local character of the RBF network’s
units allows a significant speedup in the training of the system.
The paper gives a brief introduction to the basics of the MISEP
method and presents experimental results showing the speed
advantage of using an RBF-based network to perform the ICA
operation.

1. INTRODUCTION

Linear independent component analysis (ICA) is becoming
a well-researched area. Its nonlinear counterpart (nonlinear
ICA) has received much less attention, but interest in it has
been increasing (see, e.g., [1, 2, 3, 4, 5, 6]). In this paper
we deal with MISEP [7, 6, 8], a method for performing nonlinear
ICA that is an extension of INFOMAX.

MISEP extends the well-known INFOMAX method in
two ways: (1) it is able to perform nonlinear ICA, and (2)
it uses adaptive nonlinearities at the outputs. These
nonlinearities are intimately related to the statistical
distributions of the components, and their adaptivity allows
the method to deal with components with a wide range of
distributions.

As originally proposed, MISEP could use any parameterized
network, linear or nonlinear, to perform the ICA operation.
However, all previous implementations have used multilayer
perceptrons (MLPs) to perform that operation. This has
sometimes resulted in relatively slow learning.

This work was partially supported by Praxis project P/EEI/14091/1998
and by the European IST project BLISS.

In this paper, after a brief introduction to MISEP, we dis-
cuss the possible causes of this slowness, conjecturing that
it is due, at least in part, to the nonlocal character of the
MLP’s units. We test this conjecture by comparing systems
based on MLPs with systems based on radial basis function
(RBF) units, which have a local character. The experimen-
tal results confirm the validity of this conjecture. They also
show that, while the MLP-based systems could usually per-
form a good separation without the use of any explicit form
of regularization, the RBF-based ones do need an explicit
regularization.

The paper is organized as follows. Section 2 gives a
brief introduction to the MISEP method. Section 3 dis-
cusses the causes of the slow learning that is sometimes
observed. Section 4 describes the alternative implementation
based on RBF units and presents experimental results, and
Section 5 concludes.

2. THE MISEP METHOD

In this section we briefly summarize the MISEP method
for linear and nonlinear ICA. Given observation vectors o,
drawn from an unknown distribution, MISEP tries to find
a transformation y = F(o) (where o and y have the same
dimension n), such that the components of y are as independent
as possible, according to a mutual information criterion.
The mutual information of the components of y is defined as

\[ I(y) = \sum_i H(y_i) - H(y), \tag{1} \]

\[ H(y) = -\int p(y) \log p(y) \, dy, \tag{2} \]


where H denotes Shannon’s entropy, for discrete variables,
or Shannon’s differential entropy, for continuous variables,
p(·) denoting the probability density of the random variable y.
The mutual information I(y) is non-negative, and is zero only
if the components of y are mutually statistically independent.
It is known to be a good independence criterion for ICA.
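
As a small illustration of Eq. (1), the following Python sketch evaluates
the mutual information in the discrete case, where H is Shannon's entropy
and the integral of Eq. (2) becomes a sum over a joint probability table.
It is not part of MISEP itself, which works with continuous variables; the
function names (entropy, mutual_information) and the two example tables are
assumptions made only for this illustration.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # convention: 0 * log 0 = 0
    return float(-np.sum(p * np.log(p)))

def mutual_information(joint):
    """I(y) = sum_i H(y_i) - H(y), for a joint table with one axis per component."""
    joint = np.asarray(joint, dtype=float)
    sum_of_marginal_entropies = 0.0
    for axis in range(joint.ndim):
        # Marginal of component `axis`: sum the joint over all other axes.
        other_axes = tuple(a for a in range(joint.ndim) if a != axis)
        sum_of_marginal_entropies += entropy(joint.sum(axis=other_axes))
    return sum_of_marginal_entropies - entropy(joint)

# Hypothetical example 1: independent components (joint = product of marginals).
independent = np.outer([0.3, 0.7], [0.6, 0.4])

# Hypothetical example 2: fully dependent components (y_2 always equals y_1).
dependent = np.array([[0.5, 0.0],
                      [0.0, 0.5]])

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
print(mi_independent, mi_dependent)
```

In the first table the joint distribution factorizes, so I(y) vanishes up to
rounding error; in the second, knowing one component determines the other,
and I(y) equals log 2, the entropy of either component.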


