3.2 Autonomous mental development (AMD)
An agent can perform one, multiple or an open num-
ber of tasks. The task here is not restricted by type,
scope, or level. Therefore, a task can be a subtask of
another. For example, making a turn at a corner or
navigating around a building can both be a task.
To enable an agent to perform certain tasks,
the traditional paradigm involves developing
task-specific architecture, representation, and skills
through human hands, which we call “manual”
development. The manual paradigm has two phases:
the manual development phase and the automatic
execution phase. In the first phase, a human devel-
oper H is given a specific task T to be performed
by the machine and a set of ecological conditions Ec
about the operational environment. The human
developer first understands the task. Next, he designs
a task-specific architecture and representation and
then programs the agent A. In mathematical notation,
we consider a human as a (time-varying) function
that maps the given task T and the set of
ecological conditions Ec to agent A:
A = H(Ec, T). (1)
In the automatic execution phase, the machine is
placed in the task-specific setting. It operates by
sensing and acting. It may learn, using sensory data
to change some of its internal parameters. However,
it is the human who understands the task and pro-
grams the internal representation. The agent just
runs the program.
Correspondingly, the autonomous development
paradigm has two different phases, the construction
and programming phase and the autonomous devel-
opment phase.
In the first phase, the tasks that the agent will end up
learning are unknown to the robot programmer. The
programmer might speculate about some possible tasks,
but writing a task-specific representation is not
possible without actually being given a task. The
ecological conditions under which the robot will operate,
e.g., land-based or undersea, are provided to the
human developer so that he can design the agent
body appropriately. He writes a task-nonspecific
program called a developmental program, which
controls the process of mental development. Thus the
newborn agent A(0) is a function of the set of ecological
conditions only, and not of the task:
A(0) = H(Ec), (2)
where we have added the time variable t to the
time-varying agent A(t), assuming that the birth time is at
t = 0.
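The difference between Eqs. (1) and (2) can be read as a difference in function signatures. The following is only a minimal sketch under stated assumptions, not the developmental program itself: the names `EcologicalConditions`, `Agent`, `manual_development`, and `construction_and_programming`, the `terrain` field, and the default `"idle"` action are all hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

Sensation = str
Action = str


@dataclass(frozen=True)
class EcologicalConditions:
    """Ec: facts about the operating environment (hypothetical field)."""
    terrain: str  # e.g. "land" or "undersea"


@dataclass
class Agent:
    """A(t): a body plus a sensation -> action mapping (illustrative only)."""
    body: str
    policy: Dict[Sensation, Action] = field(default_factory=dict)

    def act(self, sensation: Sensation) -> Action:
        # "idle" is an assumed default for sensations the agent knows nothing about.
        return self.policy.get(sensation, "idle")


def manual_development(ec: EcologicalConditions,
                       task_rules: Dict[Sensation, Action]) -> Agent:
    """Eq. (1), A = H(Ec, T): the developer hand-codes the task into the agent."""
    return Agent(body=f"{ec.terrain}-body", policy=dict(task_rules))


def construction_and_programming(ec: EcologicalConditions) -> Agent:
    """Eq. (2), A(0) = H(Ec): there is no task argument, so the newborn
    agent carries no task-specific knowledge."""
    return Agent(body=f"{ec.terrain}-body")


ec = EcologicalConditions(terrain="land")
manual = manual_development(ec, {"corner": "turn"})
newborn = construction_and_programming(ec)

print(manual.act("corner"))   # turn
print(newborn.act("corner"))  # idle
```

Both agents share the same body because both depend on Ec; only the manually developed agent already knows what to do at a corner.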
After the robot is turned on at time t = 0, the
robot is “born” and it starts to interact with the
physical environment in real time by continuously
sensing and acting. This phase is called the autonomous
development phase. Human teachers can affect the
developing robot only as a part of the environment,
through the robot’s sensors and effectors. After
birth, the internal representation is not accessible to
the human teacher.

[Figure 4 labels. Panel (a): task and ecological conditions
given to; manual development phase; place in the setting;
turn on; agent: sense, act, sense, act; automatic execution
phase; time. Panel (b): ecological conditions given to;
construction & programming; release; turn on; Task 1,
Task 2, ..., Task n; training; agent: sense, act, sense, act;
autonomous development; time.]
Figure 4: Manual development paradigm (a) and
autonomous development paradigm (b).
Various learning modes are available to the teacher
during autonomous development. He can use
supervised learning by directly manipulating (compliant)
robot effectors (see, e.g., Weng et al., 1999),
much as a teacher holds the hand of a child while
teaching him to write. He can use reinforcement
learning by letting the robot try on its own while
the teacher encourages or discourages certain
actions by pressing the “good” or “bad” buttons in
the right context (see, e.g., Weng et al., 2000a;
Zhang and Weng, 2001b). The environment itself
can also produce reward directly, for example
through a “sweet” object or a “bitter” one (see,
e.g., Almassy et al., 1998). With multiple tasks in
mind, the human teacher figures out which learning
mode is more suitable and efficient, and he typically
teaches one task at a time. Skills acquired early
are used later by the robot to facilitate learning new
tasks.
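The learning modes above can be illustrated with a toy agent whose teacher reaches it only through its sensors and effectors. This is a hedged sketch, not the method of the cited works: the class `DevelopingAgent`, its value table, the greedy action choice, and the numeric reward values are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class DevelopingAgent:
    """Toy task-nonspecific learner (hypothetical): one value per (sensation, action)."""

    def __init__(self, actions, seed=0):
        self.actions = list(actions)
        self._value = defaultdict(float)  # internal representation, never touched by the teacher
        self._rng = random.Random(seed)
        self._last = None

    def sense_and_act(self, sensation, imposed_action=None):
        """One sense-act cycle. `imposed_action` models the teacher
        moving a compliant effector (supervised learning)."""
        if imposed_action is not None:
            action = imposed_action
            self._value[(sensation, action)] += 1.0
        else:
            # Greedy choice among the highest-valued actions, ties broken at random.
            best = max(self._value[(sensation, a)] for a in self.actions)
            ties = [a for a in self.actions if self._value[(sensation, a)] == best]
            action = self._rng.choice(ties)
        self._last = (sensation, action)
        return action

    def sense_reward(self, reward):
        """Reward arrives through a sensor: a 'good'/'bad' button press
        or a 'sweet'/'bitter' object (reinforcement learning)."""
        if self._last is not None:
            self._value[self._last] += reward


agent = DevelopingAgent(actions=["left", "right", "stop"])

# Supervised mode: the teacher imposes the action once.
agent.sense_and_act("corner", imposed_action="left")

# Reinforcement mode: the robot tries on its own; the teacher presses a button.
for _ in range(20):
    tried = agent.sense_and_act("wall")
    agent.sense_reward(1.0 if tried == "stop" else -1.0)

corner_choice = agent.sense_and_act("corner")
wall_choice = agent.sense_and_act("wall")
print(corner_choice)  # left
print(wall_choice)    # stop
```

The same agent code serves both tasks; what differs is only how the teacher interacts with it, which is the sense in which the program is task-nonspecific.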
Fig. 4 illustrates the traditional manual develop-
ment paradigm and the autonomous development
paradigm.