2.2.2 Visual feedback controller
The visual feedback controller receives the detected
image feature of the object, itrg, and outputs the motor
command VF∆θ for the camera head to attend
to itrg. First, this controller calculates the object position
(xi, yi) in the camera image. Then, the motor
command VF∆θ is generated as
{}^{VF}\Delta\theta = \begin{bmatrix} \Delta\theta_{pan} \\ \Delta\theta_{tilt} \end{bmatrix} = g \begin{bmatrix} x_i - c_x \\ y_i - c_y \end{bmatrix},    (2)
where g is a scalar gain and (cx, cy) denotes the
center position of the image. The motor command
VF∆θ is sent to the gate as the output of the visual
feedback controller.
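As an informal illustration (not part of the original system), the following Python sketch computes the motor command of Eq. (2); the function name and the gain value are assumptions.

```python
import numpy as np

def visual_feedback_command(x_i, y_i, c_x, c_y, g=0.1):
    """Eq. (2): pan/tilt increments proportional to the offset of the
    object position (x_i, y_i) from the image center (c_x, c_y).
    The gain g = 0.1 is an assumed placeholder value."""
    return g * np.array([x_i - c_x, y_i - c_y])  # VF_dtheta = (d_pan, d_tilt)
```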
As described above, visual attention, which is one of
the robot's embedded mechanisms, is performed by
the salient feature detector and the visual feedback
controller.
2.2.3 Internal evaluator
The other embedded mechanism, learning with
self-evaluation, is realized by the internal evaluator
and the learning module.
The internal evaluator drives the learning mecha-
nism in the learning module when the following con-
dition is met:
\sqrt{(x_i - c_x)^2 + (y_i - c_y)^2} < d_{th},    (3)
where dth is a threshold for evaluating whether the
robot watches an object in the center of the camera
image or not. Note that the internal evaluator does
not know whether joint attention has succeeded or
failed, but only whether visual attention has been achieved.
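A minimal sketch of the self-evaluation condition in Eq. (3), assuming pixel coordinates; the threshold value d_th = 5.0 is an assumption.

```python
import math

def visual_attention_achieved(x_i, y_i, c_x, c_y, d_th=5.0):
    """Eq. (3): True when the object lies within d_th pixels of the image
    center, i.e. visual attention has been achieved; this triggers learning."""
    return math.hypot(x_i - c_x, y_i - c_y) < d_th
```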
2.2.4 Learning module
The learning module consists of a three-layered neu-
ral network. In the forward processing, this mod-
ule receives the image of the caregiver’s face and the
angle of the camera head θ as inputs, and outputs
LM∆θ as a motor command. The caregiver's face
image is required to estimate the motor command
LM∆θ to follow the caregiver’s gaze direction. The
angle of the camera head θ is utilized to move the
camera head incrementally because the caregiver’s
attention cannot be narrowed down to a particular
point along the line of the caregiver’s gaze. The gen-
erated motor command LM∆θ is sent to the gate as
the output of the learning module.
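The paper does not give the network dimensions, so the sketch below assumes a flattened face-image vector, a two-dimensional head angle θ, and a two-dimensional output LM∆θ; the hidden-layer size, the tanh activation, and the class name are placeholders.

```python
import numpy as np

class LearningModule:
    """Sketch of the three-layered network: (face image, theta) -> LM_dtheta.
    All layer sizes and the tanh activation are assumed details."""

    def __init__(self, n_image=64, n_hidden=20, seed=0):
        rng = np.random.default_rng(seed)
        n_in = n_image + 2                       # face-image features + (pan, tilt)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W2 = rng.normal(0.0, 0.1, (2, n_hidden))

    def forward(self, face_image, theta):
        self.x = np.concatenate([np.ravel(face_image), theta])
        self.h = np.tanh(self.W1 @ self.x)       # hidden-layer activation
        return self.W2 @ self.h                  # LM_dtheta = (d_pan, d_tilt)
```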
In the learning process, this module learns sen-
sorimotor coordination by back propagation when
it is triggered by the internal evaluator. As mentioned
above, the internal evaluator drives the learning
module according to the success of visual attention,
not joint attention; therefore, this module receives both correct and
incorrect learning data for joint attention. In the
former case, the learning module can acquire the ap-
propriate correlation between the inputs, the care-
giver’s face image and θ, and the output ∆θ. On
the other hand, in the latter case, this module can-
not find the appropriate sensorimotor coordination.
However, the learning module is expected to statistically
discard the incorrect data as outliers, as described
in 2.1, while the sensorimotor coordination
learned from the correct data survives in the learning module.
As a result, the surviving correlation in the learning
module allows the robot to realize joint attention.
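Continuing the hypothetical LearningModule sketch above, a single backpropagation step could use the executed motor command as the training target whenever the internal evaluator fires; the squared-error loss and learning rate are assumptions.

```python
    def backprop(self, target_dtheta, lr=0.01):
        """One learning step, called only when the internal evaluator signals
        that visual attention has been achieved (Eq. (3))."""
        err = self.W2 @ self.h - target_dtheta           # output error
        grad_W2 = np.outer(err, self.h)
        delta_h = (self.W2.T @ err) * (1.0 - self.h**2)  # backprop through tanh
        grad_W1 = np.outer(delta_h, self.x)
        self.W2 -= lr * grad_W2
        self.W1 -= lr * grad_W1
```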
2.2.5 Gate
The gate arbitrates the motor command ∆θ be-
tween VF∆θ from the visual feedback controller and
LM∆θ from the learning module. The gate uses a
gating function to define the selection rate of the
two outputs. At the beginning of learning, the selection
rate of VF∆θ is set to a high probability because
the learning module has not yet acquired the appropriate
sensorimotor coordination for joint attention.
On the other hand, in the later stage of learning,
the output LM∆θ from the learning module, which
has acquired the sensorimotor coordination for joint
attention, becomes more likely to be selected. As
a result, the robot can increase the proportion of
correct learning situations as learning progresses,
which allows the learning module to acquire
more appropriate sensorimotor coordination for joint
attention.
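One simple way to realize such a gating function (not necessarily the one used in the original work) is a selection probability for LM∆θ that increases with the learning step; the exponential schedule and the time constant tau are assumptions.

```python
import numpy as np

def gate_select(vf_dtheta, lm_dtheta, step, tau=1000.0):
    """Gate sketch: pick LM_dtheta with probability p_lm, otherwise VF_dtheta.
    p_lm grows from 0 toward 1 as learning proceeds (assumed schedule)."""
    p_lm = 1.0 - np.exp(-step / tau)             # selection rate of LM_dtheta
    return lm_dtheta if np.random.random() < p_lm else vf_dtheta
```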
2.3 Incremental learning
The proposed model is expected to enable the
robot to acquire the ability of joint attention through
the following incremental learning process.
stage I: At the beginning of learning, the robot
tends to attend to an interesting object
in its field of view, based on the embedded
mechanism of visual attention, since the
gate mainly selects VF∆θ. As shown at the top of Figure 4,
the robot outputs VF1∆θ or VF2∆θ, depending on the case,
and watches one object regardless of the
direction of the caregiver's attention. At the same
time, the robot begins to learn the sensorimotor
coordination in each case.
stage II: In the middle stage of learning, the
robot is able to realize joint attention, owing to
the learning in stage I, if the object that the caregiver
attends to is observed in the robot's initial
view. As shown at the middle left of Figure 4, the learning
module has acquired the sensorimotor coordination
of LM1∆θ because only that of VF1∆θ had
the correlation in stage I.