AARE2002: Paper CHO02101
given to the categories must be appropriate. The number of categories must also be appropriate
for the intended research.
It is clearly important that the classification of behaviour is performed as accurately as possible.
It is especially vital when different sets of behaviour are to be compared quantitatively,
such as, for example, when comparing two groups of subjects, or analysing the change in behaviour
of a single group of subjects over time. If there is a significant variation in the classification
of behaviour, the quantitative measures of behaviour will vary, even if the observed behaviour
remains the same; this is more likely to be a problem when different workers perform
the actual classification for different sets of data. Inter-observer consistency is not always easily
achieved - a great deal of time and effort can be expended on training observers
to classify consistently (Meltzoff, 1998). Even if a single observer
performs all of the classification in question, consistency over time is still vital. The importance
of consistency is widely recognised, and inter-observer agreement (or inter-observer reliability,
although strictly not a measure of reliability) is generally measured (Barlow & Hersen, 1984;
Meltzoff, 1998; Mitchell & Jolley, 2001; Whitley, 2002).
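Inter-observer agreement can be quantified in several ways, and the text does not say which measure is intended. As a minimal sketch, assuming two observers have coded the same sequence of behaviours, the following computes simple percent agreement and Cohen's kappa (a commonly used chance-corrected measure). The function names and category labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Fraction of observations that both observers coded identically."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty sequence")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(codes_a)
    p_observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: probability that both observers pick the same
    # category if each chose independently at their own observed rates.
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes assigned by two observers to four behaviours.
observer_1 = ["on-task", "off-task", "on-task", "on-task"]
observer_2 = ["on-task", "off-task", "off-task", "on-task"]
agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
```

Percent agreement is the easier figure to interpret, but it does not account for agreement that would occur by chance when one category dominates; kappa is one way to correct for that.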
The consistency achieved in classification, whether by a single observer or multiple observers,
is likely to depend on the method used for the act of classification. The literature on
how the researcher can decide into which category an observed behaviour falls is virtually non-
existent. Nevertheless, this is obviously an issue of no small importance - the reduction of raw
observation to quantitative data, and the analysis thereof, cannot proceed without it. The most
common method in use appears to be for the researcher to refer to a list of definitions of the categories.
Observation of this method in practical use shows that it is far from ideal. If the proper
category is not immediately obvious, then the definitions of all the plausible categories need
to be re-read, the behaviour re-observed, and so on, until a choice can be made. A great deal
of difficulty results from ambiguous behaviours that appear to fit multiple categories. How
can such behaviours be consistently classified? While these problems are usually minimised
if the same researcher who devised the classification scheme is the observer who quantifies
(“codes”) the observed behaviours, in practice, much of the coding will be performed by multiple
research assistants. Given the importance of inter-observer consistency, the need for a
simple and reliable method for classification that will maximise consistency is obvious.
We note that the problem of easy, accurate, and consistent classification is general and multi-
disciplinary - classification decisions are important in many fields (Payne & Preece, 1980). One
field where the problem of classification is critical is biological taxonomy. It must be possible
to classify organisms correctly, even for workers with little training in classification or experience
with organisms of the type in question. One of the standard tools designed to make this possible
is the binary key (or dichotomous key), an identification key where decisions are made one at
a time, and each question asked of the user of the key has only two possible answers. A simple
illustrative example of such a key is shown in figure 1.
The most important feature of identification keys such as the one shown in figure 1 is that
decisions are made one at a time. Each decision is much simpler than it would be if all of the required
decisions were grouped together and had to be made at once (as occurs, for example, when
classifying by referring to a list of categories). Therefore, each simple decision is faster and more
accurate, and as long as the number of decisions to be made is not too large, an identification
key can be faster to use than a list of categories. The simplification is especially important in
ambiguous cases - the classifier can concentrate on the single feature that divides the decision
path, rather than having to simultaneously consider all observable characteristics.
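The structure of a binary key can be sketched as a small decision tree in which every internal node asks one yes/no question and every leaf names a category. The example below is a minimal illustration under stated assumptions: the questions, the category names, and the dictionary-based observation record are all hypothetical and are not the key shown in figure 1.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    """End of a path through the key: the category assigned."""
    category: str


@dataclass
class Question:
    """One step of the key: a single question with two possible answers."""
    text: str
    test: Callable[[dict], bool]
    if_yes: "Node"
    if_no: "Node"


Node = Union[Leaf, Question]


def classify(key, observation):
    """Walk the key one decision at a time; return the category and the path taken."""
    path = []
    node = key
    while isinstance(node, Question):
        answer = node.test(observation)
        path.append((node.text, answer))
        node = node.if_yes if answer else node.if_no
    return node.category, path


# Hypothetical key for coding a student's behaviour (illustrative only).
KEY = Question(
    "Is the student speaking?",
    lambda obs: obs.get("speaking", False),
    if_yes=Question(
        "Is the speech directed at the teacher?",
        lambda obs: obs.get("to_teacher", False),
        if_yes=Leaf("talks to teacher"),
        if_no=Leaf("talks to peer"),
    ),
    if_no=Question(
        "Is the student writing?",
        lambda obs: obs.get("writing", False),
        if_yes=Leaf("writing"),
        if_no=Leaf("other"),
    ),
)

category, path = classify(KEY, {"speaking": True, "to_teacher": False})
```

Recording the path alongside the category is a design choice of this sketch: when two observers disagree, comparing their paths shows which single question they answered differently.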
These benefits of using binary keys for identification are not restricted to biological classifi-