Human computer interaction, or HCI, is concerned with the
way in which people communicate with computers, and the goals of HCI are to
produce usable, functional, reliable, and safe systems. When you choose
between two alternative brands of software (e.g., WordPerfect and Word for
Windows), you often make your choice on the basis of how easy it is to use
the software, rather than its power or efficiency. The way in which we
conduct a dialog with a computer is every bit as important as the hardware
itself.
Much of the glamour in computing has been grabbed by
advances in computer hardware and software. Hardware has been getting
faster, cheaper, and more powerful year by year. Scarcely a month goes by
without the release of a major upgrade to an operating system, word
processor, spreadsheet, database, or compiler. HCI is the Cinderella of
computing and often has had to wait at home while the hardware and software
go to the ball. Today, people are paying more attention to human computer
interaction because it determines how efficiently and how reliably we can
use a computer. In this chapter we are going to introduce some of the
fundamental concepts of HCI and describe the ingredients of an effective
interface.
Few of the elements of computer science involve a single
discipline; for example, the operating system is the point at which hardware
and software meet. HCI is probably the most multidisciplinary element of
computer science and is the point at which hardware, software
engineering and human psychology all meet. It’s impossible to
design an effective interface between a human being and a computer without
taking account of the human’s characteristics. For example, there’s little
point in displaying text on a screen faster than you can read it. If an
image is animated, the motion must be swift enough to avoid boredom and slow
enough to avoid jerkiness. Similarly, if the message from the computer is
too complex and detailed, the human operator will neither be able to
understand it nor remember it.
The Importance of HCI
Why should we be so concerned about human computer
interaction? Computers, or any other machines, do exactly what we
tell them. It is therefore vital that the communication between the human
being and the machine be both efficient and unambiguous. I can
think of few interfaces worse than that between the human being and a
typical video recorder (VCR). When I go on vacation it takes me ages to
program my VCR—I have to step through the channels, days, hours and minutes
(for both program on times and program off times) for each of the eight
programs I wish to record. Why can’t I enter the name of the program I want
to watch from a keyboard? Hasn’t the VCR learned by now what programs
I like?
Consider another example of a poorly designed
human-machine interface. On 5 July 1970 a DC-8 aircraft was approaching
Toronto International Airport with over 100 passengers and crew on board. At
60 feet above the runway the co-pilot moved the spoiler control lever
in readiness for the landing. Spoilers are large flat metal plates hinged to
the upper surface of an aircraft’s wings. When spoilers are deployed
immediately after landing, they swing up into the airstream and destroy the
wings’ lift. Spoilers ensure that the aircraft’s transition from flying to
taxiing is clean. The spoiler control in the DC-8 has two active
positions—if you lift the lever the spoilers are armed and deployed
automatically after the aircraft touches down; if you pull the lever
the spoilers are deployed immediately.
At 60 feet above the runway the spoiler control lever in
the DC-8 was pulled. The spoilers were deployed and the wings
immediately lost their lift. The passengers and crew lost their lives. The
initial response to the incident was to suggest that a placard be placed
alongside the spoiler lever with the caption Deployment in Flight
Prohibited. Perhaps they should have written Please do not crash this
aircraft. It took two more tragedies before the spoiler control was
modified to prevent inadvertent operation in flight.
Although this dramatic example doesn’t involve computers,
it demonstrates what can happen when the designer creates an interface that
doesn’t take account of how a real user might operate it under all
circumstances.
Poor interface design doesn’t necessarily cause errors as
such. But what price frustration? I once used a desktop publishing
package to produce a book. This package could easily modify the font of
all the text in a whole paragraph, but it couldn’t easily change the
font of a single word in a paragraph. Because I was using a special
font for any assembly language words embedded in a paragraph, I had to go
through three pull-down menus to make each and every font change—and there
were thousands. A later version of this software has solved this problem.
Jakob Nielsen (Iterative User-Interface Design, Computer, November 1993)
describes the construction of a home banking system that yielded a 242%
improvement when the user interface was tested and improved over three
versions.
A heartfelt comment on the user interface is made by
Baecker and Buxton in their introduction to Readings in Human-Computer
Interaction:
"Yet despite its importance, the user interface is
one of the most poorly understood aspects of any system. Its success or
failure is determined by a complex range of poorly understood and subtly
interrelated issues, including whether the system is congenial or
hostile, easy or difficult to learn, easy or difficult to use,
responsive or sluggish, forgiving or intolerant of human error".
We are going to look at both the physical and the
logical interface. The physical interface encompasses the hardware
used in the dialog between the human and the computer. The logical interface
describes the way in which a user employs the physical interface to engage
in a dialog with the machine; for example, a programmer might use a mouse
and a display (the physical interface) to select items from a menu on the
screen (the logical interface). A study of the way in which computers and
humans interact helps user interface designers to create systems that make
best use of the available technology, enhance the user’s productivity, and
reduce the number of errors they make. In short, the goal of HCI is to
increase the usability of a computer and its software.
The Physical Interface
Humans communicate with each other, principally, by
auditory and visual stimuli; that is, we speak, gesticulate, write to each
other, and use pictures. You would therefore expect humans and computers to
communicate in a similar way. Computers are fairly good at communicating
with people; they can generate sophisticated images, although they are
rather less good at synthesizing natural-sounding speech. Unfortunately,
computers cannot yet receive visual or sound input directly from people.
Hardware and software capable of reliably understanding speech or
recognizing visual input does not yet exist—there are systems that can
handle speech input and systems that can recognize handwriting, but the
error rate is still too large for general-purpose use. Consequently, people
communicate with computers in a different way than they communicate with
other people. We communicate with computers largely by means of the keyboard
and pointing devices like the mouse.
The Keyboard
The keyboard is still the most commonly used means of
getting data into a computer. Sometimes, the keyboard is an efficient
interface—especially when you’re banging in text with two or more fingers.
Sometimes the keyboard is a positively horrible interface—like when you are
trying to use it to control a flight simulator. In this section we are going
to look at the characteristics of the keyboard as a computer input device.
Consider the layout of the ubiquitous QWERTY
keyboard. The term QWERTY isn’t an acronym but the first six letters on
the top row of letters on a keyboard. When the first mechanical
typewriters were constructed, the sequence of letters was chosen to reduce
the probability of letters jamming. For example, if t and h
were next to each other, typing the would sometimes cause the letters
t and h to collide and jam. As you can imagine, the
anti-jamming property of the QWERTY keyboard is optimum only for the English
language.

Layout of the QWERTY keyboard
Now that keyboards are electronic and have no moving
parts except the keys themselves, there is no longer a need for a QWERTY
layout. A better layout would make it easier to type English by reducing the
distance a typist’s fingers have to move on average. The Dvorak
keyboard was developed in the 1930s to make it easier to type English; it is
also biased towards right-handed typists. Studies demonstrate that a typist
can achieve a 10 to 15% improvement when using a Dvorak keyboard.
Unfortunately, so many typists and programmers have been trained on the
QWERTY keyboard that it would be very time-consuming to retrain them to use
a new layout. The Dvorak keyboard has therefore failed to topple the QWERTY
standard.
Some systems designed for infrequent computer users and
non-typists have a simple ABCDE keyboard in which the keys are laid out in
alphabetic order—this keyboard makes it easy for users to locate keys, but
prevents experienced users entering data rapidly (because they will have
been trained on a QWERTY layout). As you can see, there is sometimes a
trade-off between the needs of the experienced user and the inexperienced
user.
A radically different form of keyboard is the chord
keyboard that has only a few (typically 4 or 5) keys. You enter a letter by
hitting a subgroup of keys simultaneously; it’s rather like using Morse code
or Braille. The chord keyboard is very small indeed and can be used with one
hand. Chord keyboards have found a niche market for people who operate in
cramped spaces or for those who want a pocket-sized device that they can use
to make notes as they move about.
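A quick count shows why so few keys are enough. The sketch below assumes a five-key chord keyboard in which every non-empty subset of keys is a valid chord; the assignment of chords to letters is purely hypothetical and is not the mapping used by any real chord keyboard.

```python
from itertools import combinations
import string

KEYS = 5  # assumed number of chord keys


def all_chords(n_keys):
    """Return every non-empty subset of keys as a tuple of key indices."""
    chords = []
    for size in range(1, n_keys + 1):
        chords.extend(combinations(range(n_keys), size))
    return chords


chords = all_chords(KEYS)

# Hypothetical assignment: the first 26 chords stand for a..z.
chord_to_letter = dict(zip(chords, string.ascii_lowercase))
letter_to_chord = {letter: chord for chord, letter in chord_to_letter.items()}

print(len(chords))           # 31 chords from 5 keys (2**5 - 1)
print(letter_to_chord["a"])  # (0,)
```

Five keys give 31 chords, enough for the alphabet with a few left over; four keys give only 15, so a four-key design would need some extra mechanism such as a mode switch.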
Special Purpose Keys
In order to provide the total number of keys necessary
for efficient computer operation a keyboard would have to be gigantic. In
practice, most keys have a multiple function; that is, the meaning of a
given key can be modified by pressing another key at the same time. Letters
are lower case by default; holding down the shift key while pressing a letter
selects the upper case character. The shift
key also selects between pairs of symbols that share a key (e.g., : and ;, @
and ‘, + and =, etc.) and between numbers and symbols (e.g., 4 and $, 5 and
%, 8 and *, etc.).
Although the layout of the letters and numbers on a
QWERTY keyboard is standard throughout the English-speaking world, the
layout of other keys (e.g., symbols) is not—in particular, there is a
difference between keyboards designed for use in the USA and those designed
for use in the UK. Consequently, software has to be configured for the
specific version of the keyboard currently in use.
Modern computer keyboards also include a control, Ctrl,
key that behaves like a shift key. The control key gives a key a different
meaning when control is pressed at the same time. Computer books indicate
the act of pressing the control key and, say, the letter D at the same time
by the notation CTRL-D.
Why do we need all these special keys? When we
communicate with a computer we need to provide it with two types of entries.
One is the information or data the computer is going to process (e.g., the
text entered into a word processor, or a booking entered into an airline’s
database). The other type of information entered into a computer is the
commands that you want it to execute. Suppose that you are entering text
into a word processor and wish to save the file. You can’t simply type
Save file because the computer cannot distinguish between the command
you want to carry out and the words you are entering into the document. By
typing, for example, CTRL-S, you unambiguously are telling the computer that
you are entering the command to save a file.
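The distinction between data and commands can be sketched in a few lines. This is only an illustration: the Editor class and its key_event method are hypothetical names, and a real word processor receives key events from the operating system rather than from a loop like this one.

```python
class Editor:
    """Toy editor: plain keys are data, Ctrl+key combinations are commands."""

    def __init__(self):
        self.text = []     # characters typed into the document
        self.saved = None  # last saved snapshot (None until the first save)

    def key_event(self, key, ctrl=False):
        if ctrl and key.lower() == "s":
            # Command: CTRL-S saves the document.
            self.saved = "".join(self.text)
        elif not ctrl:
            # Data: an ordinary keystroke goes into the document.
            self.text.append(key)
        # Other control combinations are ignored in this sketch.


ed = Editor()
for ch in "Save file":
    ed.key_event(ch)          # these keystrokes are just text
ed.key_event("s", ctrl=True)  # this one is a command
```

Typing the words Save file only adds nine characters to the document; it is the control key that tells the program the keystroke is an instruction.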
PCs go one step further and provide an alt
(alternative) key to give yet another set of meanings to the keys.
Consequently, you can enter a key unshifted, with shift, control, alternate,
or any combination of the three function-modifier keys. Computer programs
that make extensive use of function control keys to enter commands often
provide the user with a plastic template that is attached to the
keyboard to remind him or her of the meaning of the various control character
combinations.
In addition to the use of the shift, control, and
alternative keys, the PC keyboard contains 12 special-purpose function
keys labeled F1 to F12 that can be used to perform special functions.
Moreover, these functions are modified by the shift, control, and alternate
keys. Finally, keyboards have several dedicated keys like home,
end, PgDn, PgUp, Del, Ins, and so on.
The use of the shift, control, alternate, and function
keys makes it much easier to communicate with a computer. All you have to
do is to remember the special codes; there can’t be more than a few
hundred of them... Moreover, the way in which keys are assigned to
control functions differs from one application to another. When applications
writers adhere to a common control key usage (e.g., function key F1 is
normally used to invoke a program’s Help function), the user finds it
easy to switch between applications from different manufacturers. When an
applications programmer assigns control keys in an idiosyncratic way, it
becomes very difficult to switch applications because you keep hitting the
wrong key.
Computer displays invariably have a cursor—a
marker on the screen indicating the currently active position; that
is, if you enter a character, it will appear at the position indicated by
the cursor. Cursors can be vertical or horizontal lines, small blocks,
highlighted text, or even reversed text (i.e., white-on-black). Modern
applications frequently make use of several different types of cursor; for
example, a solid line indicates where text can be entered, an arrow points
at a command, and a cross indicates the edge of a picture or a table. Nearly
all cursors blink because human vision can more easily detect a change in a
static picture. Computer keyboards also contain four cursor control
keys. These keys move the cursor on the screen up, down, left, or right, by
one unit—either a character position horizontally or a line position
vertically. These keys can also be used as a crude type of joystick or mouse
for systems that require a cursor to be moved anywhere within a certain
area.
There are many ways of designing a keyboard and several
technologies can be used to detect a keystroke (e.g., mechanical, magnetic,
capacitive, etc). The difference between keyboards is often a matter of cost
and personal preference—some typists prefer to hear a satisfying click when
they depress a key, others don’t. Important keys like enter, shift, control,
and space are often made larger than other keys to make it easy to hit them.
If you are a real sadist, keyboard design is just for you: you can guarantee
a maximum level of user misery by locating a key that has a potentially
destructive function (e.g., delete text) next to a normal key such as the
space bar. Good practice would ensure that it is difficult to enter a
potentially fatal command by accident. Consider the following two examples
of safe operation: you cannot start a VCR recording without pressing two
buttons simultaneously, and the master engine switches in some aircraft are
under a bar to ensure that you cannot switch an engine off accidentally.
Pointing Devices
Although the keyboard is an excellent device for
inputting text, it cannot be used efficiently as a pointing device,
to select an arbitrary point on the screen. The
three most popular pointing devices are the joystick, the mouse, and the
trackball.

Pointing devices
The Joystick
The joystick is so called because it mimics the
joystick used to control military aircraft and some light aircraft. This
device consists of a stick that can be moved simultaneously in a left-right
and front-back direction. The computer reads the position of the stick and
uses it to move a cursor on the screen in sympathy. You don’t look at the
joystick when moving it; you look at the cursor on the screen. Without this
visual feedback between the hand and the eye, people would not be able to
use this, or similar, pointing devices. Joysticks contain one or more
buttons that can be used to enter commands. The joystick is well suited to
computer games, particularly aircraft, because it mimics the behavior of a
real control column.
Although the joystick is similar to the mouse and
trackball, there is one difference. When the mouse and trackball are not
being moved, there is no signal from them and the computer unambiguously
interprets this as no input. However, the joystick continually transmits a
position, which means that it is very difficult to centralize or neutralize
its output. Consequently, joysticks often have a dead zone around
their neutral position. Until you move the joystick out of the dead zone,
the cursor on the screen doesn’t move.
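A dead zone is easy to express in code. The sketch below assumes the stick position arrives as two numbers between -1 and 1, that the dead zone is applied to each axis separately, and that it covers 10% of the travel; all three are illustrative assumptions rather than properties of any particular joystick.

```python
def apply_dead_zone(x, y, dead_zone=0.1):
    """Map a raw stick position in [-1, 1] on each axis to an output.

    Readings within dead_zone of the centre are treated as zero, so a
    stick that does not return exactly to neutral leaves the cursor still.
    """
    def axis(v):
        if abs(v) <= dead_zone:
            return 0.0
        # Rescale so the output rises smoothly from 0 at the edge of the
        # dead zone to 1 at full deflection.
        sign = 1.0 if v > 0 else -1.0
        return sign * (abs(v) - dead_zone) / (1.0 - dead_zone)

    return axis(x), axis(y)


print(apply_dead_zone(0.05, -0.03))  # (0.0, 0.0): inside the dead zone
```

Rescaling outside the dead zone is a design choice: without it the output would jump from zero to 0.1 as the stick leaves the zone.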
The Mouse
The mouse is probably the most popular pointing device in
this group. A mechanical mouse consists of a housing and a ball that
rotates in contact with the surface of a desk—you could say that a mouse is
a ball-point pen on a large scale. As the mouse is dragged along the desk,
the ball rotates and circuitry in the mouse translates the movement of the
ball into a signal that can be read by the computer. An electronic mouse is
used on a special pad that has a grid of horizontal and vertical lines. The
mouse reflects a light beam off the grid and counts the lines crossed as the
mouse moves about the pad.
When the computer gets a signal from the mouse, it is
scaled and used to move a cursor within the screen (exactly like the other
pointing devices in this group). When the software needed to control a mouse
is installed, the user can choose the mouse’s sensitivity; that is, how much
the cursor on the screen moves for a given movement of the mouse. The
sensitivity chosen is a function of the user’s hand-eye coordination.
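The scaling step can be sketched as follows. The function name, the 640 by 480 screen, and the idea that the mouse reports movement as a pair of counts are assumptions made for illustration; a real mouse driver is more elaborate.

```python
def move_cursor(cursor, delta_counts, sensitivity, screen=(640, 480)):
    """Scale raw mouse counts by the sensitivity and keep the cursor on screen.

    cursor       -- current (x, y) position in pixels
    delta_counts -- (dx, dy) movement reported by the mouse, in counts
    sensitivity  -- pixels of cursor movement per count (chosen by the user)
    screen       -- assumed screen size in pixels (width, height)
    """
    x = cursor[0] + delta_counts[0] * sensitivity
    y = cursor[1] + delta_counts[1] * sensitivity
    # Clamp so the cursor stops at the edge of the screen.
    x = min(max(x, 0), screen[0] - 1)
    y = min(max(y, 0), screen[1] - 1)
    return (x, y)


print(move_cursor((100, 100), (10, -5), sensitivity=2.0))  # (120.0, 90.0)
```

A high sensitivity lets a small hand movement cross the whole screen but makes fine positioning harder, which is why the choice is left to the user.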
A modern mouse is comfortable to hold and can be used to
move the cursor rapidly to any point on the screen. Once the mouse is at the
correct point, you depress one of two (or possibly three) buttons that fit
naturally under your fingers as you move the mouse. Pressing a button
activates some pre-defined application-dependent function on the screen.
Typical mouse-based systems require you to click the button once to
select an application (i.e., highlight it), and twice to launch an
application (i.e., run it). Clicking a button twice in this way is called
double-clicking and is not always easy to perform because the interval
between the clicks must fall within a given range.
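The timing rule behind double-clicking can be sketched like this. The class name and the limits of 0.05 and 0.5 seconds are assumptions chosen for illustration; real systems let the user adjust the upper limit.

```python
class ClickClassifier:
    """Decide whether each click is a single click or completes a double-click."""

    def __init__(self, min_gap=0.05, max_gap=0.5):
        self.min_gap = min_gap  # shorter gaps are treated as switch bounce
        self.max_gap = max_gap  # longer gaps are two separate single clicks
        self.last_time = None   # time of the previous unpaired click

    def click(self, t):
        """t is the time of the click in seconds."""
        if self.last_time is not None:
            gap = t - self.last_time
            if self.min_gap <= gap <= self.max_gap:
                self.last_time = None  # the pair has been used up
                return "double"
        self.last_time = t
        return "single"


c = ClickClassifier()
print(c.click(0.0))  # single
print(c.click(0.3))  # double
```

A user whose second click arrives a little too late gets two single clicks, which selects the item twice instead of launching it; that is exactly the difficulty described above.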
The Trackball
A trackball is an upside-down mouse—it remains stationary
on the desk and you rotate a 2" to 6" ball to move the cursor on the screen.
Unlike the mouse, the trackball requires no desk space and can be fitted on
the keyboard of a lap-top portable. Trackballs are often built into
electronic equipment that requires an operator to select a point on a screen
(e.g., a target on a radar screen). Some computer users prefer the trackball
to the mouse.
The trackball or a tiny joystick is now routinely built
into laptop and notebook computers. Some manufacturers go a long way to make
the pointing device easy to use. Other manufacturers carefully position the
pointing device to ensure that it cannot be operated efficiently by a
left-handed user.
Other Input Devices
Other, less widely used, pointing devices are the touch
screen, the light-pen, and the tablet. By either coating a display screen
with transparent conductors or by using ultrasonic/infra-red beams, the
computer can detect the location of a finger on the surface of the screen.
Consequently, the screen can be used as an input device simply by touching
the point you want to activate. A typical system displays a menu of
commands. Touch-sensitive screens are still relatively expensive and are
found only in specialized applications. Moreover, the finger is a rather
coarse pointer and cannot be used as precisely as a mouse or joystick.
However, touch screens are useful when the operator has no computer
experience whatsoever (e.g., a user-controlled guide in a shopping mall).
The light-pen uses a stylus with a light-sensitive device
in its tip. When placed against the computer’s screen, the light-pen sends a
signal to the computer when the display’s scanning beam passes under it. The light-pen is just
a much more precise form of the touch screen and is cheaper to implement.
Sophisticated algorithms can be used to convert the light-pen’s movement
over the screen (i.e., handwriting) into an ASCII-encoded text format for
internal storage and manipulation. Unfortunately, it’s not easy to convert
the output from a light-pen into text, because the way in which one person
writes, say, a letter "a" is often different from the way another person
writes it.
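The way the light-pen locates itself on the screen can be sketched with a little arithmetic. The sketch assumes a raster display that draws the picture row by row at a constant rate and ignores the blanking intervals between lines and frames; the function name and parameters are hypothetical.

```python
def pen_position(elapsed_us, pixel_time_us, width, height):
    """Convert the time since the start of the frame into (row, column).

    elapsed_us    -- time from the start of the frame until the pen
                     detected the beam, in microseconds
    pixel_time_us -- assumed time taken to draw one pixel
    width, height -- assumed size of the display in pixels
    """
    pixel_index = int(elapsed_us // pixel_time_us)
    row, col = divmod(pixel_index, width)
    if row >= height:
        raise ValueError("signal arrived after the end of the frame")
    return row, col


# With one pixel drawn per microsecond on a 640 x 480 display, a signal
# 6425 microseconds into the frame puts the pen at row 10, column 25.
print(pen_position(6425, 1, 640, 480))  # (10, 25)
```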
Some computers do have a primitive form of speech input.
You speak into a microphone and the audio signal is sampled and turned into
digital form. Voice recognition systems require you to train the
system by first recording all the words it is to recognize. During the
training process, it builds up a pattern or template for each word it
is to recognize. When you later say a word, its digitized version is matched
against the patterns stored during the training process and the closest fit
selected. Computers cannot yet handle continuous speech reliably in
real-time. Moreover, once a computer has been trained, other operators
cannot use the system without retraining it, because human speech varies
widely depending on the gender, age, and regional accent of the speaker.
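The train-then-match idea can be sketched as follows. This is a toy: each utterance is assumed to have already been reduced to a short list of numbers, the template for a word is taken to be the average of its training recordings, and the closest fit is measured by straight-line distance. Real recognizers use far richer features and must also cope with words spoken at different speeds.

```python
import math


def train(recordings):
    """Build one template per word by averaging that word's training vectors.

    recordings maps each word to a list of equal-length feature vectors.
    """
    templates = {}
    for word, vectors in recordings.items():
        n = len(vectors)
        templates[word] = [sum(column) / n for column in zip(*vectors)]
    return templates


def recognize(templates, sample):
    """Return the word whose template is closest to the sample."""
    return min(templates, key=lambda word: math.dist(templates[word], sample))


# Made-up feature vectors for two words, two recordings each.
recordings = {
    "yes": [[1.0, 0.0, 0.2], [0.8, 0.2, 0.0]],
    "no": [[0.0, 1.0, 0.9], [0.2, 0.8, 1.0]],
}
templates = train(recordings)
print(recognize(templates, [0.7, 0.3, 0.1]))  # yes
```

Because the templates are built from one speaker's recordings, a different speaker's samples may land closer to the wrong template, which is why the system has to be retrained.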
The Screen
The vast majority of general-purpose computers
communicate with human beings via a screen, which may be a conventional CRT
or a liquid crystal display. As we describe the screen in detail in the
chapter on computer graphics, we will not describe its construction here.
The screen is viewed by human beings. Consequently, the
way in which human visual perception operates is of interest to those
designing the visual interface. For example, we can see some colors better
than others; we cannot read text if it is too small nor can we read it
rapidly if it is too large. Colors themselves are described in terms of
three parameters: hue is determined by the wavelength of the light;
saturation is determined by the amount of white light present in the
color; and intensity is determined by the brightness of the color.
Objects on a screen are viewed against background objects—the luminosity of
an object in comparison with its background is called its contrast.
All these factors have to be taken into account when designing an effective
display.
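These parameters can be computed from the red, green, and blue values a display actually uses. The sketch below relies on Python's standard colorsys module, which reports hue, saturation, and value; value is used here as a stand-in for intensity. The luma weights and the simple difference used for contrast are assumptions chosen for illustration, not a standard contrast formula.

```python
import colorsys


def describe(rgb):
    """Return hue (degrees), saturation, and intensity for an RGB colour.

    rgb is a tuple of three numbers between 0 and 1.
    """
    r, g, b = rgb
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return {"hue_degrees": h * 360, "saturation": s, "intensity": v}


def luma(rgb):
    """Approximate perceived brightness (weights are an assumption)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def contrast(obj, background):
    """Crude contrast measure: difference in brightness, from 0 to 1."""
    return abs(luma(obj) - luma(background))


red = (1.0, 0.0, 0.0)
pink = (1.0, 0.5, 0.5)   # red with white light mixed in
white = (1.0, 1.0, 1.0)

print(describe(red)["saturation"])   # 1.0
print(describe(pink)["saturation"])  # 0.5
print(contrast((1.0, 1.0, 0.0), white) < contrast((0.0, 0.0, 1.0), white))  # True
```

Red and pink share the same hue and differ only in saturation, and the last line shows why yellow text on a white background is much harder to read than blue text.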
Not only can we perceive light in terms of its hue (i.e.,
color), saturation (the depth of the color), and intensity (brightness), we
can also perceive size and depth in a two-dimensional image. If you examine,
for example, an application running under Windows, you will almost certainly
see buttons and sliders that look as if they are three-dimensional. There is
no need to make the buttons appear realistic, but the interface looks like
the control panel of a TV or a hi-fi system. By making the computer
interface resemble something we are already familiar with, we don’t have to
be trained in its use.
Having looked at the physical mechanisms we use to
tell the computer what we want to do, we are going to look at the logical
interface that determines how the human-computer dialog is to be formatted.
The Logical Interface
Communications, whatever their nature, should be clear
and unambiguous. In this section we look at some of the factors affecting the
way in which we structure the dialog between the human being and the
computer.
Once upon a time, when propeller-driven aircraft were
more common, an aircraft was taxiing from the stand to the runway. The
captain had noticed that the flight engineer was looking rather miserable,
so he turned to the engineer and said, "Cheer up, Charlie." Charlie, the
flight engineer, looked a little puzzled but knew that the captain’s word
was law. A few seconds later, the aircraft sank onto its belly and there was
a grinding sound as both propellers ripped into the pavement. "Gear up,
Captain," said Charlie.
Although a computer user can’t damage an aircraft, he or
she can still cause havoc. I once had a bright idea. When you use a desk-top
publishing package to prepare a book, you spend a lot of time designing the
lay-out of a chapter and writing macros or styles for all its
elements (e.g., headings, footers, etc.). When you start a new chapter, you
can either copy all these design items across, or you can be really smart.
You open the chapter you’ve just written, delete all the text, and then save
it with a new chapter name. In this one simple operation, you’ve created the
framework for a new chapter that’s exactly like the previous chapter. This
process is repeated as you write each new chapter.
Creating an entire book using this technique is extremely
efficient. One day, at the point you’ve opened a previous chapter and
deleted all the text, the phone rings and someone asks a question. "No
problem," you say, "I’ll just close this file, and fetch the file you want."
Approximately one quarter of a second after you’ve hit the key that saves
your file, you remember that you didn’t rename it. The empty file you’ve
just saved has replaced the chapter you spent months typing. Fortunately,
you did remember to make a backup yesterday, didn’t you?
Charlie the flight engineer didn’t query the captain’s
orders because of the hierarchical command structure that once existed on a
flight deck. Should the computer have queried my actions when I replaced a
file with an empty file? Some users might regard an interface that lets you
do anything as positively dangerous because it provides no safety net. On
the other hand, an interface that asks you to confirm each operation becomes
very tedious to use. I once used a system that provided the dialog below.
Lower case text represents my input and upper case text represents the
computer’s response.
delete Clements_1
ARE YOU CERTAIN YOU WANT TO DELETE CLEMENTS_1? [Yes/No]
yes
DO YOU WANT TO KEEP CLEMENTS_1? [Yes/No]
no
When I told the system to delete a file, it asked me to
confirm the operation with the positive response "yes". Having confirmed
that I did wish to delete the file, it asked the same question again, but
expressed in the negative. In order to delete the file, the opposite
response, "No", had to be given. The system demands a different response to
each question to prevent the operator giving automatic responses to
questions (e.g., typing "yes" the moment you see a question).
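The logic of this two-question check can be sketched as follows. The function name is hypothetical, and the user's replies are passed in as a list so that the example runs without a keyboard; the system I used presumably read them one at a time.

```python
def confirm_delete(filename, answers):
    """Return True only if the replies confirm the deletion twice.

    The first question needs "yes" and the second needs "no", so a user
    who answers "yes" to everything by reflex does not delete the file.
    """
    replies = iter(answers)
    questions = [
        (f"ARE YOU CERTAIN YOU WANT TO DELETE {filename.upper()}? [Yes/No]", "yes"),
        (f"DO YOU WANT TO KEEP {filename.upper()}? [Yes/No]", "no"),
    ]
    for prompt, required in questions:
        print(prompt)
        # A missing reply counts as a refusal.
        reply = next(replies, "").strip().lower()
        if reply != required:
            return False
    return True


print(confirm_delete("Clements_1", ["yes", "no"]))   # True: file is deleted
print(confirm_delete("Clements_1", ["yes", "yes"]))  # False: file is kept
```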
Automatically querying important operations is not
foolproof. Some might argue that no safety net is better than a partial
safety net. Once you feel that the computer is protecting you, you lessen
your vigilance. An excessive reliance on the computer leading to a
relaxation of attentiveness is thought to have been the underlying reason
for more than one aircraft crash.
The Interface and The Obvious
There’s a common expression—one man’s meat is another
man’s poison. We could have said "One user’s obvious is another
user’s obscure." Those who have grown up with the computer soon learn
its characteristics and capabilities and they find it relatively easy to
adapt to new software and interfaces. Indeed, they are often oblivious to
the problems a system poses to the non-expert. Consider the mouse: you drag
the mouse down the desk to pull the cursor down the screen in response. What
happens when the mouse gets to the bottom of the desk? You pick it up and
move it back to the top of the desk, because the cursor on the screen moves
only when the mouse is in physical contact with the desk. A user who does
not know instinctively how the mouse works stops at the end of the desk and
asks someone what they should do next.
The behavior of the mouse is not instinctive to all; if
you turn it sideways, the cursor will move horizontally as you move
the mouse vertically on the desk. Pressing the mouse button to select
an icon (i.e., a small symbol that represents an object or a function)
on the screen is easy enough, but double-clicking to select
and launch an application probably takes as much effort as changing gear in
a car before the synchromesh gear box was invented (the gear change required
a complex operation involving the clutch and gas pedal called
double-declutching).
The meaning of the icons found on a typical Windows
screen is conceptually obvious. Well, it is if you have read the trade
press and computer literature for years and are familiar with the icons for
Microsoft or WordPerfect, or for databases and communication packages. Even
the meaning of common icons like the trash can is not always obvious to the
novice computer user. We are now going to look in more detail at the
characteristics of the computer user—the human.
The Human and the Computer
Computers are designed and programmed by humans for
humans, and, therefore, any problems associated with the computer-human
interface are minimal. Run that by me again... Computers are designed and
programmed by some humans for use by other humans. That’s
better. We sometimes forget that we live in a particular stratum of a
particular culture, and that it strongly influences the way in which we view
the world. I once saw a discussion of music that posed the question, "Is our
response to a certain type of music inherent or is it culturally determined?"
We were given a fragment of the soundtrack from a typical Hollywood film and
we all decided that the music was typical of a romantic scene. Then we were
played a fragment of music from a romantic scene in an oriental film. In
this case, the music was nothing like the Hollywood schmaltz and sounded
discordant rather than romantic to the Western ear. Things we take for
granted are, in fact, learned responses to stimuli.
The effect of culture on a computer interface is
particularly important because, unlike the case of the music, the division
is not into two groups such as East and West; it is much more multifaceted.
Some of the differences between computer users are:
Age: A young person educated in a
technological society is likely to have a feel or empathy for
computer systems because some of the underlying themes are
familiar from the VCR timer, the pocket calculator, and the
games machine. An older person might have had little prior
contact with computer-like machines.
Cultural background: The way we write,
from left to right, is part of the Western culture. Other
societies write from right to left. A computer user in Europe
and a computer user in Asia might scan a screen for information
in a very different way. These cultural differences might
influence the position of important warning and error messages.
Linguistic background: Even when the
computer designer and the computer user speak the same language,
problems can still arise because not all words and phrases are
used in the same way. In certain parts of the North of England
the word "while" means "until" (e.g., "I am
staying here while Friday" means "I’m staying here until
Friday"). Similarly, a North American might speak of tabling
the plan, which would convey the opposite to someone from
England (in the USA to table something means to set it aside, whereas
in England it means to put it forward for discussion). I’ve always felt that
British tourists to the USA should have the following message
stamped in their passports: "In the USA the device at the end of
a pencil used to make typographic corrections is called an
ERASER...". Moreover, the spelling of words is sometimes
different in British English and American English, and the
presentation of dates is different in the USA and Europe. Some
software packages permit the user to select a particular subset
of English (e.g., UK, USA, AUS).
Experience Computer users can be roughly
divided into four groups: the novice or non-computer expert who
has little or no background knowledge of computing; the
occasional or intermittent user who uses the system in bursts
and forgets the details of the commands between sessions; the
experienced user who uses the system frequently; and the expert
user who has an in-depth understanding and will often accept a
terse user interface if it will provide short cuts and aid
productivity. Computer users change as their experience grows.
An interface that was optimum during the training phase might
not be optimum when the user becomes more experienced.
Gender To a certain extent, boys and
girls are brought up in different cultural environments and
receive different messages from society. These environments may
affect the way in which men and women relate to the computer.
Ben Shneiderman, in Designing the User Interface, points
out that common computer commands like KILL and ABORT may have a
different effect on male and female computer users.
Disability Some computer users have
visual impairments such as retinitis pigmentosa, and
the design of the display can have a profound effect on
how easily they can read it.
Moreover, a significant minority of males (commonly put at about 8%) are color blind. A
computer user might have a limited ability to use their hands
and fingers to manipulate a keyboard or a mouse. An interface
that requires few keystrokes or mouse movements might suit
someone with a mobility impairment more than an interface that
requires large and frequent movements of the mouse.
Handedness The majority of people are
right-handed. However, some physical interfaces are not well
suited to left-handed users (e.g., lap-top portables with a
built-in trackball on the right side of the keyboard).
Although it happens subconsciously, we build models of
the world to enable us to deal with new situations; for example, someone who
has learned to drive one model of car can normally drive a new model with
virtually no practice. This is because the way in which all cars operate
conforms to a set of basic rules. The mental database associated with a car
is very complex: we can calculate a car’s dynamics and road handling
performance, or handle virtually all its controls from acceleration to
turning, to indicating.
When we use a computer interface, we also build a model
of how the system works. This model covers both the syntactic and the
semantic aspects of the interface. Syntax describes the grammar or
the rules by which the interface operates; for example, the syntax of a
command to copy file A to location B might be Copy A,B. In a
graphical or windows environment, the syntax of the interface might be
expressed in terms of the mouse movement required to access a particular
menu item. You might say that a knowledge of an interface’s syntax is
similar to a knowledge of the way in which the controls of a car are
operated.
Semantics is concerned with meaning, and covers,
for example, an understanding of the effects of controls. In terms of an
automobile, a driver with only a syntactic knowledge of the stick shift
would be able to change gear but not know why a gear change was necessary.
A computer user who understands an interface’s semantics but not its
syntax becomes frustrated, knowing what needs to be done
but not how to do it.
Any computer interface should be designed to have a
consistent syntax—indeed, a lack of consistency is one of the most annoying
aspects of almost any interface (computer or otherwise). If the syntax of
the copy command is Copy A,B and has the effect of copying the
contents of file A into file B, you would expect the
rename command, Rename A,B, to take file A and rename it as file
B. If the command instead took file B and renamed it file A, its syntax would be
inconsistent with the copy command. A consistent syntax has two advantages:
first, it reduces the user’s learning burden; second, it reduces the danger of
error caused by the incorrect use of a command with an inconsistent syntax.
On the other hand, someone described Kai Tak airport in Hong Kong as one of
the world’s safest because it is so difficult to land there; that is, if an
operation is difficult to perform, you take greater care.
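One way to picture a consistent syntax is a toy command interpreter in which every two-operand command takes its operands in the same order, source first and target second. The sketch below is only an illustration: the in-memory `files` dictionary and the function names are assumptions made for the example, not part of any real system.

```python
# A toy command interpreter. The file store is a plain dictionary
# (file name -> contents); every name here is hypothetical.

def copy(files, source, destination):
    """Copy the contents of `source` into `destination`."""
    files[destination] = files[source]


def rename(files, source, destination):
    """Rename `source` as `destination` (same operand order as copy)."""
    files[destination] = files.pop(source)


COMMANDS = {"copy": copy, "rename": rename}


def execute(files, line):
    """Run a command written as 'Name A,B'; operands are always source,target."""
    name, _, operands = line.strip().partition(" ")
    source, destination = (part.strip() for part in operands.split(","))
    COMMANDS[name.lower()](files, source, destination)


files = {"A": "chapter text"}
execute(files, "Copy A,B")      # B now holds a copy of A
execute(files, "Rename A,C")    # A is now called C
```

Because both commands share one operand order, a user who has learned Copy can guess how Rename behaves without consulting the manual.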
User syntax becomes most confusing when different
commands perform similar functions at different stages in a session.
Probably the most extreme case of this is the termination or completion
command. If you are using a menu driven interface, the carriage return often
terminates a command. If you are entering a text string, the operation might
be terminated by a special key such as escape (ESC) or control D (CTRL-D).
If the system accepts command words, the appropriate command might be Q
(quit), or QU, or the full QUIT, or it may be KILL, or EXIT... Anyone who
has used a new software package or operating system will be accustomed to
entering ever more obscure mnemonics in order to try to execute a forgotten
command.
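One possible remedy for the guessing game is to accept any unambiguous abbreviation of a command. The following is a minimal sketch under that assumption; the command list is hypothetical and deliberately includes two words beginning with "qu" to show what happens when an abbreviation is too short.

```python
# Hypothetical command set; any unambiguous prefix of a command is accepted.
COMMANDS = ["quit", "query", "copy", "rename"]


def resolve(word, commands=COMMANDS):
    """Return the single command that `word` abbreviates, or raise ValueError."""
    word = word.lower()
    if word in commands:                 # an exact match always wins
        return word
    matches = [c for c in commands if c.startswith(word)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {word!r}")
    raise ValueError(f"ambiguous command {word!r}: could be {', '.join(matches)}")
```

Here QUIT and QUI both resolve to quit, whereas QU is rejected with a message that lists the candidates, which is more helpful than silence.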
Some software can be made difficult to use by providing
the user with too many ways of achieving the same goal. An airline pilot who
flew advanced aircraft with computerized systems told me that he could often
perform an action in one of several ways. However, he said that this degree
of freedom was sometimes counterproductive because he had to decide which
technique he was going to use.
The Interface Format
In this section we are going to look at some of the
widgets that have been designed to facilitate human-computer
interaction. A widget is any of the screen structures (e.g., menus, buttons,
and scroll bars) provided in a graphical interface.
The nature of the interface presented by an application
can vary widely. Although at the start of this chapter we said that the
human-computer interface was the Cinderella of the computer world, there has
been a steady improvement in the design of interfaces over the last few
years. Typical approaches to interface design are template, menu, command
language, and natural language.
Template
The template is the simplest of computer
interfaces because it is analogous to the type of forms you have to complete
in daily life. The template, or form fill-in, provides the user with a
structure containing headings and blank spaces. The user is invited to fill
in the blanks on the form. This interface can be used by almost anyone—the
expert need enter only the information actually required, and the novice can
often relate to the form because it is a familiar metaphor. However, some
novices might experience frustration if the interface asks for Date of
birth, and will accept only, say, 27/09/48, and reject 27.09.48, or
27-09-48, or 27/09/1948, and so on.
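A more forgiving form field might accept all of these spellings of the same date. The sketch below assumes day/month/year order and assumes that a two-digit year in a date of birth that would otherwise fall in the future belongs to the previous century; the function name and the `today` parameter are inventions for the example.

```python
from datetime import date, datetime


def parse_date_of_birth(text, today=None):
    """Parse day/month/year written with '/', '.' or '-' and a 2- or 4-digit year."""
    today = today or date.today()
    normalized = text.strip().replace(".", "/").replace("-", "/")
    for pattern in ("%d/%m/%Y", "%d/%m/%y"):
        try:
            parsed = datetime.strptime(normalized, pattern).date()
        except ValueError:
            continue                     # try the next pattern
        # Assumption: a date of birth cannot lie in the future, so a
        # two-digit year that lands after `today` is moved back a century.
        if pattern.endswith("%y") and parsed > today:
            parsed = parsed.replace(year=parsed.year - 100)
        return parsed
    raise ValueError(f"unrecognized date: {text!r}")
```

The interface still rejects input it cannot interpret, but it no longer punishes the novice for choosing a different separator.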
In recent years, the template approach to interface
design has become more sophisticated with the advent of languages like
Visual Basic that enable you to design very sophisticated interfaces, and
the interfaces are, in turn, composed of ever more sophisticated widgets.
The following table demonstrates the extensive template provided by Visual Basic to
implement an option button in a Windows environment. As you can see,
this template provides complete customization of the widget (i.e., option
button).

The use of the template in Visual Basic to specify a
widget’s characteristics
The Menu
In a menu-driven system, the interface provides the user
with a list of alternatives, one of which is selected either by a mouse or
by appropriate cursor movement keys. A menu puts less pressure on the user
to remember syntax; for example, if you select the copy option from a menu,
the interface might then provide supplementary boxes labeled source
and destination.
The following illustrates a typical menu-based application
in PageMaker 5. The Type menu has been selected
to display a pull-down menu. On this secondary menu the Alignment
option has been selected and you are invited to choose one of the options
from this submenu.

Menu based systems can be used by all levels of computer
user. However, the menu is not very helpful if you cannot see it. I have
worked with systems that have proven very irritating because I have
forgotten which menu includes one of the commands. For example, if one of
the menu items is labeled Edit, you can be sure to find the commands
search and replace as options. However, if you wish to change
the paper size, is the appropriate paper size command under the File
menu, the Edit menu, the View menu, or the Format menu,
etc?
The Pull-down Menu
Another important aspect of menus is their ease of rapid
access. If a particular operation has to be activated by selecting a submenu
of a submenu of a menu, the sequence of keystrokes or mouse movements can be
irritatingly slow. As long as such a command is performed very infrequently
and all really common operations can be accessed speedily, all is well. If
you find that you have to frequently perform some of the tasks thought
obscure by the interface designer, the interface can seem leaden. The best
menu-based systems permit users to define their own menus in order to tailor
the system to individual need. However, this customization brings with it
its own problems, namely the user’s ability to tailor his or her own
environment. A novice probably cannot create as effective an interface as an
expert. Moreover, an excessive degree of customization can lead to screen
clutter; before long the menus take up more space than the applications.
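The cost of burying a command can be made concrete by treating the menus as a tree and counting the selections needed to reach an item. The menu layout below is hypothetical (in particular, placing Paper Size under File is just a guess for the sake of the example); the sketch simply searches the tree and returns the path.

```python
# A hypothetical menu hierarchy: nested dictionaries are submenus,
# and None marks a command that can be selected.
MENUS = {
    "File": {"Open": None, "Save": None, "Page Setup": {"Paper Size": None}},
    "Edit": {"Search": None, "Replace": None},
    "Type": {"Alignment": {"Left": None, "Center": None, "Right": None}},
}


def find_path(menu, target, path=()):
    """Return the sequence of selections that reaches `target`, or None."""
    for label, submenu in menu.items():
        here = path + (label,)
        if label == target:
            return here
        if isinstance(submenu, dict):
            found = find_path(submenu, target, here)
            if found:
                return found
    return None
```

The length of the returned path is the number of selections the user must make, so a command three levels down costs three selections every time it is used.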
Command Language
The command language is the oldest and, sometimes,
most powerful computer interface. It is syntax driven and you have to
know its syntax and the meaning of all its commands before you can use it.
If you use MS-DOS, you can delete a file with the command DEL
filename. Similarly, if you are using a text editor and wish to replace the
text string chemistry with physics between the strings
Monday and Friday in a document, you might simply need to enter
the command:
C/chemistry/physics/Monday/Friday.
In the manual that describes the above change
command, its syntax might be formally expressed as:
C/<source string>/<destination string>/<first
occurrence>/<last occurrence>.
In any reasonably complex system, a command language can
be used to any degree of proficiency only by an expert or by a frequent
computer user. Moreover, errors are easy to make.
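To show how terse such a command is, and how little it forgives, here is a sketch of an interpreter for the change command. It assumes that the region to be edited runs from the first occurrence of the first delimiting string up to the next occurrence of the last one; the function name and the error messages are inventions for the example.

```python
def change(command, text):
    """Apply 'C/<source>/<destination>/<first>/<last>' to `text`."""
    parts = command.split("/")
    if len(parts) != 5 or parts[0].upper() != "C":
        raise ValueError("expected C/<source>/<destination>/<first>/<last>")
    _, source, destination, first, last = parts
    start = text.find(first)
    end = text.find(last, start + len(first)) if start != -1 else -1
    if start == -1 or end == -1:
        raise ValueError("delimiting strings not found in that order")
    # Only the text between the two delimiters is altered.
    region = text[start:end].replace(source, destination)
    return text[:start] + region + text[end:]
```

A single missing slash makes the command unparseable, which is exactly the kind of error a novice makes and an expert rarely does.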
The command language can include powerful operators
that enable a command to be repeated or its action modified. A typical
powerful command language interface is the UNIX operating system. You might
devise a command language that allows the command to operate on a data
stream; for example, Delete {file1, file2, file3, file4} might
indicate that the delete operation is to be applied to all the files between the
{ } brackets.
Command languages can even be used to handle
conditional operations; that is, an action might be carried out only if a
condition is met. Conditional commands are useful when you wish to restrict
the scope of an action; for example, the operation If in column 1 then
change/Monday/Tuesday might tell the system to make a text change but
only to text in column 1 of a document.
Natural Language
Natural language is the term used to describe the
language that humans use to communicate with each other. We will be
describing natural language again in the chapter on artificial intelligence.
I think it is true to say that one of the principal goals of computer
scientists is a natural language interface that will allow you to
communicate with a computer just as if it were another person. I must admit,
I’ve never entirely understood the desire to design a natural language
interface to a computer—in my experience humans don’t seem to do a very good
job communicating with each other. However, the perfect natural language
interface should enable a computer to interpret the meaning of a sentence
like "Please edit the file I created last Friday—the one about the job
application."
In practice, even the limited natural languages we have
today can often be used only within a certain domain; that is, the
vocabulary and grammar are a highly restricted subset of a natural language.
Real natural languages require a vast database and semantic net to
resolve ambiguities. The ambiguities inherent in a natural language can
easily be seen in English—just consider the sentence "I saw her with a
telescope."
Some believe that the ultimate computer interface will
have been created when natural language processing is combined with speech
understanding and humans and computers will be able to communicate with each
other verbally. However, the following untrue story demonstrates that there
might be problems. During the testing and commissioning phase of a
computer-speech interface project, the computer is programmed to simulate
war games and a five-star general from the Pentagon is invited to take part
in the exercise. The battle being simulated is a classic civil war battle
and, at some point, the general is told that the enemy is advancing. He then
decides to test the computer’s response against what happened historically,
and says to the computer, "Well, should I attack, or should I retreat?" The
computer replies "Yes." The general, being impatient, says "Yes, WHAT?". The
computer, being polite, replies "Yes SIR."
Human computer interaction isn’t always initiated by the
human operator. Sometimes, the computer has to tell the operator that
something has gone wrong. We are now going to look at how the computer does
this.
Error Messages
Computers communicate with people by means of a
two-way dialogue; the computer invites you to enter a command or data,
and then responds to your input. In the real world, things go wrong. Errors
can be categorized into different types—typical errors are: typographic
(you mis-spell a command or name), capture (you enter the wrong
command and find yourself performing an activity you didn’t intend to),
description (the correct action is carried out but with the wrong object
or data), sequence (you forget where you are up to in a sequence of
operations), and mode (you are not operating in the mode you expect;
for example, you are in the operating system and attempt to enter word
processing commands).
In addition to user errors, the computer cannot always
safely respond to your request. Sometimes the operation is too
potentially dangerous to perform without confirmation. Finally, system
errors sometimes occur. These are errors put there by the designers of the
operating system or applications package (some companies are very good
indeed at including lots of system errors in their software).
When the computer is forced to intervene, it does so by
means of a dialogue loosely called an error message. The world of the
error message is fraught with problems because three aspects of the
human-computer system come together: the structure of the application and
all its underlying software, the human interface itself, and human
psychology. I think that I can safely say that, with the sole exception of
the RS232C printer cable, no aspect of computing causes more anger and
frustration than the error message. Let’s look at some of the problems.
Have you ever been in an aircraft and heard the
announcement "Ladies and gentlemen, due to technical problems, the flight
will be delayed one hour." This is a form of error message. Unfortunately,
you are allowed to know nothing whatsoever about the nature of the problem,
and, in any case, the message was not delivered until well into the delay. I
once had a microprocessor development system that came up with the message
"Error 155" when I first switched it on. I looked in the instruction
handbook, found the error messages page, and decoded error message 155 as:
"Error Message 155—See supplier." So, I phoned the supplier who said,
"That’s interesting, we’ve always wondered what that meant ourselves...".
On the other hand, error messages can be too detailed or
inappropriate. Imagine the effect of the following in-flight announcement on
an aircraft, "Ladies and gentlemen, the flight is delayed as we have had to
reduce power to number three engine because the compressor is running hot.
It has been determined that 94% of all similar problems present no
difficulty, as long as the power is reduced. However, in about 1% of cases,
a fan blade shatters and punctures the engine housing." Clearly, too much
error information can be as bad as too little—especially if it has no
meaning to the user or the user cannot make use of the information.
One of the problems posed by an error message system is
in relating the actual error message to its cause. A user operating at the
applications level who is editing a file might receive an error message
indicating that the operation cannot be completed. But where did the error
message come from, and what does it indicate?
Any reasonably complicated software has a hierarchical
structure. When the application software wishes to open a file, it might
call the disk operating system’s file manager. In turn, the file manager
might call the operations required to handle files. Finally, the lowest
level software might be called to read a sector on the disk. An error can
occur within any of these levels.
Suppose, for example, an error occurs at the lowest level
because a word processor is told that a certain file is on a floppy disk,
and the disk drive is empty. In this case, it is reasonable (indeed,
necessary) for that message to be passed up through the layers of software
to the user. The user can understand the message and act on it. On the other
hand, if a low-level error takes place because a sector can’t be read, the
user doesn’t want to know that a particular sector on track 32 is bad. The user
wants the system to make several attempts to read the file (because disk
errors are sometimes intermittent). If the error persists and is fatal, the
user wants advice—has all the data that was created in this session been
lost, or should the system save it in a secure place until the underlying
fault has been rectified?
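The layering described above can be sketched with exceptions: the lowest level retries an intermittent fault, and the application level translates whatever remains into a message the user can act on. All class names, function names, and message wording below are hypothetical.

```python
class SectorReadError(Exception):
    """Low-level fault: a disk sector could not be read."""


class DriveEmptyError(Exception):
    """Low-level fault: there is no disk in the drive."""


class UserError(Exception):
    """An error phrased for the person at the keyboard."""


def read_with_retries(read_sector, attempts=3):
    """Lowest layer: retry, because disk errors are sometimes intermittent."""
    for attempt in range(attempts):
        try:
            return read_sector()
        except SectorReadError:
            if attempt == attempts - 1:
                raise                    # persistent fault: give up


def open_document(read_sector):
    """Application layer: turn low-level faults into advice for the user."""
    try:
        return read_with_retries(read_sector)
    except DriveEmptyError:
        # The user can act on this one, so it is passed up in plain words.
        raise UserError("There is no disk in the drive. "
                        "Insert the disk and try again.") from None
    except SectorReadError:
        # Track and sector numbers are hidden; the user gets advice instead.
        raise UserError("The file could not be read from the disk. "
                        "Save your work to another disk.") from None
```

The design choice is that each layer decides whether the layer above can do anything useful with the detail; if not, the detail stays where it is.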
A good error message system should achieve two goals.
First, the error messages should be appropriate to the skills of the user—a
beginner should not be left to figure out the meaning of the message.
Second, the error messages should be helpful and suggest how the error
situation can be dealt with. We will soon look at help systems, which are
closely related to error message systems.
The Warning
Associated with the error message is the warning.
A warning is a message that is issued when a legal operation is
about to occur that may have potentially fatal consequences. The actual
operation itself is not illegal. Suppose you have been editing a file called
Chapter_5 for a few hours, and decide to make a backup copy of your
precious work. If the syntax of the copy operation is Copy
source,destination and you type Copy Temp_file,Chapter_5, you are
going to lose all your work. A well-designed system might spot that the
operation you are carrying out is legal but not sensible (few
users ever edit a file for hours and then overwrite it with something old
without saving the source). A suitable warning message might be "Do you
really want to overwrite Chapter_5 and lose all the changes?".
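A system could spot this situation with a very simple heuristic: ask for confirmation whenever the destination already exists and is newer than the source. The sketch below assumes exactly that rule; the in-memory `files` dictionary, which maps a name to a (modification time, contents) pair, and the `confirm` callback are hypothetical.

```python
def copy_with_warning(files, source, destination, confirm):
    """Copy source to destination, asking first if newer work would be lost.

    `files` maps a name to a (modified_time, contents) pair; `confirm` is a
    callable that shows a question and returns True or False.
    """
    if destination in files and files[destination][0] > files[source][0]:
        question = (f"Do you really want to overwrite {destination} "
                    "and lose all the changes?")
        if not confirm(question):
            return False        # legal, but the user thought better of it
    files[destination] = files[source]
    return True
```

Copying the newer file over the older one, which is what the user intended, goes ahead without any question being asked.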
Warnings are not always effective. You’ve all heard the
story about the little boy who cried wolf. If you get a lot of
warnings and ignore most of them, there comes a point at which you do not
see the warning that should be heeded. This is particularly true of fire
alarms—people tend to ignore them and assume that it is a false alarm.
Suppose a warning message is followed by the request "Ignore Warning?"
inviting you to continue if the warning is not to be acted upon. After a
time, you get into the habit of anticipating this command and hitting the
ignore key even before the message comes up. Eventually, there comes the
moment when your finger is hitting the ignore key just as your brain is
thinking "No!" You could solve this problem by requiring the user to enter a
more complex response that, possibly, changes from time to time.
Unfortunately, the production of spurious warning and error messages can be
most frustrating.
The problem of effective warning messages can, to some
extent, be dealt with by providing several levels of automatic warnings. An
expert user might select a minimum level of computer intervention, whereas a
novice might select a level that provides a much greater degree of feedback.
Interestingly enough, tests have demonstrated that alarms
accompanied by verbal commands are often acted upon and not ignored.
Alarms or warnings that are accompanied by shrill sounds are sometimes
counterproductive. There has been at least one airline disaster in which the
crew appeared more concerned with shutting off the alarms than with keeping
the aircraft flying.
Automatic Error Correction
Some interfaces provide simple non-context specific
automatic or background error-correcting mechanisms. When the user
makes an error, the system attempts to correct the error, rather than to
indicate it. For example, if you type copyy rather than copy,
the system can be trained to recognize the error and correct it
automatically. This type of error correction can be enhanced by using
context information; for example, if you are using certain specialized words
and you mis-spell one of them, the system can substitute the correct word
because it knows the vocabulary you are using. However, this system can be
tremendously frustrating when it corrects correct information; for example,
a very common typing error is "teh", caused by transposing the "e" and
"h" when typing "the". However, Teh is a common Singaporean
name, and the interface causes much irritation when you are, say, typing a class list
with several Tehs. Systems with automatic error correction often allow you
to limit the scope of the error correction (i.e., turn off some of
the rules used for error correction).
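A rule-based corrector with a switch for each rule might look like the following sketch. The rule table and the `disabled` parameter are assumptions for illustration; real products may work quite differently.

```python
import re

# Hypothetical correction rules: misspelling -> intended word.
RULES = {"teh": "the", "copyy": "copy"}


def autocorrect(text, rules=RULES, disabled=()):
    """Apply each enabled rule to whole words, keeping an initial capital."""
    def fix(match):
        word = match.group(0)
        key = word.lower()
        if key not in rules or key in disabled:
            return word                  # no rule, or the rule is turned off
        fixed = rules[key]
        return fixed.capitalize() if word[0].isupper() else fixed
    return re.sub(r"[A-Za-z]+", fix, text)
```

With every rule enabled, "Mr Teh" is silently turned into "Mr The"; passing `disabled={"teh"}` leaves the name alone while still correcting copyy.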
Computers can do more than deal with errors. They can
provide information about the application and simulate a User’s Manual. In
the next section we look at help systems.
HELP Systems
Not very long ago the concept of user friendliness
didn’t exist. If you bought a program and wanted to use a feature that you
didn’t fully understand, you had either to extract the information from the
manual, or find someone who could help you. Today, many applications include
a built-in or on-line help system that can provide advice while you
are running the application.
A good help system should provide help when it is needed
and not clutter the screen with information you don’t require. There are
several different forms of help; you can provide a tutorial help
system to step through examples, a simple dictionary-based help system
that looks up the information you need, or a context sensitive help
system that gives you the "most" appropriate type of help.
Consider a snapshot of a help screen generated by asking about circles in a
session with CorelDRAW!

The HELP screen
In a typical system like CorelDRAW! the HELP function is
invoked by clicking on the Help command (normally located at the top
right-hand side of the screen in Windows-based applications). The HELP
command provides its own menu, and you may select the search for help on
item that invokes a dialogue box. You can then enter the
topic you require help on (or use the scroll bar to step through the
available topics), and the HELP system provides a list of related topics.
You then click on the topic you want and the HELP screen is
displayed.

The HELP dialogue box
This type of HELP system saves you the trouble of having
to look up information in a manual. It does suffer from some significant
limitations—if you don’t really know what you want, it can be very difficult
to find the information. Part of the problem is one of terminology. Suppose
you are a beginner using a word processor for the first time, and want to
use larger letters in a document. You might be tempted to search for help on
"large", "big", or "letter"—you might not know that the appropriate index
words are font or character.
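One way to soften the terminology problem is to keep a table of everyday words alongside the index, so that a search for "big" is redirected to the entry for font. The index entries and the synonym table in this sketch are hypothetical.

```python
# Hypothetical help index and a table of everyday words mapped to index terms.
HELP_INDEX = {
    "font": "Choosing a typeface and size for selected text.",
    "character": "Formatting individual characters.",
}
SYNONYMS = {"large": "font", "big": "font", "letter": "character"}


def look_up(term):
    """Return (index_term, help_text), following a synonym if necessary."""
    key = term.lower()
    key = SYNONYMS.get(key, key)
    if key in HELP_INDEX:
        return key, HELP_INDEX[key]
    return None
```

Returning the index term along with the help text also teaches the beginner the proper word for next time.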
Context-sensitive help systems enable you to locate help
on a particular topic without spending a lot of time searching. In a typical
system, you may use the cursor to select the facility you want (but without
clicking and launching the facility), and then press the HELP function key
to get the help on the topic you selected.
Another form of help is the on-line tutorial.
These are often animated picture shows constructed using multimedia
technology (i.e., they may include text, sound, and animated pictures). The
tutorial leads you through a presentation or demonstration of the software.
I must admit that my own personal experience of the on-line tutorial has
been negative. Many force you to move at a snail’s pace through material you
already know. With a book, you can always skip through the boring bits.
Intelligent help systems operate by creating a model
of the user’s expertise and then providing help within that framework. For
example, if a user has displayed a knowledge of an advanced topic by using a
series of commands in an appropriate fashion, an intelligent help system
does not provide an elementary level of detail when help is requested. The
goal of the intelligent help system is not necessarily to provide more help
than other systems but to provide more appropriate help.
In order to implement an intelligent help system you have
to be able to create a model of the information provided by the help system,
and a means of representing the user’s level of understanding of this
information. The intelligent help system is able to make deductions like:
IF the user has drawn a circle AND has selected the fill
tool THEN help will be related to filling in a polygon object.
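A deduction of this kind can be sketched as a small table of rules, each pairing a set of observed actions with a help topic. The action names, the topics, and the policy of preferring the most specific matching rule are all assumptions made for the example.

```python
# Each rule pairs a set of observed user actions with a help topic.
# The action and topic names are hypothetical.
RULES = [
    ({"drew circle", "selected fill tool"}, "Filling a polygon object"),
    ({"selected fill tool"}, "Using the fill tool"),
    ({"drew circle"}, "Drawing circles and ellipses"),
]


def choose_help(observed, rules=RULES):
    """Return the topic of the most specific rule whose conditions all hold."""
    matching = [(len(conditions), topic)
                for conditions, topic in rules
                if conditions <= observed]      # every condition was observed
    return max(matching)[1] if matching else "General help"
```

A user who has drawn a circle and picked up the fill tool is offered help on filling, whereas a user who has done neither gets the general help screen.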
Hypertext
One of the buzz words of the late 1980s was hypertext,
a system that promised to revolutionize the storage and manipulation of
information. Before the introduction of hypertext systems, the type of
information you would find in a book or manual (or even a novel) was stored
linearly, page-by-page. You access information in a book by skipping through
pages until you find what you want, or by looking up the information in the
index. A problem with the book or any other linear form of information is
that you have to go through it at the rate and in the order dictated by the
author.
In a hypertext system the information is structured in a
complex way and held together by large numbers of invisible links.
The next figure illustrates the concept underlying hypertext. Imagine that the
uppermost plane, marked visible plane, is a
conventional text item such as part of a computer manual. This level can be
read through page by page just like any other document.

Hypertext
The black dots, or nodes, in the upper level
represent words and phrases that are related to or linked to other items in
the hypertext. For example, one of the layers might be a definition layer.
Suppose you are browsing through a hypertext document and you wish to know
the definition of a term. You simply click on it and up comes a definition
(usually in a separate window). The precise mechanism for carrying this out
will vary from system to system. For example, words that are linked
elsewhere are sometimes highlighted or in a different color. The advantage of
the hypertext mechanism is that it gathers the information you need—no two
readers will require the same sequence of hypertext links.
There are three hidden planes. Information
can be linked in many different ways. Suppose the hypertext was concerned
with history. The uppermost level might be a conventional narrative, while
the lower levels provide the supporting material. One level might cover
events going on in other societies at the same time. Another level might
give potted biographies of the personalities concerned. Another level might
give extracts from associated historical documents, and so on.
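The arrangement of a visible plane with hidden planes beneath it can be modeled as nodes that carry one link per layer. The following is a minimal sketch; the class name, the layer names, and the sample content are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A linked word or phrase in the visible text."""
    phrase: str
    links: dict = field(default_factory=dict)   # layer name -> linked material


def follow(node, layer):
    """Return what the given hidden layer holds for this node, if anything."""
    return node.links.get(layer)


# Hypothetical content for a history hypertext.
treaty = Node("treaty", {
    "definition": "A formal agreement between states.",
    "documents": "Extract from the text of the treaty.",
})
```

Clicking on the word corresponds to calling `follow` with the layer the reader is interested in; a node need not have an entry in every layer.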
You might think that hypertext sounds almost too good to
be true. In a way, it is. Although the underlying concept is quite clearly
sound, it is strongly implementation dependent. As you can imagine, the
construction of a good hypertext system requires a lot of effort and labor.
I have seen systems that provide a very poor level of information and are
little more than a toy. For example, what is the point of a hypertext system
that provides postage-stamp size maps when most homes have much better
atlases on their bookshelves? Another problem associated with hypertext (and
help and on-line tutorial systems) is that of navigation (i.e.,
finding your way around the links). You can sometimes become lost in the
mass of links between information and find it difficult to get back to a
previous position. Moreover, the author of the hypertext document creates
the links and therefore has to anticipate the future users’ needs.
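A common aid to navigation is a trail of visited positions that lets the reader retrace their steps. The sketch below assumes a simple stack; the class and method names are hypothetical.

```python
class Browser:
    """Keeps a trail of visited pages so the reader can always step back."""

    def __init__(self, start):
        self.current = start
        self.trail = []                 # pages visited before the current one

    def follow(self, target):
        """Jump along a link, remembering where we came from."""
        self.trail.append(self.current)
        self.current = target

    def back(self):
        """Return to the previous page; at the start, stay where we are."""
        if self.trail:
            self.current = self.trail.pop()
        return self.current
```

However many links the reader follows, repeated calls to `back` always lead to the starting point.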
The early 1990s saw the rise of a new technology—multimedia.
Essentially, multimedia implies more than one medium (e.g., text, sound,
still images, and animated video images). Hypermedia is an extension
of hypertext to include objects of any kind as links. In hypermedia, an
object can be a fragment of digitized sound that is played when its link is
activated. Other objects might be animated diagrams or even video clips.
Because both sound and image objects take up a lot of storage, hypermedia
systems normally require CD-ROMs to store the necessary amount of data.
The explosive growth of the Internet in the mid-1990s
has further increased the interest in hypertext and hypermedia. The Internet
is a wide-area network that links countless numbers of computers worldwide.
The Internet does not belong to any single government body or organization
and there is no controlling body. Internet users can access a global
hypermedia system by means of the world-wide web, WWW.
Groupware
The majority of computers are used by individuals
performing isolated tasks. However, several people often work
together on related tasks in organizations—this is sometimes known as
computer-supported cooperative work, CSCW. Software that supports the
cooperation of several users is called groupware. Ultimately,
groupware can lead to the so-called paperless office, where
electronic media replace all conventional paperwork.
Groupware is concerned not only with interaction between
people and computers but with communication between people themselves.
Groups of people require communications facilities based on networking. An
office might have a standard memo that can be electronically transmitted to
any individual or groups of individuals within the organization. Diaries and
appointment books are shared to enable people to coordinate meetings and
similar group activities.
Groupware demands a consistent interface to all users to
reduce the cost of training and allow people to move easily between
terminals. Although the software has a consistent interface, not all users
necessarily share the same interface. Clearly, not all users in
an organization will be able to access all the information. Similarly, the
interface must cater to both naive users and expert users.
One of the problems associated with groupware is
synchronization. Suppose two people decide to edit the same text at the
same time. If they both update a particular paragraph simultaneously, which
one does the system record?
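One widely used answer, which is not taken from any particular groupware product, is to give the shared text a version number and to accept a change only if it was based on the current version. The class and method names in this sketch are hypothetical.

```python
class Paragraph:
    """Shared text with a version number that increases on every saved change."""

    def __init__(self, text):
        self.text = text
        self.version = 0

    def read(self):
        """Return the text together with the version it was read at."""
        return self.text, self.version

    def save(self, new_text, based_on_version):
        """Accept the change only if nobody else saved since the text was read."""
        if based_on_version != self.version:
            return False        # conflict: the editor must re-read and merge
        self.text = new_text
        self.version += 1
        return True
```

If two people read the paragraph at the same version, the first save succeeds and the second is refused, so the second editor is told about the conflict instead of silently overwriting a colleague's work.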
A very simple example of groupware is provided by
Microsoft’s Windows for Workgroups. This is an extension of the popular
Windows operating system to a network. Users can access each other’s
files—subject to a simple system of permissions and passwords. Two tools are
provided to facilitate group operation, E-mail and a scheduler that allows
users to view each other’s diaries.
Research and the Human-Computer Interface
New computer interfaces, both hardware and software, are
continually being introduced. Each manufacturer believes that their system
is better than that produced by the competitors. Although you can evaluate
the performance of various pieces of software on the basis of instruction
execution, it is much harder to evaluate a user interface. The user
interface "takes the human into the loop" and therefore its performance
depends on the characteristics of the human. Unlike computer hardware and
software, humans vary widely in terms of their performance. Moreover, their
performance can improve sharply as they learn from experience, and it
can decline as they get tired or bored.
Research into the human-computer interface is made more
difficult because of human psychology. Research was once carried out into a
new way of writing programs, and the group being studied demonstrated an
increase in productivity that validated the new technique. However, when the
old technique was reintroduced, the programmers also demonstrated higher
productivity. It turned out that being involved in an
experiment had more effect on the programmers than the changes in program
design that were being evaluated.
A typical experiment involving the user interface is
described by Gould, Lewis and Barnes in ACM Transactions on Office
Information Systems (Vol. 3, No. 1, Jan. 1985, pp. 22-34). Text editors and
word processors operate by moving a cursor about the screen. It is natural to
ask whether the speed of the cursor movement affects the user’s performance
(and therefore productivity).
Earlier studies reported that users spend approximately
30% of their time manipulating the cursor—demonstrating that it is important
to optimize both the cursor control mechanism and its interface. However,
the study concluded that cursor speed didn’t affect the amount of time users
spent moving the cursor, and that the time devoted to cursor movement was in
the region of 9 to 14%. Consequently, there is little to be gained by devoting
a lot of time and energy to improving the speed at which a cursor can be
moved across a screen. |