Professional GEM - Part VIII - User interfaces

{{Professional GEM}}
<pre>
 
   
   
==And now for something completely different!==
In response to a number of requests, this installment of ST PRO GEM will be devoted to examining a few of the principles of computer/human interface design, or "religion" as some would have it. I'm going to start with basic ergonomic laws, and try to draw some conclusions which are fairly specific to designing for the ST. If this article meets with general approval, further "homilies" may appear at irregular intervals as part of the ST PRO GEM series.
   
For those who did NOT ask for this topic, it seems fair to explain why your diet of hard-core technical information has been interrupted by a sermon! As a motivator, we might consider why some programs are said by reviewers to have a "hot" feel (and hence sell well!) while others are "confusing" or "boring".
   
Alan Kay has said that "user interface is theatre". I think we may be able to take it further, and suggest that a successful program works a bit of magic, persuading the user to suspend his disbelief and enter an imaginary world behind the screen, whether it is the mathematical world of a spreadsheet, or the land of Pacman pursued by ghosts.
   
A reader of a novel or science fiction story also suspends disbelief to participate in the work. Bad grammar and clumsy plotting by the author are jarring, and break down the illusion. Similarly, a programmer who fails to pay attention to making his interface fast and consistent will annoy the user, and distract him from whatever care has been lavished on the functional core of the program.
   
   
==Credit where it's due==
Before launching into the discussion of user interface, I should mention that the general treatment and many of the specific research results are drawn from Card, Newell, and Moran's landmark book on the topic, which is cited at the end of the article. Any errors in interpretation and application to GEM and the ST are entirely my own, however.
==Fingertips==
We'll start right at the user's fingers with the basic equation governing positioning of the mouse, Fitts's Law, which is given as

<pre>
T = I * LOG2( D / S + .5)
</pre>
   
where T is the amount of time to move to a target, D is the distance of the target from the current position, and S is the size of the target, stated in equivalent units. LOG2 is the base 2 (binary) logarithm function, and I is a proportionality constant, about 100 milliseconds per bit, which corresponds to the human's "clock rate" for making incremental movements.
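
As a rough illustration, the formula above can be evaluated directly. This is only a sketch: the function name, the pixel figures, and the clamp to zero for a pointer that is already inside the target are assumptions made for the example, not part of the original treatment.

```python
import math

def fitts_time_ms(distance, size, i_ms_per_bit=100.0):
    """Estimated time in milliseconds to move the pointer onto a target.

    Implements T = I * LOG2(D / S + .5). `distance` and `size` must be
    in the same units (pixels, for example).
    """
    if distance < 0 or size <= 0:
        raise ValueError("distance must be >= 0 and size must be > 0")
    bits = math.log2(distance / size + 0.5)
    # Assumption: clamp to zero, because the raw formula goes negative
    # when the pointer is already well inside the target (D / S < 0.5).
    return i_ms_per_bit * max(bits, 0.0)

# A large button near the working area versus a small box far away.
near_large = fitts_time_ms(distance=60, size=40)   # D/S + .5 = 2  -> 1 bit
far_small = fitts_time_ms(distance=248, size=16)   # D/S + .5 = 16 -> 4 bits
print(near_large, far_small)
```

With these made-up dimensions the small, distant target costs four times as much pointing time as the large, nearby one.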
   
We can squeeze an amazing amount of information out of this formula when attempting to speed up an interface. Since motion time goes up with distance, we should arrange the screen with the usual working area near the center, so the mouse will have to move a smaller distance on average from a selected object to a menu or panel. Likewise, any items which are usually used together should be placed together.
   
The most common operations will have the greatest impact on speed, so they should be closest to the working area and perhaps larger than other icons or menu entries. If you want to have all other operations take about the same time, then the targets farthest from the working area should be larger, and those closer may be proportionately smaller.
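
The "same time" rule can be made concrete by solving Fitts's Law for the target size S. The sketch below assumes the 100 msec/bit constant quoted earlier; the function name and the distances are hypothetical.

```python
def size_for_time(distance, target_ms, i_ms_per_bit=100.0):
    """Target size S that gives a chosen pointing time.

    Rearranges T = I * LOG2(D / S + .5) into S = D / (2 ** (T / I) - .5).
    """
    if distance < 0 or target_ms < 0:
        raise ValueError("distance and target_ms must be non-negative")
    ratio = 2.0 ** (target_ms / i_ms_per_bit) - 0.5   # this is D / S
    return distance / ratio

# Two targets that should both take about 200 ms to hit.
near_size = size_for_time(distance=70, target_ms=200)
far_size = size_for_time(distance=350, target_ms=200)
print(near_size, far_size)
```

For a fixed time budget the required size grows in direct proportion to the distance, which is the sense in which closer targets may be "proportionately smaller".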
   
Consider also the implications for dialogs. Small check boxes are out. Large buttons which are easy to hit are in. There should be ample space between selectable items to allow for positioning error. Dangerous options should be widely separated from common selections.
   
   
==Muscles==
   
Anyone who has used the ST Desktop for any period of time has probably noticed that his fingers now know where to find the File menu. This phenomenon is sometimes called "muscle memory", and its rate of onset is given by the Power Law of Practice:
   
<pre>
T(n) = T(1) * n ** (-a)
</pre>
   
where T(n) is the time on the nth trial, T(1) is the time on the first trial, and a is approximately 0.4. (I have appropriated ** from Fortran as an exponentiation operator, since C lacks one.)
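
A quick way to get a feel for the Power Law is to tabulate it. In this sketch the two-second first-trial time is an invented figure; only the exponent of 0.4 comes from the text.

```python
def practice_time(first_trial_time, n, a=0.4):
    """Time on the nth trial: T(n) = T(1) * n ** (-a)."""
    if n < 1:
        raise ValueError("trial numbers start at 1")
    return first_trial_time * n ** (-a)

# Hypothetical menu pick that takes 2 seconds the first time.
for n in (1, 10, 100):
    print(n, round(practice_time(2.0, n), 3))
```

Most of the improvement arrives in the first few dozen repetitions, after which the curve flattens.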
   
The first thing to note about the Power Law is that it only works if a target stays in the same place! This should be a potent argument against rearranging icons, menus, or dialogs without some explicit request by the user. The time to hit a target which moves around arbitrarily will always be T(1)!
   
In many cases, the Power Law will also work for sequences of operations to even greater effect. If you are a touch typist, you can observe this effect by comparing how fast you can enter "the" in comparison to three random letters. We'll come back shortly to consider what we can do to encourage this phenomenon.
   
   
==Eyes==
   
Just as fingers are the way the user sends data to the computer, so the eyes are his channel from the machine. The rate at which information may be passed to the user is determined by the "cycle time" of his visual processor. Experimental results show that this time ranges between 50 and 200 milliseconds.
   
Events separated by 50 milliseconds or less are always perceived as a single event. Those separated by more than 200 milliseconds are always seen as separate. We can use these facts in optimizing use of the computer's power when driving the interface.
   
Suppose your application's interface contains an icon which should be inverted when the mouse passes over it. We now know that flipping it within one twentieth of a second is necessary and sufficient. Therefore, if a "first cut" at the program achieves this performance, there is no need for further optimization, unless you want to interleave other operations. If it falls short, it will be necessary to do some assembly coding to achieve a smooth feel.
   
On the other hand, two actions which you want to appear distinct or convey two different pieces of information must be separated by an absolute minimum of a fifth of a second, even assuming that they occur in an identical location on which the user's attention is already focused.
   
We are able to influence the visual processing rate within the 50 to 200 millisecond range by changing the intensity of the stimulus presented. This can be done with color, by flashing a target, or by more subtle enhancements such as bold face type. For instance, most people using GEM soon become accustomed to the "paper white" background of most windows and dialogs. A dialog which uses a reverse color scheme, white letters on black, is visually shocking in its starkness, and will immediately draw the user's eyes.
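
The two thresholds discussed in this section can be written down as a tiny helper for checking redraw or animation intervals. This is a sketch only; the function and constant names are made up, and the middle band is reported as undecided because, as noted above, it depends on the strength of the stimulus.

```python
FUSE_MS = 50     # at or below this gap, two events read as one
SPLIT_MS = 200   # above this gap, two events always read as separate

def perceived_as(interval_ms):
    """Classify the gap between two screen events using the 50/200 ms limits."""
    if interval_ms < 0:
        raise ValueError("interval must be non-negative")
    if interval_ms <= FUSE_MS:
        return "single event"
    if interval_ms > SPLIT_MS:
        return "separate events"
    return "depends on stimulus intensity"

print(perceived_as(40), "/", perceived_as(120), "/", perceived_as(250))
```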
   
It should be quickly added that stimulus enhancement will only work when it unambiguously draws attention to the target. Three or four blinking objects scattered around the screen are confusing, and worse than no enhancement at all!
==Short-term memory==
   
Both the information gathered by the eyes and movement commands on their way to the hand pass through short-term memory (also called working memory). The amount of information which can be held in short-term memory at any one time is limited. You can demonstrate this limit on yourself by attempting to type a sheet of random numbers by looking back and forth from the numbers to the screen. If you are like most people, you will be able to remember between five and nine numbers at a time. So universal is this finding that it is sometimes called "the magic number seven, plus or minus two".
   
This short-term capacity sets a limit on the number of choices which the user can be expected to grasp at once. It suggests that the number of independent choices in a menu, for instance, should be around seven, and never exceed nine. If this limit is violated, then the user will have to take several glances, with pauses to think, in order to make a choice.
==Chunking==
The effective capacity of short-term memory can be increased when several related items are mentally grouped as a "chunk". Humans automatically adopt this strategy to save themselves time. For instance, random numbers had to be used instead of text in the example above, because people do not type their native language as individual characters. Instead, they combine the letters into words and remember these chunks instead. Put another way, the characters are no longer considered as individual choices.
   
A well designed interface should promote the use of chunking as a strategy by the user. One easy way is to gather together related options in a single place. This is one reason that like commands are grouped into a single menu which is hidden except for its title. If all of the menu options were "in the open", the user would be overwhelmed with dozens of alternatives at once. Instead, a "Show Info" command, for instance, becomes two chunks: pick File menu, then pick Show.
 
   
Sometimes the interface can accomplish the chunking for the user. Consider the difference between a slider bar in a GEM program, and a three digit entry field in a text mode application. Obviously, the GEM user has fewer decisions to make in order to set the associated variable.
   
==Think!==
   
While we are puttering around trying to speed up the keyboard, the mouse, and the screen, the user is actually trying to get some work done. We need to back off now, and look at the ways of thinking, or cognitive processes, that go into accomplishing the job.
   
The user's goal may be to enter and edit a letter, to retrieve information from a database, or simply draw a picture, but it probably has very little to do with programming. In fact, the Problem Space Principle says that the task can be described as a set of states of knowledge, a set of operators and associated constraints for changing the states, and the knowledge to choose the appropriate operator, which resides in the user's head.
   
Those with a background in systems theory can consider this as a somewhat abstract, but straightforward, statement in terms of state variables and operators. A programmer might compare the knowledge states to the values of variables, the operators to arithmetic and logic operations, the constraints to the rules of syntax, and the user's knowledge to the algorithm embodied by a program.
==Are we not men?==
A rational person will try to attain his goals (get the job done) by changing the state of his problem space from its initial state to the goal state. The initial state, for instance, might be a blank word processor screen. The desired final state is to have a completed business letter on the screen.
   
The Rationality Principle says that the user's behavior in typing, mousing, and so on, can be explained by considering the tasks required to achieve the goal, the operators available to carry out the tasks, and the limitations on the user's knowledge, observations, and processing capacity. This sounds like the typical user of a computer program must spend a good deal of time scratching his head and wondering what to do next. In fact, one of Card and Moran's key results is that this is NOT what takes place.
   
What happens, in fact, is that the trained user strikes a sort of "modus vivendi" with his tool and adopts a set of repetitive, trained behavior patterns as the best way to get the job done. He may go so far as to ignore some functions of the program in order to set up a reliable pattern. What we are looking for is a way of measuring and predicting the "quality" of this trained behavior. Since using computers is a human endeavor, we should consider not only the speed with which the task is completed, but the degree of annoyance or pleasure associated with the process.
   
Card and Moran constructed a series of behavioral models which they called GOMS models, for Goals-Operators-Methods-Selection. These models suggested that in the training process the user learned to combine the basic operators in sequences (chunks!) which then became methods for reaching the goals. Then these first level methods might be combined again into second level methods, and so forth, as the learning progressed.
   
The GOMS models were tested in a lengthy series of trials at Xerox PARC using a variety of word processing software. (Among the subjects of these experiments were the inventors of the windowing methods used in GEM!) The results were again surprising: the level of detail in the models was really unimportant!
   
It turned out to be sufficient to merely count up the number of keystrokes, mouse movements, and thought intervals required by each task. After summing up all of the tasks, any extra time for the computer to respond, or the user to move his hands from keyboard to mouse, or eyes from screen to printed page is added in. This simplified version is called the Keystroke-Level Model.
   
As an example of the Keystroke Model, consider the task of changing a mistyped letter on the screen of a GEM word processor. This might be broken down as follows:
 
   
1) find the letter on the screen;
2) move hand to mouse;
3) point to letter;
4) click mouse button;
5) move hand to keyboard;
6) strike "Delete" key;
7) strike key for new character.
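
The seven steps above can be totalled mechanically. The operator times in this sketch are assumptions: they are figures commonly quoted for the Keystroke-Level Model (mental preparation, hand movement between devices, pointing, and a key or button press) and do not appear in this article, and the mapping of each step to an operator is my own reading.

```python
# Assumed operator times in seconds; commonly quoted Keystroke-Level
# Model figures, not values taken from this article.
OPERATOR_SECONDS = {
    "M": 1.35,  # think / locate something on the screen
    "H": 0.40,  # move hand between keyboard and mouse
    "P": 1.10,  # point at a target with the mouse
    "K": 0.20,  # press a key or mouse button
}

# The fix-a-letter task, one operator per step (hypothetical mapping).
FIX_A_LETTER = [
    ("find the letter on the screen", "M"),
    ("move hand to mouse", "H"),
    ("point to letter", "P"),
    ("click mouse button", "K"),
    ("move hand to keyboard", "H"),
    ('strike "Delete" key', "K"),
    ("strike key for new character", "K"),
]

def task_time(steps, times=OPERATOR_SECONDS):
    """Sum the operator times for a list of (description, operator) steps."""
    return sum(times[op] for _, op in steps)

total_seconds = task_time(FIX_A_LETTER)
print(round(total_seconds, 2))
```

Under these assumed figures more than half of the total goes to the thought interval and the two hand movements, not to the keystrokes themselves.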
   
The sufficiency of the Keystroke Model is great news for our attempt to design faster interfaces. It says we can concentrate our efforts on minimizing the number of total actions to be taken, and making sure that each action is as fast as possible. We have already discussed some ways to speed up the mouse and keyboard actions, so let's now consider how to speed up the thought intervals, and cut the number of actions.
   
One way to cut down "think time" is to make sure that the capacity of short-term memory is not exceeded during the course of a task. For example, the fix-a-letter task described above required the user to remember
   
1) his place in the overall job of typing the document;
2) the task he is about to perform;
3) where the bad character appeared, and
4) what the new character was.
   
When this total of items creeps toward seven, the user often loses his place and commits errors.
   
You can appreciate the ubiquity of this problem by considering how many times you have made mistakes nesting parentheses, or had to go back to count them, because too many things happened while typing the line to remember the nesting levels. The moral is that operations with long strings of operands should be avoided when designing an interface.
   
The single most important factor in making an interface comfortable to use is increasing its predictability, and decreasing the amount of indecision present at each step during a task. There is (inevitably) an Uncertainty Principle which relates the number of choices at each step to the associated time for thought:
<pre>
T = I * LOG2 ( N + 1)
</pre>

where LOG2 is the binary logarithm function, N is the number of equally probable choices, and I is a constant of approximately 140 msec/bit. When the alternatives are not equally probable, the function is more complex:
<pre>
T = I * SUM-FOR-i-FROM-1-TO-N (P(i) * LOG2( 1 / P(i) + 1) )
</pre>
   
where the P(i) are the probabilities of each of the choices (which must sum to one). (SUM-FOR-i... is the best I can do for a sigma operator on-line!) Those of you with some information theory background will recognize this formula as the entropy of the decision; we'll come back to that later.
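
Both forms of the formula are easy to evaluate, which makes it possible to compare a flat menu against one where a single choice clearly dominates. The function names and the example probabilities below are invented for illustration; the 140 msec/bit constant is the one quoted above.

```python
import math

def decision_time_equal(n, i_ms_per_bit=140.0):
    """T = I * LOG2(N + 1) for N equally probable choices."""
    if n < 1:
        raise ValueError("need at least one choice")
    return i_ms_per_bit * math.log2(n + 1)

def decision_time_weighted(probs, i_ms_per_bit=140.0):
    """T = I * SUM(P(i) * LOG2(1 / P(i) + 1)) for unequal choices."""
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to one")
    # Choices with zero probability contribute nothing to the sum.
    bits = sum(p * math.log2(1.0 / p + 1.0) for p in probs if p > 0)
    return i_ms_per_bit * bits

flat = decision_time_equal(4)                          # four equal choices
skewed = decision_time_weighted([0.7, 0.1, 0.1, 0.1])  # one likely choice
print(round(flat, 1), round(skewed, 1))
```

When every P(i) equals 1/N the weighted form reduces to the simpler one, so the two functions agree on a flat menu. With one choice made much more likely, the estimated thinking time drops.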
   
So what can we learn from this hash? It turns out, as we might expect, that we can decrease the decision time by making some of the user's choices more probable than others. We do that by means of feedback cues from the interface.
   
The importance of reliable, continuous, meaningful feedback cannot be emphasized enough. It helps the beginner learn the system, and its predictability makes the program comfortable for the expert. Programs with no feedback, or unreliable cues, produce confusion, dissonance, and frustration in the user.
   
This principle is so important that I am going to give several examples from common GEM practice. The Desktop provides several instances. When an object is selected and a menu drops down, only those choices which are legal for the object are in black. The others are dimmed to grey, and are therefore removed from the decision. When a pick is made from the menu, the bar entry remains black until the operation is complete, reassuring the user that the correct choice was made. In both the Desktop and the RCS, items which are double-clicked open up with a "zoom box" from the object, again showing that the right object was picked.
   
Other techniques are useful when operator icons are exposed on the screen. When an object is picked, the legal operations might be outlined, or the bad choices might be dimmed. If the screen flashing produced by this is objectionable, the legal icons can be made mouse sensitive, so they will "light up" when the cursor passes over - again showing the user which choices are legal.
   
The desire for feedback is so strong that it should be provided even while the computer is doing an operation on its own. The hour glass mouse form is a primitive example of this. More sophisticated are "progress indicators" such as animated thermometer bars, clocks, or text displays of the processing steps. The ST Desktop provides examples in the Format and Disk Copy functions. The purpose of all of these is to reassure the user that the operation is progressing normally. Their lack can lead to amusing spectacles such as secretaries leaning over to hear if their disk drives are working!
   
Another commonly overlooked feature is error prevention and correction. Card and Moran's results showed that in order to go faster, people will tolerate error rates of up to 30% in their work. Any program which does not give a fast way to fix mistakes will be frustrating indeed!
 
   
The best way to cope with an error is to "make it didn't happen", to quote a common child's phrase. The same feedback methods discussed above are also effective in preventing the user from picking inappropriate combinations of objects and operations. Replacement of numeric type-ins with sliders or other visual controls eliminates the common "Range Error". The use of radio buttons prevents the user from picking incompatible options. When such techniques are used consistently, the beginner also gains confidence that he may explore the program without blundering into errors.
   
Once an error has occurred, the best solution is to have an "inverse operation" immediately available. For instance, the way to fix a bad character is to hit the backspace key. If a line is inadvertently deleted, there should be a way to restore it.
   
Sometimes the mechanics of providing true inverses are impractical, or end up cluttering the interface themselves. In these cases, a global "Undo" command should be provided to reverse the effect of the last operation, no matter what it was.
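
One common way to build such a command is to record an inverse for every operation as it is performed. The sketch below is an illustration only: the class and function names are hypothetical, and keeping a whole stack of inverses goes a step beyond the single-level "last operation" Undo described above.

```python
class UndoStack:
    """Remembers how to reverse each operation, most recent first."""

    def __init__(self):
        self._inverses = []

    def do(self, action, inverse):
        """Run `action` now and remember `inverse` for a later undo."""
        action()
        self._inverses.append(inverse)

    def undo(self):
        """Reverse the most recent operation; return False if none is left."""
        if not self._inverses:
            return False
        self._inverses.pop()()
        return True

lines = ["Dear Sir,", "Thank you for your letter.", "Yours truly,"]
history = UndoStack()

def delete_line(index):
    removed = lines[index]          # captured so the inverse can restore it
    history.do(lambda: lines.pop(index),
               lambda: lines.insert(index, removed))

delete_line(1)
after_delete = list(lines)
restored = history.undo()           # puts the deleted line back
print(after_delete, lines)
```

The editing code never needs to know what the last operation was; it only asks the stack to reverse it.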
   
   
==Of modes and bandwidths==
Now I am going to depart from the Card, Newell and Moran thread of discussion to consider how we can minimize the number of operations in a task by altering the modes of the interface. Although "no modes" has been a watchword of Macintosh developers, the term may need definition for Atarians.
   
Simply stated, a mode exists any time you cannot get to all of the capabilities of the program without taking some intermediate step. Familiar examples are old-style "menu-driven" programs, in which the user must make selections from a number of nested menus in order to perform any operation. The options of any one menu are unavailable from the others.
   
Recall that the user is trying to accomplish work in his own problem space, by altering its states. A mode in the program adds additional states to the problem space, which he is forced to consider in order to get the job done. We might call an interface which is completely modeless "transparent", because it adds no states between the user and his work. One of the best examples of a transparent program is the 15-puzzle in the Macintosh desk accessory set. The problem space of rearranging the tiles is identical between the program and a physical puzzle.
   
Unfortunately, most programmers find themselves forced to put modes of some sort into their programs. These often arise due to technological limitations, such as memory space, screen "real estate", or performance limitations of peripherals. The question is how the modes can be made least offensive.
   
I will make the general claim that the frustration which a mode produces is directly proportional to the amount of the user's bandwidth which it consumes. In other words, we need to consider how many keystrokes, mouse clicks, eye movements, and so on, are going into manipulating the true problem states, and how many are being absorbed by the modes of the program. If the interface is wasting a large amount of the user's effort, it will be perceived as slow and annoying.
   
  +
Here we can consider again the hierarchy of goals and methods which the user employs. When the mode is low in the hierarchy, and close to the user's "fingertips", it is encountered the most frequently. For instance, consider how frustrating it would be to have to hit a function key before typing in each character!
   
  +
The "menu-driven" style of programs mentioned above are almost as bad, since usually only one piece of information is collected at each menu. Such a program becomes a labyrinth of states better suited to an adventure game!
ARE WE NOT MEN ?
 
   
  +
The least offensive modes are found at the higher, goal related levels of the hierarchy. The better they align with changes in the state of the original problem, the more they are tolerated. For example, a word processing program might have one screen layout for program editing, another for writing letters, and yet another while printing the documents. A multi-function business package might have one set of menus for the spreadsheet, another for a graphing module, and a third for a database.
A rational person will try to attain his goals (get the job
 
done) by changing the state of his problem space from its
 
initial state to the goal state. The initial state, for
 
instance, might be a blank word processor screen. The desired
 
final state is to have a completed business letter on the
 
screen.
 
   
  +
In some cases the problem solved by the program has convenient "fracture lines" which can be used to define the modes. An example in my own past is the RCS, where the editing of each type of resource tree forms its own mode, with each of the modes nested within the overall mode and problem of composing the entire resource tree.
The Rationality Principle says that the user's behavior in
 
typing, mousing, and so on, can be explained by considering the
 
tasks required to achieve the goal, the operators available to
 
carry out the tasks, and the limitations on the user's
 
knowledge, observations, and processing capacity. This sounds
 
like the typical user of a computer program must spend a good
 
deal of time scratching his head and wondering what to do next.
 
In fact, one of Card and Moran's key results is that this is NOT
 
what takes place.
 
   
   
+
==To do is to be!==
   
  +
Any narrative description of user interface is bound to be lacking. There is no way text can convey the vibrancy and tactile pleasure of a good interface, or the sullen boredom of a bad one. Therefore, I encourage you to experiment. Get out your favorite arcade game and see if you can spot some of the elements I have described. Dig into your slush pile for the most annoying program you have ever seen, run it and see if you can see mistakes. How would you fix them? Then... go do it to your own program!
   
Professional GEM Part VIII 62
 
   
  +
==Amen...==
   
  +
This concludes the sermon. I'd like some Feedback as to whether you found this Boring Beyond Belief or Really Hot Stuff. If enough people are interested, homily number two will appear a few episodes from now. The very next installment of ST PRO GEM will go back to basics to explore VDI drawing primitives. In the meantime, you might investigate some of the Good Books on interface design referenced below.
   
What happens, in fact, is that the trained user strikes a
 
sort of "modus vivendi" with his tool and adopts a set of
 
repetitive, trained behavior patterns as the best way to get the
 
job done. He may go so far as to ignore some functions of the
 
program in order to set up a reliable pattern. What we are
 
looking for is a way of measuring and predicting the "quality"
 
of this trained behavior. Since using computers is a human
 
endeavor, we should consider not only the speed with which the
 
task is completed, but the degree of annoyance or pleasure
 
associated with the process.
 
   
  +
==References==
Card and Moran constructed a series of behavioral models
 
which they called GOMS models, for
 
Goals-Operators-Methods-Selection. These models suggested that
 
in the training process the user learned to combine the basic
 
operators in sequences (chunks!) which then became methods for
 
reaching the goals. Then these first level methods might be
 
combined again into second level methods, and so forth, as the
 
learning progressed.
 
   
  +
* Stuart K. Card, Thomas P. Moran, and Allen Newell, THE PSYCHOLOGY OF HUMAN-COMPUTER INTERACTION, Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1983. (Fundamental and indispensible. The volume of experimental results make it weighty. The Good Parts are at the beginning and end.)
The GOMS models were tested in a lengthy series of trials
 
at Xerox PARC using a variety of word processing software.
 
(Among the subjects of these experiments were the inventors of
 
the windowing methods used in GEM!) The results were again
 
surprising: the level of detail in the models was really
 
unimportant!
 
   
  +
* "Macintosh User Interface Guidelines", in INSIDE MACINTOSH, Apple Computer, Inc., 1984. (Yes, Atarians, we have something to learn here. Though not everything "translates", this is a fine piece of principled design work. Read and appreciate.)
It turned out to be sufficient to merely count up the
 
number of keystrokes, mouse movements, and thought intervals
 
required by each task. After summing up all of the tasks, any
 
extra time for the computer to respond, or the user to move his
 
hands from keyboard to mouse, or eyes from screen to printed
 
page is added in. This simplified version is called the
 
Keystroke-Level Model.
 
   
  +
* James D. Foley, Victor L. Wallace, and Peggy Chan, "The Human Factors of Computer Graphics Interaction Techniques", IEEE Computer Graphics (CG & A), November 1984, pp. 13-48. (A good overview, including higher level topics which I have postponed to a later article. Excellent bibliography.)
As an example of the Keystroke Model, consider the task of
 
changing a mistyped letter on the screen of a GEM word
 
processor. This might be broken down as follows: 1) find the
 
letter on the screen; 2) move hand to mouse; 3) point to letter;
 
4) click mouse button; 5) move hand to keyboard; 6) strike
 
"Delete" key; 7) strike key for new character.
 
   
  +
* J. D. Foley and A. Van Dam, FUNDAMENTALS OF INTERACTIVE COMPUTER GRAPHICS, Addison Wesley, 1984, Chapters 5 and 6. (If you can't get the article above, read this. If you are designing graphics apps, buy the whole book! Staggering bibliography.)
 
 
So what can we learn from this hash? It turns out, as we
might expect, that we can decrease the decision time by making
some of the user's choices more probable than others. We do
that by means of feedback cues from the interface.

The importance of reliable, continuous, meaningful feedback
cannot be emphasized enough. It helps the beginner learn the
system, and its predictability makes the program comfortable for
the expert. Programs with no feedback, or unreliable cues,
produce confusion, dissonance, and frustration in the user.

This principle is so important that I am going to give several
examples from common GEM practice. The Desktop provides several
instances. When an object is selected and a menu drops down,
only those choices which are legal for the object are in black.
The others are dimmed to grey, and are therefore removed from
the decision. When a pick is made from the menu, the bar entry
remains black until the operation is complete, reassuring the
user that the correct choice was made. In both the Desktop and
the RCS, items which are double-clicked open up with a "zoom
box" from the object, again showing that the right object was
picked.

Other techniques are useful when operator icons are exposed
on the screen. When an object is picked, the legal operations
might be outlined, or the bad choices might be dimmed. If the
screen flashing produced by this is objectionable, the legal
icons can be made mouse sensitive, so they will "light up" when
the cursor passes over - again showing the user which choices
are legal.

The desire for feedback is so strong that it should be
provided even while the computer is doing an operation on its
own. The hour glass mouse form is a primitive example of this.
More sophisticated are "progress indicators" such as animated
thermometer bars, clocks, or text displays of the processing
steps. The ST Desktop provides examples in the Format and Disk
Copy functions. The purpose of all of these is to reassure the
user that the operation is progressing normally. Their lack can
lead to amusing spectacles such as secretaries leaning over to
hear if their disk drives are working!

Another commonly overlooked feature is error prevention and
correction. Card and Moran's results showed that in order to go
faster, people will tolerate error rates of up to 30% in their
work. Any program which does not give a fast way to fix
mistakes will be frustrating indeed!

The best way to cope with an error is to "make it didn't
happen", to quote a common child's phrase. The same feedback
methods discussed above are also effective in preventing the
user from picking inappropriate combinations of objects and
operations. Replacement of numeric type-ins with sliders or
other visual controls eliminates the common "Range Error". The
use of radio buttons prevents the user from picking incompatible
options. When such techniques are used consistently, the
beginner also gains confidence that he may explore the program
without blundering into errors.

Once an error has occurred, the best solution is to have an
"inverse operation" immediately available. For instance, the
way to fix a bad character is to hit the backspace key. If a
line is inadvertently deleted, there should be a way to restore
it.

Sometimes the mechanics of providing true inverses are
impractical, or end up cluttering the interface themselves. In
these cases, a global "Undo" command should be provided to
reverse the effect of the last operation, no matter what it
was.


OF MODES AND BANDWIDTH

Now I am going to depart from the Card, Newell and Moran
thread of discussion to consider how we can minimize the number
of operations in a task by altering the modes of the interface.
Although "no modes" has been a watchword of Macintosh
developers, the term may need definition for Atarians.

Simply stated, a mode exists any time you cannot get to all
of the capabilities of the program without taking some
intermediate step. Familiar examples are old-style
"menu-driven" programs, in which the user must make selections
from a number of nested menus in order to perform any operation.
The options of any one menu are unavailable from the others.
 
 
</pre>
 
   
* Ben Shneiderman, "Direct Manipulation: A Step Beyond Programming Languages", IEEE Computer, August 1983, pp. 57-69. (What do Pacman and Visicalc have in common? Shneiderman's analysis is vital to creating hot interfaces.)
 

Latest revision as of 23:16, 12 October 2006

Professional GEM


And now for something completely different!

In response to a number of requests, this installment of ST PRO GEM will be devoted to examining a few of the principles of computer/human interface design, or "religion" as some would have it. I'm going to start with basic ergonomic laws, and try to draw some conclusions which are fairly specific to designing for the ST. If this article meets with general approval, further "homilies" may appear at irregular intervals as part of the ST PRO GEM series.

For those who did NOT ask for this topic, it seems fair to explain why your diet of hard-core technical information has been interrupted by a sermon! As a motivator, we might consider why some programs are said by reviewers to have a "hot" feel (and hence sell well!) while others are "confusing" or "boring".

Alan Kay has said that "user interface is theatre". I think we may be able to take it further, and suggest that a successful program works a bit of magic, persuading the user to suspend his disbelief and enter an imaginary world behind the screen, whether it is the mathematical world of a spreadsheet, or the land of Pacman pursued by ghosts.

A reader of a novel or science fiction story also suspends disbelief to participate in the work. Bad grammar and clumsy plotting by the author are jarring, and break down the illusion. Similarly, a programmer who fails to pay attention to making his interface fast and consistent will annoy the user, and distract him from whatever care has been lavished on the functional core of the program.


Credit where it's due

Before launching into the discussion of user interface, I should mention that the general treatment and many of the specific research results are drawn from Card, Newell, and Moran's landmark book on the topic, which is cited at the end of the article. Any errors in interpretation and application to GEM and the ST are entirely my own, however.


Fingertips

We'll start right at the user's fingers with the basic equation governing positioning of the mouse, Fitts's Law, which is given as

 T = I * LOG2( D / S + .5)

where T is the amount of time to move to a target, D is the distance of the target from the current position, and S is the size of the target, stated in equivalent units. LOG2 is the base 2 (binary) logarithm function, and I is a proportionality constant, about 100 milliseconds per bit, which corresponds to the human's "clock rate" for making incremental movements.

We can squeeze an amazing amount of information out of this formula when attempting to speed up an interface. Since motion time goes up with distance, we should arrange the screen with the usual working area near the center, so the mouse will have to move a smaller distance on average from a selected object to a menu or panel. Likewise, any items which are usually used together should be placed together.

The most common operations will have the greatest impact on speed, so they should be closest to the working area and perhaps larger than other icons or menu entries. If you want to have all other operations take about the same time, then the targets farthest from the working area should be larger, and those closer may be proportionately smaller.

Consider also the implications for dialogs. Small check boxes are out. Large buttons which are easy to hit are in. There should be ample space between selectable items to allow for positioning error. Dangerous options should be widely separated from common selections.


Muscles

Anyone who has used the ST Desktop for any period of time has probably noticed that his fingers now know where to find the File menu. This phenomenon is sometimes called "muscle memory", and its rate of onset is given by the Power Law of Practice:

 T(n) = T(1) * n ** (-a)

where T(n) is the time on the nth trial, T(1) is the time on the first trial, and a is approximately 0.4. (I have appropriated ** from Fortran as an exponentiation operator, since C lacks one.)

The first thing to note about the Power Law is that it only works if a target stays in the same place! This should be a potent argument against rearranging icons, menus, or dialogs without some explicit request by the user. The time to hit a target which moves around arbitrarily will always be T(1)!

In many cases, the Power Law will also work for sequences of operations to even greater effect. If you are a touch typist, you can observe this effect by comparing how fast you can enter "the" in comparison to three random letters. We'll come back shortly to consider what we can do to encourage this phenomenon.


Eyes

Just as fingers are the way the user sends data to the computer, so the eyes are his channel from the machine. The rate at which information may be passed to the user is determined by the "cycle time" of his visual processor. Experimental results show that this time ranges between 50 and 200 milliseconds.

Events separated by 50 milliseconds or less are always perceived as a single event. Those separated by more than 200 milliseconds are always seen as separate. We can use these facts in optimizing use of the computer's power when driving the interface.

Suppose your application's interface contains an icon which should be inverted when the mouse passes over it. We now know that flipping it within one twentieth of a second is necessary and sufficient. Therefore, if a "first cut" at the program achieves this performance, there is no need for further optimization, unless you want to interleave other operations. If it falls short, it will be necessary to do some assembly coding to achieve a smooth feel.

On the other hand, two actions which you want to appear distinct or convey two different pieces of information must be separated by an absolute minimum of a fifth of a second, even assuming that they occur in an identical location on which the user's attention is already focused.
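The two thresholds can be written down as a small C helper for checking the timing of an interface. The names here (`classify_gap`, `perception_t`, and the constants) are hypothetical, and treating everything between the two limits as "ambiguous" is simply a reading of the 50 to 200 millisecond range given above.

```c
#include <assert.h>

/* Thresholds quoted in the text, in milliseconds. */
#define FUSE_MS      50    /* this close or closer: always seen as one event */
#define SEPARATE_MS  200   /* farther apart than this: always seen as two    */

typedef enum {
    EVENTS_FUSED,      /* perceived as a single event            */
    EVENTS_AMBIGUOUS,  /* depends on the user and the stimulus   */
    EVENTS_DISTINCT    /* perceived as two separate events       */
} perception_t;

/* Classify the gap between two screen events, given in milliseconds. */
perception_t classify_gap(int gap_ms)
{
    assert(gap_ms >= 0);

    if (gap_ms <= FUSE_MS)
        return EVENTS_FUSED;
    if (gap_ms > SEPARATE_MS)
        return EVENTS_DISTINCT;
    return EVENTS_AMBIGUOUS;
}
```

An icon inversion should land in the first category, and two messages meant to be read separately should land in the last.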

We are able to influence the visual processing rate within the 50 to 200 millisecond range by changing the intensity of the stimulus presented. This can be done with color, by flashing a target, or by more subtle enhancements such as bold face type. For instance, most people using GEM soon become accustomed to the "paper white" background of most windows and dialogs. A dialog which uses a reverse color scheme, white letters on black, is visually shocking in its starkness, and will immediately draw the user's eyes.

It should be quickly added that stimulus enhancement will only work when it unambiguously draws attention to the target. Three or four blinking objects scattered around the screen are confusing, and worse than no enhancement at all!


Short-term memory

Both the information gathered by the eyes and movement commands on their way to the hand pass through short-term memory (also called working memory). The amount of information which can be held in short-term memory at any one time is limited. You can demonstrate this limit on yourself by attempting to type a sheet of random numbers by looking back and forth from the numbers to the screen. If you are like most people, you will be able to remember between five and nine numbers at a time. So universal is this finding that it is sometimes called "the magic number seven, plus or minus two".

This short-term capacity sets a limit on the number of choices which the user can be expected to grasp at once. It suggests that the number of independent choices in a menu, for instance, should be around seven, and never exceed nine. If this limit is violated, then the user will have to take several glances, with pauses to think, in order to make a choice.


Chunking

The effective capacity of short-term memory can be increased when several related items are mentally grouped as a "chunk". Humans automatically adopt this strategy to save themselves time. For instance, random numbers had to be used instead of text in the example above, because people do not type their native language as individual characters. Instead, they combine the letters into words and remember these chunks instead. Put another way, the characters are no longer considered as individual choices.

A well designed interface should promote the use of chunking as a strategy by the user. One easy way is to gather together related options in a single place. This is one reason that like commands are grouped into a single menu which is hidden except for its title. If all of the menu options were "in the open", the user would be overwhelmed with dozens of alternatives at once. Instead, a "Show Info" command, for instance, becomes two chunks: pick File menu, then pick Show.

Sometimes the interface can accomplish the chunking for the user. Consider the difference between a slider bar in a GEM program, and a three digit entry field in a text mode application. Obviously, the GEM user has fewer decisions to make in order to set the associated variable.


Think!

While we are puttering around trying to speed up the keyboard, the mouse, and the screen, the user is actually trying to get some work done. We need to back off now, and look at the ways of thinking, or cognitive processes, that go into accomplishing the job.

The user's goal may be to enter and edit a letter, to retrieve information from a database, or simply draw a picture, but it probably has very little to do with programming. In fact, the Problem Space Principle says that the task can be described as a set of states of knowledge, a set of operators and associated constraints for changing the states, and the knowledge to choose the appropriate operator, which resides in the user's head.

Those with a background in systems theory can consider this as a somewhat abstract, but straightforward, statement in terms of state variables and operators. A programmer might compare the knowledge states to the values of variables, the operators to arithmetic and logic operations, the constraints to the rules of syntax, and the user's knowledge to the algorithm embodied by a program.


Are we not men?

A rational person will try to attain his goals (get the job done) by changing the state of his problem space from its initial state to the goal state. The initial state, for instance, might be a blank word processor screen. The desired final state is to have a completed business letter on the screen.

The Rationality Principle says that the user's behavior in typing, mousing, and so on, can be explained by considering the tasks required to achieve the goal, the operators available to carry out the tasks, and the limitations on the user's knowledge, observations, and processing capacity. This sounds like the typical user of a computer program must spend a good deal of time scratching his head and wondering what to do next. In fact, one of Card and Moran's key results is that this is NOT what takes place.

What happens, in fact, is that the trained user strikes a sort of "modus vivendi" with his tool and adopts a set of repetitive, trained behavior patterns as the best way to get the job done. He may go so far as to ignore some functions of the program in order to set up a reliable pattern. What we are looking for is a way of measuring and predicting the "quality" of this trained behavior. Since using computers is a human endeavor, we should consider not only the speed with which the task is completed, but the degree of annoyance or pleasure associated with the process.

Card and Moran constructed a series of behavioral models which they called GOMS models, for Goals-Operators-Methods-Selection. These models suggested that in the training process the user learned to combine the basic operators in sequences (chunks!) which then became methods for reaching the goals. Then these first level methods might be combined again into second level methods, and so forth, as the learning progressed.

The GOMS models were tested in a lengthy series of trials at Xerox PARC using a variety of word processing software. (Among the subjects of these experiments were the inventors of the windowing methods used in GEM!) The results were again surprising: the level of detail in the models was really unimportant!

It turned out to be sufficient to merely count up the number of keystrokes, mouse movements, and thought intervals required by each task. After summing up all of the tasks, any extra time for the computer to respond, or the user to move his hands from keyboard to mouse, or eyes from screen to printed page is added in. This simplified version is called the Keystroke-Level Model.

As an example of the Keystroke Model, consider the task of changing a mistyped letter on the screen of a GEM word processor. This might be broken down as follows:

1) find the letter on the screen;
2) move hand to mouse;
3) point to letter;
4) click mouse button;
5) move hand to keyboard;
6) strike "Delete" key;
7) strike key for new character.

The sufficiency of the Keystroke Model is great news for our attempt to design faster interfaces. It says we can concentrate our efforts on minimizing the number of total actions to be taken, and making sure that each action is as fast as possible. We have already discussed some ways to speed up the mouse and keyboard actions, so let's now consider how to speed up the thought intervals, and cut the number of actions.

One way to cut down "think time" is to make sure that the capacity of short-term memory is not exceeded during the course of a task. For example, the fix-a-letter task described above required the user to remember

1) his place in the overall job of typing the document; 2) the task he is about to perform; 3) where the bad character appeared, and 4) what the new character was.

When this total of items creeps toward seven, the user often loses his place and commits errors.

You can appreciate the ubiquity of this problem by considering how many times you have made mistakes nesting parentheses, or had to go back to count them, because too many things happened while typing the line to remember the nesting levels. The moral is that operations with long strings of operands should be avoided when designing an interface.

The single most important factor in making an interface comfortable to use is increasing its predictability, and decreasing the amount of indecision present at each step during a task. There is (inevitably) an Uncertainty Principle which relates the number of choices at each step to the associated time for thought:

 T = I * LOG2 ( N + 1)

where LOG2 is the binary logarithm function, N is the number of equally probable choices, and I is a constant of approximately 140 msec/bit. When the alternatives are not equally probable, the function is more complex:

 T = I * SUM-FOR-i-FROM-1-TO-N (P(i) * LOG2( 1 / P(i) + 1) )

where the P(i) are the probabilities of each of the choices (which must sum to one). (SUM-FOR-i... is the best I can do for a sigma operator on-line!) Those of you with some information theory background will recognize this formula as the entropy of the decision; we'll come back to that later.

So what can we learn from this hash? It turns out, as we might expect, that we can decrease the decision time by making some of the user's choices more probable than others. We do that by means of feedback cues from the interface.

The importance of reliable, continuous, meaningful feedback cannot be emphasized enough. It helps the beginner learn the system, and its predictability makes the program comfortable for the expert. Programs with no feedback, or unreliable cues, produce confusion, dissonance, and frustration in the user.

This principle is so important that I am going to give several examples from common GEM practice. The Desktop provides several instances. When an object is selected and a menu drops down, only those choices which are legal for the object are in black. The others are dimmed to grey, and are therefore removed from the decision. When a pick is made from the menu, the bar entry remains black until the operation is complete, reassuring the user that the correct choice was made. In both the Desktop and the RCS, items which are double-clicked open up with a "zoom box" from the object, again showing that the right object was picked.
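
The dimming logic itself is simple bookkeeping. The sketch below models it with plain C structures so that it stands alone; the selection kinds, the MenuEntry type and the demo_menu table are all hypothetical, and it does not reproduce how the Desktop is actually written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kinds of object the user may have selected. */
enum { SEL_NONE = 0, SEL_FILE = 1, SEL_FOLDER = 2, SEL_DISK = 4 };

typedef struct {
    const char *label;
    unsigned    legal_for;  /* bitmask of SEL_* kinds this entry accepts */
    int         enabled;    /* 1 = drawn in black, 0 = dimmed to grey */
} MenuEntry;

/* Recompute every entry's enabled flag for the current selection.
 * Returns the number of entries left enabled. */
size_t menu_update(MenuEntry *menu, size_t n, unsigned selection)
{
    size_t i, live = 0;

    for (i = 0; i < n; i++) {
        menu[i].enabled = (menu[i].legal_for & selection) != 0;
        live += (size_t)menu[i].enabled;
        /* In a real GEM program this is the point at which the flag
         * would be applied to the menu tree, for example with the
         * AES call menu_ienable(). */
    }
    return live;
}

/* Illustrative entries only. */
static MenuEntry demo_menu[] = {
    { "Open",      SEL_FILE | SEL_FOLDER | SEL_DISK, 0 },
    { "Show Info", SEL_FILE | SEL_FOLDER | SEL_DISK, 0 },
    { "Format",    SEL_DISK,                         0 }
};
```

Running menu_update() every time the selection changes, rather than when the menu drops, keeps the cue reliable: the user never sees a black entry that turns out to be illegal.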

Other techniques are useful when operator icons are exposed on the screen. When an object is picked, the legal operations might be outlined, or the bad choices might be dimmed. If the screen flashing produced by this is objectionable, the legal icons can be made mouse sensitive, so they will "light up" when the cursor passes over - again showing the user which choices are legal.

The desire for feedback is so strong that it should be provided even while the computer is doing an operation on its own. The hour glass mouse form is a primitive example of this. More sophisticated are "progress indicators" such as animated thermometer bars, clocks, or text displays of the processing steps. The ST Desktop provides examples in the Format and Disk Copy functions. The purpose of all of these is to reassure the user that the operation is progressing normally. Their lack can lead to amusing spectacles such as secretaries leaning over to hear if their disk drives are working!
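
A thermometer bar needs very little arithmetic. Here is a minimal sketch, assuming the program can count units of work done (tracks, files, records) against a known total; the function name and parameters are invented for the example.

```c
#include <assert.h>

/* Width in pixels of the filled part of a progress bar that is
 * bar_width pixels wide, after `done` of `total` units of work.
 * Out-of-range counts are clamped so the bar never overshoots. */
int progress_fill(long done, long total, int bar_width)
{
    assert(total > 0 && bar_width >= 0);

    if (done < 0)
        done = 0;
    if (done > total)
        done = total;

    /* Multiply before dividing to keep precision. The product is
     * assumed to fit in a long, which holds for screen-sized bars. */
    return (int)((done * bar_width) / total);
}
```

Redrawing only when the returned width changes keeps the indicator cheap enough to update inside the work loop.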

Another commonly overlooked feature is error prevention and correction. Card and Moran's results showed that in order to go faster, people will tolerate error rates of up to 30% in their work. Any program which does not give a fast way to fix mistakes will be frustrating indeed!

The best way to cope with an error is to "make it didn't happen", to quote a common child's phrase. The same feedback methods discussed above are also effective in preventing the user from picking inappropriate combinations of objects and operations. Replacement of numeric type-ins with sliders or other visual controls eliminates the common "Range Error". The use of radio buttons prevents the user from picking incompatible options. When such techniques are used consistently, the beginner also gains confidence that he may explore the program without blundering into errors.
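
To see why a slider cannot produce a range error, consider how its position is turned into a value. The sketch below assumes a slider whose position runs from 0 to 1000; that scale, and the name slider_to_value, are assumptions made for the example.

```c
#include <assert.h>

#define SLIDER_MAX 1000   /* ASSUMED scale: positions run 0..1000 */

/* Map a slider position onto the legal range lo..hi, rounding to
 * the nearest integer. Whatever position comes in, the result is a
 * legal value, so no "Range Error" alert is ever needed. */
int slider_to_value(int pos, int lo, int hi)
{
    assert(lo <= hi);

    if (pos < 0)
        pos = 0;
    if (pos > SLIDER_MAX)
        pos = SLIDER_MAX;

    return lo + (int)(((long)(hi - lo) * pos + SLIDER_MAX / 2) / SLIDER_MAX);
}
```

A type-in field for the same setting would need validation, an alert, and a way to recover; here the illegal states simply cannot be expressed.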

Once an error has occurred, the best solution is to have an "inverse operation" immediately available. For instance, the way to fix a bad character is to hit the backspace key. If a line is inadvertently deleted, there should be a way to restore it.

Sometimes the mechanics of providing true inverses are impractical, or end up cluttering the interface themselves. In these cases, a global "Undo" command should be provided to reverse the effect of the last operation, no matter what it was.
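
One simple way to get a global Undo is to snapshot the document before every operation and restore the snapshot on request. The sketch below shows the idea on a tiny fixed-size text buffer; the Doc type and its functions are hypothetical, and a real editor would want something more economical than copying the whole document.

```c
#include <assert.h>
#include <string.h>

#define DOC_MAX 256

typedef struct {
    char text[DOC_MAX];   /* current contents */
    char saved[DOC_MAX];  /* contents before the last operation */
    int  can_undo;        /* 1 if saved[] holds a valid snapshot */
} Doc;

void doc_init(Doc *d, const char *s)
{
    assert(strlen(s) < DOC_MAX);
    strcpy(d->text, s);
    d->saved[0] = '\0';
    d->can_undo = 0;
}

/* Called at the start of every editing operation. */
static void doc_checkpoint(Doc *d)
{
    strcpy(d->saved, d->text);
    d->can_undo = 1;
}

/* Example operation: delete the character at index pos. */
void doc_delete_char(Doc *d, size_t pos)
{
    size_t len = strlen(d->text);

    if (pos >= len)
        return;                   /* nothing deleted, nothing to undo */
    doc_checkpoint(d);
    /* Shift the tail left by one, including the terminating NUL. */
    memmove(d->text + pos, d->text + pos + 1, len - pos);
}

/* Global Undo: reverse the last operation, whatever it was.
 * Returns 1 if something was undone, 0 if there was nothing to undo. */
int doc_undo(Doc *d)
{
    if (!d->can_undo)
        return 0;
    strcpy(d->text, d->saved);
    d->can_undo = 0;
    return 1;
}
```

Because every operation goes through doc_checkpoint(), new commands get Undo for free, which is what makes the single global command attractive when individual inverses are impractical.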


Of modes and bandwidths

Now I am going to depart from the Card, Newell and Moran thread of discussion to consider how we can minimize the number of operations in a task by altering the modes of the interface. Although "no modes" has been a watchword of Macintosh developers, the term may need definition for Atarians.

Simply stated, a mode exists any time you cannot get to all of the capabilities of the program without taking some intermediate step. Familiar examples are old-style "menu-driven" programs, in which the user must make selections from a number of nested menus in order to perform any operation. The options of any one menu are unavailable from the others.

Recall that the user is trying to accomplish work in his own problem space, by altering its states. A mode in the program adds additional states to the problem space, which he is forced to consider in order to get the job done. We might call an interface which is completely modeless "transparent", because it adds no states between the user and his work. One of the best examples of a transparent program is the 15-puzzle in the Macintosh desk accessory set. The problem space of rearranging the tiles is identical between the program and a physical puzzle.

Unfortunately, most programmers find themselves forced to put modes of some sort into their programs. These often arise due to technological limitations, such as memory space, screen "real estate", or performance limitations of peripherals. The question is how the modes can be made least offensive.

I will make the general claim that the frustration which a mode produces is directly proportional to the amount of the user's bandwidth which it consumes. In other words, we need to consider how many keystrokes, mouse clicks, eye movements, and so on, are going into manipulating the true problem states, and how many are being absorbed by the modes of the program. If the interface is wasting a large amount of the user's effort, it will be perceived as slow and annoying.

Here we can consider again the hierarchy of goals and methods which the user employs. When the mode is low in the hierarchy, and close to the user's "fingertips", it is encountered the most frequently. For instance, consider how frustrating it would be to have to hit a function key before typing in each character!
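
The bandwidth claim can be made concrete by counting actions. The following sketch is an informal yardstick of my own devising, not a measure taken from the literature: it reports what share of the user's actions go to the mode rather than to the problem.

```c
#include <assert.h>

/* Percentage (rounded down) of the user's actions that are absorbed
 * by the program's modes instead of changing the problem state. */
int mode_overhead_percent(long mode_actions, long task_actions)
{
    long total = mode_actions + task_actions;

    assert(mode_actions >= 0 && task_actions >= 0);
    if (total == 0)
        return 0;
    return (int)((100 * mode_actions) / total);
}
```

A function key before every character is one mode action per task action, so mode_overhead_percent(1, 1) reports 50. One mode switch per thousand keystrokes reports 0, which is why modes high in the hierarchy are tolerated.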

The "menu-driven" style of program mentioned above is almost as bad, since usually only one piece of information is collected at each menu. Such a program becomes a labyrinth of states better suited to an adventure game!

The least offensive modes are found at the higher, goal related levels of the hierarchy. The better they align with changes in the state of the original problem, the more they are tolerated. For example, a word processing program might have one screen layout for program editing, another for writing letters, and yet another while printing the documents. A multi-function business package might have one set of menus for the spreadsheet, another for a graphing module, and a third for a database.

In some cases the problem solved by the program has convenient "fracture lines" which can be used to define the modes. An example in my own past is the RCS, where the editing of each type of resource tree forms its own mode, with each of the modes nested within the overall mode and problem of composing the entire resource tree.


To do is to be!

Any narrative description of user interface is bound to be lacking. There is no way text can convey the vibrancy and tactile pleasure of a good interface, or the sullen boredom of a bad one. Therefore, I encourage you to experiment. Get out your favorite arcade game and see if you can spot some of the elements I have described. Dig into your slush pile for the most annoying program you have ever seen, run it and see if you can see mistakes. How would you fix them? Then... go do it to your own program!


Amen...

This concludes the sermon. I'd like some Feedback as to whether you found this Boring Beyond Belief or Really Hot Stuff. If enough people are interested, homily number two will appear a few episodes from now. The very next installment of ST PRO GEM will go back to basics to explore VDI drawing primitives. In the meantime, you might investigate some of the Good Books on interface design referenced below.


References

  • Stuart K. Card, Thomas P. Moran, and Allen Newell, THE PSYCHOLOGY OF HUMAN-COMPUTER INTERACTION, Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1983. (Fundamental and indispensable. The volume of experimental results makes it weighty. The Good Parts are at the beginning and end.)
  • "Macintosh User Interface Guidelines", in INSIDE MACINTOSH, Apple Computer, Inc., 1984. (Yes, Atarians, we have something to learn here. Though not everything "translates", this is a fine piece of principled design work. Read and appreciate.)
  • James D. Foley, Victor L. Wallace, and Peggy Chan, "The Human Factors of Computer Graphics Interaction Techniques", IEEE Computer Graphics (CG & A), November 1984, pp. 13-48. (A good overview, including higher level topics which I have postponed to a later article. Excellent bibliography.)
  • J. D. Foley and A. Van Dam, FUNDAMENTALS OF INTERACTIVE COMPUTER GRAPHICS, Addison Wesley, 1984, Chapters 5 and 6. (If you can't get the article above, read this. If you are designing graphics apps, buy the whole book! Staggering bibliography.)
  • Ben Shneiderman, "Direct Manipulation: A Step Beyond Programming Languages", IEEE Computer, August 1983, pp. 57-69. (What do Pacman and Visicalc have in common? Shneiderman's analysis is vital to creating hot interfaces.)