Professional GEM - Part VIII - User interfaces

From Atari Wiki
Revision as of 12:17, 13 September 2006 by Zorro 2 (talk | contribs)


        Professional GEM                                               57


                                   PART VIII

                              USER INTERFACES



       AND NOW FOR SOMETHING COMPLETELY DIFFERENT

             In response  to a number of requests, this installment of ST
        PRO  GEM  will be devoted to examining a few of the principles of
        computer/human  interface  design,  or  "religion"  as some would
        have  it.   I'm going to start with basic ergonomic laws, and try
        to  draw  some conclusions which are fairly specific to designing
        for  the  ST.   If  this  article  meets  with  general approval,
        further  "homilies"  may appear at irregular intervals as part of
        the ST PRO GEM series.  

             For those  who  did NOT ask for this topic, it seems fair to
        explain  why  your  diet  of  hard-core technical information has
        been  interrupted by a sermon!  As a motivator, we might consider
        why  some  programs  are  said  by reviewers to have a "hot" feel
        (and   hence   sell   well!)  while  others  are  "confusing"  or
        "boring".  

             Alan Kay  has  said  that  "user  interface  is theatre".  I
        think  we  may  be  able  to  take it further, and suggest that a
        successful  program  works a bit of magic, persuading the user to
        suspend  his  disbelief  and  enter an imaginary world behind the
        screen,  whether  it  is the mathematical world of a spreadsheet,
        or the land of Pacman pursued by ghosts.  

             A reader  of  a novel or science fiction story also suspends
        disbelief  to  participate  in  the work.  Bad grammar and clumsy
        plotting   by   the  author  are  jarring,  and  break  down  the
        illusion.   Similarly, a programmer who fails to pay attention to
        making  his  interface  fast  and consistent will annoy the user,
        and  distract  him  from  whatever  care has been lavished on the
        functional core of the program.  


        CREDIT WHERE IT'S DUE ?

             Before launching  into  the  discussion of user interface, I
        should  mention  that  the  general  treatment  and  many  of the
        specific  research  results  are  drawn  from  Card,  Newell, and
        Moran's  landmark book on the topic, which is cited at the end of
        the  article.   Any  errors  in interpretation and application to
        GEM and the ST are entirely my own, however.  






        




        FINGERTIPS

             We'll start  right  at  the  user's  fingers  with the basic
        equation  governing  positioning  of the mouse, Fitts's Law, which
        is given as 

                            T = I * LOG2( D / S + .5)

        where  T  is  the  amount  of  time to move to a target, D is the
        distance  of  the  target from the current position, and S is the
        size  of  the  target,  stated  in equivalent units.  LOG2 is the
        base  2  (binary)  logarithm function, and I is a proportionality
        constant,  about  100  milliseconds per bit, which corresponds to
        the human's "clock rate" for making incremental movements.  
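
             As a minimal sketch, the formula above can be written out in
        Python.  The function and constant names are my own inventions;
        only the formula and the 100 millisecond figure come from the
        text.  The clamp at zero is an added assumption, since the
        logarithm goes negative once the pointer is already well inside
        the target.

```python
import math

# Proportionality constant quoted in the text: about 100 ms per bit.
FITTS_I_MS = 100.0


def fitts_time_ms(distance, size, i_ms=FITTS_I_MS):
    """Estimate pointing time: T = I * LOG2(D / S + .5).

    distance and size must be in the same units (pixels, say).
    """
    if distance < 0 or size <= 0:
        raise ValueError("need distance >= 0 and size > 0")
    # Assumption: clamp at zero, because the logarithm is negative
    # when D / S is below one half.
    return max(0.0, i_ms * math.log2(distance / size + 0.5))
```

             A 16 pixel target 120 pixels away works out to LOG2(8), or
        3 bits, giving 300 milliseconds.  Halving the target to 8 pixels
        raises the estimate to roughly 395 milliseconds.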

             We can  squeeze an amazing amount of information out of this
        formula  when  attempting to speed up an interface.  Since motion
        time  goes  up  with  distance, we should arrange the screen with
        the  usual  working  area near the center, so the mouse will have
        to  move  a smaller distance on average from a selected object to
        a  menu  or  panel.   Likewise,  any items which are usually used
        together should be placed together.  

             The most  common  operations will have the greatest impact on
        speed,  so they should be closest to the working area and perhaps
        larger  than  other  icons  or menu entries.  If you want to have
        all  other  operations take about the same time, then the targets
        farthest  from  the  working  area  should  be  larger, and those
        closer may be proportionately smaller.  
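
             The equal-time rule can be made concrete by solving the same
        formula for S.  This is only a sketch: the function name is
        hypothetical, and it assumes the 100 millisecond constant quoted
        earlier.

```python
FITTS_I_MS = 100.0   # about 100 ms per bit, as quoted in the text


def size_for_time(distance, time_ms, i_ms=FITTS_I_MS):
    """Target size that keeps pointing time at time_ms for a distance.

    Rearranged from T = I * LOG2(D / S + .5):
        S = D / (2 ** (T / I) - .5)
    """
    if distance < 0 or time_ms <= 0:
        raise ValueError("need distance >= 0 and time_ms > 0")
    return distance / (2.0 ** (time_ms / i_ms) - 0.5)
```

             For a 300 millisecond budget, a target 120 pixels away needs
        to be 16 pixels across, and one 240 pixels away needs 32.  In
        other words, size must grow in direct proportion to distance.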

             Consider also  the  implications  for  dialogs.  Small check
        boxes  are  out.   Large  buttons  which  are easy to hit are in.
        There  should  be  ample  space between selectable items to allow
        for  positioning  error.   Dangerous  options  should  be  widely
        separated from common selections.  


        MUSCLES

             Anyone who  has  used  the ST Desktop for any period of time
        has  probably noticed that his fingers now know where to find the
        File  menu.  This phenomenon is sometimes called "muscle memory",
        and its rate of onset is given by the Power Law of Practice: 

                             T(n) = T(1) * n ** (-a)

        where  T(n) is the time on the nth trial, T(1) is the time on the
        first  trial,  and  a is approximately 0.4.  (I have appropriated
        **  from  Fortran  as  an  exponentiation operator, since C lacks
        one.) 
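
             Python does have an exponentiation operator, so the Power
        Law can be sketched directly.  The function name is mine, and
        the default exponent is the approximate 0.4 given above.

```python
def practice_time(t1, n, a=0.4):
    """Time on the nth trial: T(n) = T(1) * n ** (-a)."""
    if n < 1:
        raise ValueError("trial numbers start at 1")
    return t1 * n ** (-a)
```

             With a = 0.4, the 32nd trial takes one quarter of the time
        of the first, since 32 ** (-0.4) is 2 ** (-2).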


        



             The first  thing  to  note  about  the Power Law is that it
        only  works  if a target stays in the same place!  This should be
        a  potent  argument  against rearranging icons, menus, or dialogs
        without  some  explicit  request  by the user.  The time to hit a
        target which moves around arbitrarily will always be T(1)! 

             In many  cases,  the  Power Law will also work for sequences
        of  operations  to  even  greater  effect.   If  you  are a touch
        typist,  you  can  observe  this effect by comparing how fast you
        can  enter  "the"  in  comparison to three random letters.  We'll
        come  back  shortly  to consider what we can do to encourage this
        phenomenon.  


        EYES

             Just as  fingers  are  the  way  the  user sends data to the
        computer,  so  the  eyes  are  his channel from the machine.  The
        rate   at  which  information  may  be  passed  to  the  user  is
        determined   by   the  "cycle  time"  of  his  visual  processor.
        Experimental  results  show  that this time ranges between 50 and
        200 milliseconds.  

             Events separated  by  50  milliseconds  or  less  are always
        perceived  as  a  single event.  Those separated by more than 200
        milliseconds  are  always  seen  as  separate.   We can use these
        facts  in  optimizing  use  of  the computer's power when driving
        the interface.  

             Suppose your  application's interface contains an icon which
        should  be  inverted  when the mouse passes over it.  We now know
        that  flipping  it  within one twentieth of a second is necessary
        and  sufficient.   Therefore,  if  a  "first  cut" at the program
        achieves   this   performance,  there  is  no  need  for  further
        optimization,  unless  you  want  to interleave other operations.
        If  it  falls  short,  it  will  be necessary to do some assembly
        coding to achieve a smooth feel.  

             On the  other  hand,  two  actions  which you want to appear
        distinct  or  convey  two different pieces of information must be
        separated  by  an  absolute  minimum of a fifth of a second, even
        assuming  that  they  occur in an identical location on which the
        user's attention is already focused.  
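
             The two thresholds can be captured in a pair of small
        checks.  This is a sketch with invented names; the 50 and 200
        millisecond figures are the ones quoted above, and the middle
        band is left undecided because it depends on the stimulus.

```python
FUSE_MS = 50        # at or below this gap, two events always look like one
SEPARATE_MS = 200   # above this gap, two events always look distinct


def perceived_as(gap_ms):
    """Classify how two events gap_ms apart will be perceived."""
    if gap_ms <= FUSE_MS:
        return "single"
    if gap_ms > SEPARATE_MS:
        return "separate"
    return "depends on stimulus"


def highlight_is_fast_enough(redraw_ms):
    """True if an icon inverts within one twentieth of a second."""
    return redraw_ms <= FUSE_MS
```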

             We are  able  to influence the visual processing rate within
        the  50 to 200 millisecond range by changing the intensity of the
        stimulus  presented.   This can be done with color, by flashing a
        target,  or  by  more subtle enhancements such as bold face type.
        For  instance,  most  people  using GEM soon become accustomed to
        the  "paper  white"  background  of  most windows and dialogs.  A
        dialog  which  uses  a  reverse  color  scheme,  white letters on
        black,   is   visually   shocking  in  its  starkness,  and  will
        immediately draw the user's eyes.  

             It should  be  quickly  added that stimulus enhancement will
        only  work  when  it unambiguously draws attention to the target.
        Three  or  four  blinking objects scattered around the screen are
        confusing, and worse than no enhancement at all! 


        SHORT-TERM MEMORY

             Both the  information  gathered  by  the  eyes  and movement
        commands  on their way to the hand pass through short-term memory
        (also  called  working  memory).  The amount of information which
        can  be  held  in  short-term  memory at any one time is limited.
        You  can demonstrate this limit on yourself by attempting to type
        a  sheet  of  random  numbers  by looking back and forth from the
        numbers  to the screen.  If you are like most people, you will be
        able  to  remember  between  five and nine numbers at a time.  So
        universal  is this finding that it is sometimes called "the magic
        number seven, plus or minus two".  

             This short-term  capacity  sets  a  limit  on  the number of
        choices  which  the  user  can  be expected to grasp at once.  It
        suggests  that  the  number of independent choices in a menu, for
        instance,  should  be  around  seven,  and never exceed nine.  If
        this  limit  is violated, then the user will have to take several
        glances, with pauses to think, in order to make a choice.  


        CHUNKING

             The effective   capacity   of   short-term   memory  can  be
        increased  when  several  related items are mentally grouped as a
        "chunk".   Humans  automatically  adopt  this  strategy  to  save
        themselves  time.   For  instance,  random numbers had to be used
        instead  of text in the example above, because people do not type
        their  native  language  as individual characters.  Instead, they
        combine   the  letters  into  words  and  remember  these  chunks
        instead.    Put   another  way,  the  characters  are  no  longer
        considered as individual choices.  

             A well   designed   interface  should  promote  the  use  of
        chunking  as  a  strategy by the user.  One easy way is to gather
        together  related  options in a single place.  This is one reason
        that  like  commands  are  grouped  into  a  single menu which is
        hidden  except  for  its  title.  If all of the menu options were
        "in  the  open",  the  user  would  be overwhelmed with dozens of
        alternatives  at  once.   Instead,  a  "Show  Info"  command, for
        instance, becomes two chunks: pick File menu, then pick Show.  
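
             A rough way to audit a menu layout against the seven plus or
        minus two limit is sketched below.  The command names and menu
        titles are placeholders, not taken from any real program, and
        the limit of nine is the figure given earlier.

```python
LIMIT = 9   # "seven, plus or minus two": never exceed nine


def crowded_menus(menus, limit=LIMIT):
    """Return the titles of menus holding more than `limit` choices.

    `menus` maps a menu title to its list of entries.
    """
    return [title for title, entries in menus.items() if len(entries) > limit]


def choices_at_once(menus):
    """Largest number of alternatives faced at any one step: first the
    row of menu titles, then the entries of a single open menu."""
    return max([len(menus)] + [len(entries) for entries in menus.values()])


# Hypothetical layouts: 24 commands "in the open" versus four menus of six.
flat = {"Everything": ["cmd%d" % i for i in range(24)]}
grouped = {title: ["cmd%d" % i for i in range(6)]
           for title in ("File", "Edit", "View", "Options")}
```

             The flat layout confronts the user with 24 alternatives at
        once, while the grouped one never shows more than six.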




        



             Sometimes the  interface can accomplish the chunking for the
        user.   Consider  the  difference  between  a slider bar in a GEM
        program,   and   a  three  digit  entry  field  in  a  text  mode
        application.   Obviously,  the  GEM  user  has fewer decisions to
        make in order to set the associated variable.  


        THINK !

             While we  are  puttering  around  trying  to  speed  up  the
        keyboard,  the mouse, and the screen, the user is actually trying
        to  get some work done.  We need to back off now, and look at the
        ways   of   thinking,   or  cognitive  processes,  that  go  into
        accomplishing the job.  

             The user's  goal  may  be  to  enter  and  edit a letter, to
        retrieve  information  from a database, or simply draw a picture,
        but  it  probably  has  very  little  to do with programming.  In
        fact,  the  Problem  Space  Principle  says  that the task can be
        described  as  a  set  of states of knowledge, a set of operators
        and  associated  constraints  for  changing  the  states, and the
        knowledge  to  choose  the appropriate operator, which resides in
        the user's head.  

             Those with  a background in systems theory can consider this
        as  a  somewhat abstract, but straightforward, statement in terms
        of  state  variables  and  operators.  A programmer might compare
        the  knowledge  states  to the values of variables, the operators
        to  arithmetic and logic operations, the constraints to the rules
        of  syntax, and the user's knowledge to the algorithm embodied by
        a program.  


        ARE WE NOT MEN ?

             A rational  person will try to attain his goals (get the job
        done)  by  changing  the  state  of  his  problem  space from its
        initial  state  to  the  goal  state.   The  initial  state,  for
        instance,  might  be  a blank word processor screen.  The desired
        final  state  is  to  have  a  completed  business  letter on the
        screen.  

             The Rationality  Principle  says that the user's behavior in
        typing,  mousing,  and so on, can be explained by considering the
        tasks  required  to  achieve the goal, the operators available to
        carry   out   the  tasks,  and  the  limitations  on  the  user's
        knowledge,  observations,  and  processing capacity.  This sounds
        like  the  typical  user  of a computer program must spend a good
        deal  of  time scratching his head and wondering what to do next.
        In  fact, one of Card and Moran's key results is that this is NOT
        what takes place.  


        



             What happens,  in  fact,  is that the trained user strikes a
        sort  of  "modus  vivendi"  with  his  tool  and  adopts a set of
        repetitive,  trained behavior patterns as the best way to get the
        job  done.   He  may go so far as to ignore some functions of the
        program  in  order  to  set  up  a reliable pattern.  What we are
        looking  for  is  a way of measuring and predicting the "quality"
        of  this  trained  behavior.   Since  using  computers is a human
        endeavor,  we  should  consider not only the speed with which the
        task  is  completed,  but  the  degree  of  annoyance or pleasure
        associated with the process.  

             Card and  Moran  constructed  a  series of behavioral models
        which        they       called       GOMS       models,       for
        Goals-Operators-Methods-Selection.   These  models suggested that
        in  the  training  process  the user learned to combine the basic
        operators  in  sequences  (chunks!) which then became methods for
        reaching  the  goals.   Then  these  first level methods might be
        combined  again  into  second level methods, and so forth, as the
        learning progressed.  

             The GOMS  models  were  tested in a lengthy series of trials
        at  Xerox  PARC  using  a  variety  of  word processing software.
        (Among  the  subjects  of these experiments were the inventors of
        the  windowing  methods  used  in  GEM!)   The results were again
        surprising:  the  level  of  detail  in  the  models  was  really
        unimportant! 

             It turned  out  to  be  sufficient  to  merely  count up the
        number  of  keystrokes,  mouse  movements,  and thought intervals
        required  by  each  task.  After summing up all of the tasks, any
        extra  time  for the computer to respond, or the user to move his
        hands  from  keyboard  to  mouse,  or eyes from screen to printed
        page  is  added  in.   This  simplified  version  is  called  the
        Keystroke-Level Model.  

             As an  example  of the Keystroke Model, consider the task of
        changing   a  mistyped  letter  on  the  screen  of  a  GEM  word
        processor.   This  might  be  broken down as follows: 1) find the
        letter  on the screen; 2) move hand to mouse; 3) point to letter;
        4)  click  mouse  button;  5)  move  hand  to keyboard; 6) strike
        "Delete" key; 7) strike key for new character.  

             The sufficiency  of  the  Keystroke  Model is great news for
        our  attempt  to  design  faster  interfaces.   It  says  we  can
        concentrate  our  efforts  on  minimizing  the  number  of  total
        actions  to be taken, and making sure that each action is as fast
        as  possible.   We  have  already discussed some ways to speed up
        the  mouse  and  keyboard  actions,  so let's now consider how to
        speed up the thought intervals, and cut the number of actions.  
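
             The fix-a-letter task above can be tallied in a few lines.
        Note that the operator times below are NOT given in this
        article; they are round figures commonly quoted for the
        Keystroke-Level Model and should be treated as assumptions to be
        replaced by your own measurements.  The mouse click is counted
        as a keystroke.

```python
# Assumed operator times in seconds (not from the article).
OPERATOR_SECONDS = {
    "M": 1.35,  # thought interval (mental preparation)
    "H": 0.40,  # move hand between keyboard and mouse
    "P": 1.10,  # point at a target with the mouse
    "K": 0.20,  # strike a key or click the mouse button
}

# The seven steps listed in the text, each tagged with an operator.
FIX_LETTER = [
    ("find the letter on the screen", "M"),
    ("move hand to mouse", "H"),
    ("point to letter", "P"),
    ("click mouse button", "K"),
    ("move hand to keyboard", "H"),
    ("strike Delete key", "K"),
    ("strike key for new character", "K"),
]


def klm_seconds(steps, response_s=0.0, times=OPERATOR_SECONDS):
    """Sum the operator times, plus any wait for the computer to respond."""
    return sum(times[op] for _, op in steps) + response_s
```

             With these assumed figures the task comes to about 3.85
        seconds, of which the single thought interval is the largest
        item.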




        



             One way  to  cut  down "think time" is to make sure that the
        capacity  of  short-term memory is not exceeded during the course
        of  a  task.   For example, the fix-a-letter task described above
        required  the user to remember 1) his place in the overall job of
        typing  the  document;  2)  the  task  he is about to perform; 3)
        where  the  bad character appeared, and 4) what the new character
        was.   When  this  total  of  items creeps toward seven, the user
        often loses his place and commits errors.  

             You can   appreciate   the   ubiquity  of  this  problem  by
        considering  how  many  times  you  have  made  mistakes  nesting
        parentheses,  or  had  to go back to count them, because too many
        things  happened  while  typing  the line to remember the nesting
        levels.  The  moral  is  that  operations  with  long  strings of
        operands should be avoided when designing an interface.  

             The single  most  important  factor  in  making an interface
        comfortable   to   use  is  increasing  its  predictability,  and
        decreasing  the  amount of indecision present at each step during
        a  task.   There  is  (inevitably) an Uncertainty Principle which
        relates  the  number  of  choices  at each step to the associated
        time for thought: 

                              T = I * LOG2 ( N + 1)

        where  LOG2  is the binary logarithm function, N is the number of
        equally  probable  choices,  and I is a constant of approximately
        140  msec/bit.  When the alternatives are not equally probable, the
        function is more complex: 

           T = I * SUM-FOR-i-FROM-1-TO-N (P(i) * LOG2( 1 / P(i) + 1) )

        where  the  P(i)  are  the  probabilities  of each of the choices
        (which  must sum to one).  (SUM-FOR-i... is the best I can do for
        a  sigma  operator  on-line!)  Those of you with some information
        theory  background  will recognize this formula as the entropy of
        the decision; we'll come back to that later.  
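
             Both forms of the Uncertainty Principle are easy to sketch
        in Python.  The function names are hypothetical; the formulas
        and the 140 msec/bit constant are the ones given above.

```python
import math

HICK_I_MS = 140.0   # approximately 140 msec/bit, as quoted in the text


def decision_time_equal_ms(n, i_ms=HICK_I_MS):
    """T = I * LOG2(N + 1) for N equally probable choices."""
    if n < 0:
        raise ValueError("n must not be negative")
    return i_ms * math.log2(n + 1)


def decision_time_ms(probs, i_ms=HICK_I_MS):
    """T = I * SUM(P(i) * LOG2(1 / P(i) + 1)) for unequal choices."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    # A choice with zero probability contributes nothing to the sum.
    return i_ms * sum(p * math.log2(1.0 / p + 1.0) for p in probs if p > 0)
```

             Seven equally likely choices cost 140 * LOG2(8), or 420
        msec.  Four choices weighted 0.7, 0.1, 0.1 and 0.1 cost less than
        four equally likely ones, which is the effect exploited below.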

             So what  can  we  learn from this hash?  It turns out, as we
        might  expect,  that  we can decrease the decision time by making
        some  of  the  user's  choices  more probable than others.  We do
        that by means of feedback cues from the interface.  

             The importance of reliable, continuous, meaningful feedback
        cannot  be  emphasized  enough.   It helps the beginner learn the
        system,  and its predictability makes the program comfortable for
        the  expert.   Programs  with  no  feedback,  or unreliable cues,
        produce confusion, dissonance, and frustration in the user.  

             This principle  is so important that I am going to give several
        examples  from common GEM practice.  The Desktop provides several
        instances.   When  an  object  is selected and a menu drops down,
        only  those  choices which are legal for the object are in black.
        The  others  are  dimmed  to grey, and are therefore removed from
        the  decision.   When a pick is made from the menu, the bar entry
        remains  black  until  the  operation is complete, reassuring the
        user  that  the correct choice was made.  In both the Desktop and
        the  RCS,  items  which  are  double-clicked open up with a "zoom
        box"  from  the  object,  again showing that the right object was
        picked.  

             Other techniques  are useful when operator icons are exposed
        on  the  screen.   When an object is picked, the legal operations
        might  be  outlined,  or the bad choices might be dimmed.  If the
        screen  flashing  produced  by  this  is objectionable, the legal
        icons  can  be made mouse sensitive, so they will "light up" when
        the  cursor  passes  over  - again showing the user which choices
        are legal.  
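
             The dimming technique amounts to a lookup from the kind of
        object selected to the set of legal operations.  The sketch
        below uses an invented legality table and invented operation
        names; it models the idea only, and is not how the Desktop is
        actually implemented.

```python
# Hypothetical table: which operations are legal for which selection.
LEGAL = {
    "file":   {"Open", "Show Info", "Copy", "Delete"},
    "folder": {"Open", "Show Info", "New Folder", "Copy", "Delete"},
    "disk":   {"Open", "Show Info", "Format"},
}
ALL_OPERATIONS = ["Open", "Show Info", "New Folder", "Copy", "Delete", "Format"]


def menu_state(selection_kind):
    """Return (operation, enabled) pairs; disabled entries would be dimmed."""
    legal = LEGAL.get(selection_kind, set())   # nothing legal if unknown
    return [(op, op in legal) for op in ALL_OPERATIONS]


def live_choices(selection_kind):
    """Only the entries still drawn in black, in menu order."""
    return [op for op, enabled in menu_state(selection_kind) if enabled]
```

             With a disk selected, only three of the six entries remain
        in the decision, so the user has half as many alternatives to
        weigh.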

             The desire  for  feedback  is  so  strong  that it should be
        provided  even  while  the  computer is doing an operation on its
        own.   The  hour glass mouse form is a primitive example of this.
        More  sophisticated  are  "progress  indicators" such as animated
        thermometer  bars,  clocks,  or  text  displays of the processing
        steps.   The  ST Desktop provides examples in the Format and Disk
        Copy  functions.   The purpose of all of these is to reassure the
        user  that the operation is progressing normally.  Their lack can
        lead  to  amusing  spectacles such as secretaries leaning over to
        hear if their disk drives are working! 

             Another commonly  overlooked feature is error prevention and
        correction.   Card and Moran's results showed that in order to go
        faster,  people  will  tolerate error rates of up to 30% in their
        work.   Any  program  which  does  not  give  a  fast  way to fix
        mistakes will be frustrating indeed! 

             The best  way  to  cope  with an error is to "make it didn't
        happen",  to  quote  a  common child's phrase.  The same feedback
        methods  discussed  above  are  also  effective in preventing the
        user  from  picking  inappropriate  combinations  of  objects and
        operations.   Replacement  of  numeric  type-ins  with sliders or
        other  visual  controls eliminates the common "Range Error".  The
        use  of radio buttons prevents the user from picking incompatible
        options.    When  such  techniques  are  used  consistently,  the
        beginner  also  gains  confidence that he may explore the program
        without blundering into errors.  
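
             Here is a minimal sketch of why a slider cannot produce a
        "Range Error".  The function and its parameters are
        hypothetical; the point is that every possible position,
        including one dragged past the end of the track, maps to a
        legal value.

```python
def slider_value(pos, track_len, lo, hi):
    """Map a slider position (0..track_len pixels) onto the range lo..hi.

    Positions outside the track are pinned to its ends, so the result
    can never fall outside the legal range.
    """
    if track_len <= 0 or hi < lo:
        raise ValueError("need track_len > 0 and hi >= lo")
    pos = min(max(pos, 0), track_len)
    return lo + round(pos * (hi - lo) / track_len)
```

             A three digit type-in field, by contrast, must accept the
        keystrokes first and reject the value afterward.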

             Once an  error  has occurred, the best solution is to have an
        "inverse  operation"  immediately  available.   For instance, the
        way  to  fix  a  bad character is to hit the backspace key.  If a
        line  is  inadvertently deleted, there should be a way to restore
        it.  



        



             Sometimes the  mechanics  of  providing  true  inverses  are
        impractical,  or  end up cluttering the interface themselves.  In
        these  cases,  a  global  "Undo"  command  should  be provided to
        reverse  the  effect  of  the  last  operation, no matter what it
        was.  
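
             One simple way to get a global Undo is to save the previous
        state before every operation.  The class below is a sketch under
        that assumption, with invented names.  Saving whole snapshots is
        only one possible design; a real program with large documents
        would more likely record the inverse of each operation instead.

```python
class UndoableText:
    """A one-line text buffer with a global Undo.

    Each operation saves the previous contents first, so Undo can
    reverse the last operation no matter what it was.
    """

    def __init__(self, text=""):
        self.text = text
        self._history = []          # earlier contents, most recent last

    def _remember(self):
        self._history.append(self.text)

    def type_char(self, ch):
        self._remember()
        self.text += ch

    def backspace(self):
        self._remember()
        self.text = self.text[:-1]

    def delete_line(self):
        self._remember()
        self.text = ""

    def undo(self):
        """Restore the contents from before the last operation, if any."""
        if self._history:
            self.text = self._history.pop()
```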


        OF MODES AND BANDWIDTH

             Now I  am  going  to  depart from the Card, Newell and Moran
        thread  of  discussion to consider how we can minimize the number
        of  operations  in a task by altering the modes of the interface.
        Although   "no   modes"   has   been  a  watchword  of  Macintosh
        developers, the term may need definition for Atarians.  

             Simply stated,  a mode exists any time you cannot get to all
        of   the   capabilities   of  the  program  without  taking  some
        intermediate    step.     Familiar    examples    are   old-style
        "menu-driven"  programs, in which the user must make selections from
        a  number of nested menus in order to perform any operation.  The
        options of any one menu are unavailable from the others.  

             Recall that  the  user  is  trying to accomplish work in his
        own  problem  space,  by  altering  its  states.   A  mode in the
        program  adds additional states to the problem space, which he is
        forced  to  consider in order to get the job done.  We might call
        an  interface which is completely modeless "transparent", because
        it  adds  no  states  between  the user and his work.  One of the
        best  examples  of  a transparent program is the 15-puzzle in the
        Macintosh  desk  accessory set.  The problem space of rearranging
        the  tiles  is  identical  between  the  program  and  a physical
        puzzle.  

             Unfortunately, most  programmers  find  themselves forced to
        put  modes  of  some sort into their programs.  These often arise
        due  to  technological  limitations, such as memory space, screen
        "real  estate",  or  performance limitations of peripherals.  The
        question is how the modes can be made least offensive.  

             I will  make  the general claim that the frustration which a
        mode  produces  is  directly  proportional  to  the amount of the
        user's  bandwidth  which it consumes.  In other words, we need to
        consider  how  many  keystrokes, mouse clicks, eye movements, and
        so  on,  are going into manipulating the true problem states, and
        how  many are being absorbed by the modes of the program.  If the
        interface  is  wasting  a  large  amount of the user's effort, it
        will be perceived as slow and annoying.  

             Here we  can  consider  again  the  hierarchy  of  goals and
        methods  which  the  user  employs.   When the mode is low in the
        hierarchy,   and   close   to  the  user's  "fingertips",  it  is
        encountered  the  most  frequently.   For  instance, consider how
        frustrating  it  would  be  to  have to hit a function key before
        typing in each character! 
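
             The bandwidth claim can be turned into a crude measurement
        by counting actions.  This sketch is my own illustration, with
        invented names: each action is tagged by hand as either steering
        the program (a mode action) or changing the user's real problem
        state.

```python
def mode_overhead(actions):
    """Fraction of the user's actions absorbed by modes.

    `actions` is a list of (description, is_mode_action) pairs, where
    is_mode_action is True for steps that only steer the program and
    False for steps that change the user's actual problem state.
    """
    if not actions:
        return 0.0
    wasted = sum(1 for _, is_mode in actions if is_mode)
    return wasted / len(actions)


# The function-key example from the text, typing the word "the".
function_key_typing = [
    ("press function key", True), ("type t", False),
    ("press function key", True), ("type h", False),
    ("press function key", True), ("type e", False),
]
plain_typing = [("type t", False), ("type h", False), ("type e", False)]
```

             The function-key scheme wastes exactly half of the user's
        keystrokes, while plain typing wastes none.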

             The "menu-driven"  style  of  program  mentioned  above  is
        almost  as  bad,  since  usually only one piece of information is
        collected  at  each  menu.  Such a program becomes a labyrinth of
        states better suited to an adventure game! 

             The least  offensive  modes  are  found  at the higher, goal
        related  levels  of  the  hierarchy.   The better they align with
        changes  in  the state of the original problem, the more they are
        tolerated.   For  example,  a  word processing program might have
        one  screen  layout  for  program  editing,  another  for writing
        letters,  and  yet  another  while  printing  the  documents.   A
        multi-function  business  package might have one set of menus for
        the  spreadsheet,  another for a graphing module, and a third for
        a database.  

             In some   cases  the  problem  solved  by  the  program  has
        convenient  "fracture  lines"  which  can  be  used to define the
        modes.   An  example in my own past is the RCS, where the editing
        of  each  type  of resource tree forms its own mode, with each of
        the   modes  nested  within  the  overall  mode  and  problem  of
        composing the entire resource tree.  


        TO DO IS TO BE !

             Any narrative  description  of user interface is bound to be
        lacking.   There  is  no  way  text  can  convey the vibrancy and
        tactile  pleasure of a good interface, or the sullen boredom of a
        bad  one.   Therefore,  I  encourage  you to experiment.  Get out
        your  favorite  arcade  game  and see if you can spot some of the
        elements  I  have  described.   Dig  into your slush pile for the
        most  annoying  program you have ever seen, run it, and see if you
        can  spot  its  mistakes.  How would you fix them?  Then... go do it to
        your own program! 


        AMEN...

             This concludes  the  sermon.   I'd  like some Feedback as to
        whether  you  found  this  Boring  Beyond  Belief  or  Really Hot
        Stuff.   If  enough people are interested, homily number two will
        appear  a  few  episodes from now.   The very next installment of
        ST  PRO  GEM  will  go  back  to  basics  to  explore VDI drawing
        primitives.   In  the meantime, you might investigate some of the
        Good Books on interface design referenced below.  





        
