

Behavioral Evaluation and Analysis of a Hypertext Browser, Egan, Remde, Landauer, Lochbaum, and Gomez

This paper details a user study of a hypertext browser called SuperBook (developed by three of the authors). In this study, 20 users were split between a paper text (the manual for "S", a statistics program) and SuperBook. They were evaluated on four different "tasks": structured search, open-book essays, incidental learning, and subjective ratings. In the structured search task, they were asked a series of questions whose answers were found in the text, in the headings, in both the headings and text, or in neither. In the open-book essays they wrote, well, essays. Incidental learning asked them to recall various facts about the book which they were never specifically questioned about. Subjective opinions were just that.

Analysis of the structured search tasks indicates that users found the correct answer more often with SuperBook, but it took them longer to do so, except when the answer was contained in the text. (This makes sense, since the browser includes enhanced search functionality.) Analysis of the essays indicated that SuperBook users wrote better essays, but the authors offer no reason for this beyond the use of SuperBook itself. We speculate the increased search functionality provides a "laundry list" of related topics, making it easier for the students to investigate and include other topics while writing. Analysis revealed the users recalled more incidental material while using SuperBook (probably due to having the headings/ToC constantly displayed while browsing), but we're not convinced this "incidental knowledge" was anything useful. Some of the incidental info could have been useful ("S" commands) and some not useful (headings in the ToC). The authors don't present a breakdown of the users' recall. Subjective questions indicated users liked SuperBook more.

Bringing Icons to Life, Baecker, Small, and Mander
BibTex: baeckericons

This paper summarizes a study about animating icons. The authors used the paint tools of an old Mac application called HyperCard, which were similar to MS Paint's. They animated 18 icons with 16 frames each. They ran a relatively detailed study to evaluate the effectiveness of the animations. The participants consisted of only 9 subjects: 4 with no experience, 4 with familiarity, and 1 "expert". They quizzed the users about the icons' functions with the static icons and again after viewing the animations. They also had the users complete drawing tasks using the icons (presumably with only the animations to guide them w.r.t. functionality). The researchers talk about the subjects "comprehending" the function of the tools better with the animations, but it's unclear exactly what the researchers took to be comprehension (key word, key concept, etc.). The limited number of true novices would also be a hindrance.

An interesting point of this study is that the researchers noted that the novices were still confused as to exactly how to use the tools, even with the animations. When the animation shows the mouse drawing a box, when does the tool action depend on a mouse click? For example, the correct procedure for selecting something would be: click the mouse, drag, release the mouse. The researchers concluded that adding sounds to their animations would enhance the understandability of the animations.

This study also made no claims about how to animate icons (optimal number of frames, duration of animation, etc).

Using GOMS for User Interface Design and Evaluation: Which Technique? Bonnie John and David Kieras
BibTex: johngomscompairson

This paper explains which GOMS technique fits different design situations. It only details GOMS techniques which have reached sufficient maturity: CMN-GOMS (the original Card, Moran, and Newell formulation), CPM-GOMS (Critical Path Method GOMS), KLM, and NGOMSL.

This paper makes the argument that GOMS can provide large amounts of design information before or very early in the design cycle. It can help determine coverage and consistency of functionality, and can help determine operator sequence, execution time, learning time, error recovery support, and an informal understanding of a design. John argues that the fact that GOMS is aimed at expert users performing repetitive tasks is not really a limitation. I don't entirely buy that, but I do believe it is a powerful predictive tool. She also makes the case that GOMS is not as difficult as people make it out to be. To which I say: easy for her to say. For simple interfaces/tasks, I believe her. It becomes much less clear for complex tasks.

She presents a number of case studies (published and unpublished) which detail implementations of some GOMS variant and the benefits of doing so. Many of these tasks are comparative: a new system/interface/process is being compared to a previous one. I think this is the real strength of GOMS.

The GOMS Family of User Interface Analysis Techniques:Comparison and Contrast Bonnie John and David Kieras
BibTex: johngomsfamily

All GOMS techniques make the critical assumption that users are familiar with the task, but they each place the "mental" portion of the activities at different points in the measuring process.

KLM - not good for completely new devices where metrics unknown.

NGOMSL and CMN-GOMS build processing time into a "verify" stage. I don't buy this.

NGOMSL is based on cognitive complexity theory (CCT). It is grammar-like, uses production rules, and assumes users know the operators.

GOMS Meets the Phone Company: Analytic Modeling Applied to Real-World Problems, Gray, John, Stuart, and Lawrence
BibTex: graygomsphone

This paper was extremely influential because it showed GOMS could be a useful model when applied to real-world problems. Before this, it had apparently been applied only to small, theoretical problems.

The general scenario in this paper is the time to complete an operator-assisted call. A company marketed its system (10k at the time) on the premise that it could shave 2.5 sec./call (worth millions). Before switching systems, the phone company decided to compare the new and old systems and observe the faster call times for themselves. Many operators were highly motivated to learn the new system and volunteered to pilot it. Much to everyone's surprise, the new system was actually significantly slower. Someone decided to run GOMS on the process to see what was going on (how does the thought to run GOMS occur here??). Essentially, GOMS showed that the new system did not eliminate keystrokes; it merely moved them elsewhere along the path. Additionally, it showed that the critical path time was being driven not by keystrokes, but by how fast the customers could relay the number to call or bill.

I also think (need to check this..) that this was the first time GOMS had been applied to a task not wholly confined to the computer (such as text editing or spreadsheets). This task involved interactions among people, computer entry, and the system.

Movement Time Prediction in Human-Computer Interfaces, I. Scott MacKenzie
BibTex: mackenziemovement

Fitts' Law paper. Tracks the development of Fitts' law from ID = log[2](2A/W) to the Shannon formulation ID = log[2](A/W + 1) and why that change is necessary. Also details extensions into 2D (what is the "width" of a rectangle oriented differently or approached from a different angle?). Also notes that speed vs. accuracy (% errors) should be taken into consideration when determining the ideal width of targets. Applies Fitts' law to a practical example of the 3 ways to delete an icon on a Mac and shows empirically which is most efficient.
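The Shannon formulation can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper; the regression constants a and b are hypothetical placeholders (real values come from fitting empirical data for a particular device).

```python
import math

def index_of_difficulty(A, W):
    """Shannon formulation of Fitts' index of difficulty, in bits.
    A = distance (amplitude) to the target, W = target width."""
    return math.log2(A / W + 1)

def movement_time(A, W, a=0.05, b=0.12):
    """Predicted movement time (s): MT = a + b * ID.
    a and b are illustrative constants, NOT values from the paper."""
    return a + b * index_of_difficulty(A, W)

# Doubling the target width at a fixed distance lowers the index of
# difficulty, so the wider target is faster to acquire.
print(round(index_of_difficulty(200, 20), 2))  # 3.46
print(round(index_of_difficulty(200, 40), 2))  # 2.58
```

This is the kind of calculation behind the icon-deletion comparison: compute ID for each alternative path and the one with the lower total is predicted to be faster.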

Keystroke Level Model (Ch.8 Psychology of HCI book),
Card, Moran, Newell

BibTex: cardmorannewell
Details the "quick and dirty" version of GOMS, the KLM. Consists of four physical-motor operators: K (keystroking), P (pointing), H (homing), and D (drawing); one mental operator, M; and a system response operator, R.

So, total execution time of a task is:
T[execute] = T[k] + T[p] + T[h] + T[d] + T[m] + T[r]
or some combination thereof.

In general,
T[k] is user dependent
T[p] is 1.1 sec (seems arbitrary, but what the hey...)
T[h] is .4 sec (based on empirical studies)
T[d] is very specialized and task dependent, but is a linear function of the number of segments and their total length
T[m] is user and task dependent
T[r] is system and task dependent
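The execution-time sum above is easy to sketch in Python. P and H use the values noted above; the K and M values below are the commonly cited Card, Moran, and Newell estimates (K varies with typing skill), and the sample operator sequence is hypothetical.

```python
# Operator times in seconds. P and H are from the chapter; K and M are
# the commonly cited Card/Moran/Newell estimates (K depends on typing
# skill, so treat 0.28 s as an "average typist" assumption).
OPERATOR_TIMES = {
    "K": 0.28,  # keystroke or button press
    "P": 1.1,   # point at a target with the mouse
    "H": 0.4,   # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_execute(ops, response_times=()):
    """Sum operator times for a sequence like 'MHPK', plus any
    system response waits R (in seconds)."""
    return sum(OPERATOR_TIMES[op] for op in ops) + sum(response_times)

# Hypothetical task: think, home to the mouse, point at a menu, click.
print(round(klm_execute("MHPK"), 2))  # 3.13
```

The modeling work is in deciding the operator sequence (especially where the M operators go); the arithmetic itself is just this sum.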

Also includes descriptions of studies which examined times to do tasks with different editors: the authors made KLM predictions and compared them with empirical results. It should be noted that the KLM is only good for expert users who know the task; not for inexpert users or "creative" tasks.

Human Information-Processor (Ch.2a Psychology of HCI book), Card, Moran, Newell
BibTex: cardmorannewell

This is the theory that "summarizes" the knowledge about human psychological functioning. The Model Human Processor has 3 subsystems (each with its own memories and processors):

(the best way of summarizing this is really the figure on pg. 26)

1) Perceptual System
sensors (eyes/ears/etc)
Visual Image Store Buffer (half-life 200 msec)
Auditory Image Store Buffer (half-life 1500 msec)
2) Motor System

3) Cognitive System
Working Memory, Long-Term Memory
Refers to the process of chunking and chaining memory for indexing purposes. Asserts that LTM contents are never forgotten; only the associations needed to activate them are lost.
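The half-life figures for the image stores can be illustrated with a small sketch. The exponential decay curve is my assumption; the chapter only gives the half-lives (the time for the trace to fall to 50%).

```python
def trace_strength(t_ms, half_life_ms):
    """Fraction of a memory trace remaining after t_ms, given the
    store's half-life. Assumes simple exponential decay, which is an
    illustration, not the chapter's exact model."""
    return 0.5 ** (t_ms / half_life_ms)

# Visual image store (half-life ~200 ms) vs. auditory (~1500 ms):
# after 600 ms, the visual trace is nearly gone while the auditory
# trace is mostly intact.
print(round(trace_strength(600, 200), 3))   # 0.125
print(round(trace_strength(600, 1500), 3))  # 0.758
```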

Direct Manipulation Interfaces, Hutchins, Hollan, and Norman
BibTex: directmanip

Designing for Usability: Key Principles and What Designers Think, John Gould and Clayton Lewis
Presents a user study which enumerates design principles (447 people surveyed). Makes the case that the three design principles they list should be key. However, the testers didn't come up with them on their own, leading the researchers to conclude that the principles are not intuitive. A very, very ad-hoc study with no real grounding. The principles are:
  1. Early focus on users and tasks
  2. Empirical Measurement
  3. Iterative Design

A bigger research contribution is the catalogue of evaluation techniques and ideas (Wizard of Oz, cognitive walkthroughs, think-aloud, etc.) and the excuses designers typically make and how to refute them.

The conclusion that all design of an app should be done in a small group of 10-15 is unrealistic, especially in a business environment.
Good point that user testing will be done somewhere (whether in the lab or by customers in the field after release).
Tried to portray usability as the ultimate standard - that case isn't made in the paper or in history (see MS, Apple, etc.).

The Anti-Mac Interface, Don Gentner and Jakob Nielsen

This paper explores what might happen if we disregarded several supposedly inviolable design practices first popularized by Apple's interface design guidelines.
The guidelines (with their Anti-Mac counterparts in parentheses) are:
  1. metaphors (reality)
  2. direct manipulation (delegation)
  3. see and point (describe and command)
  4. consistency (diversity)
  5. WYSIWYG (represent meaning)
  6. user control (shared control)
  7. feedback and dialog (system handles details)
  8. forgiveness (model user actions)
  9. aesthetic integrity (graphic variety)
  10. modelessness (richer cues)

Interesting to note that many of these could be good design principles for AT products/designs.

They also have a table/chart showing the original Mac context that the guidelines were developed for and how that context has changed.

Interface Metaphors and User Interface Design, Carroll, Mack, Kellogg
BibTex: carrollmetaphors

Three approaches to metaphors:
  1. operational - what happens when people apply metaphors
  2. structural - the metaphor is broken down into primitive pieces; relations between source and target rather than properties. Focuses on task and metaphor.
  3. pragmatic - focuses on mismatches and why humans choose certain comparisons given a situation or task. "real world"
pg. 75 has a good summary of the types

composite metaphors - many metaphors for one situation

theory: 3 stages of interaction w/ a metaphor: instantiation, elaboration, consolidation (putting the pieces together into a model)

Designing w/ Metaphors 4 pieces:
  1. identification (predecessor tools/systems, human propensities, sheer invention)
  2. detail matches
  3. identify mismatches (too much and too little)
  4. design around mismatches (good example of how to design around mismatches on pg 81)

Contextual Design: Principles and Practice, Karen Holtzblatt and Hugh Beyer
BibTex: holtzblattcontextual

Bridge between ethnography and design. Guide from beginning (interviews, customer data) to design and implementation.


  • Creating the project
    Defining the problem
    Defining project membership
    Defining the process
  • Understanding the customer
    Contextual Inquiry (apprenticeship)
    Interpretation Session - apprentice recounts for others
    Work Modeling (Context models, physical models, flow models, sequence models, artifact models)
  • Consolidation across users
    Affinity diagrams
    Work Model Consolidation
    Design room
  • Systemic Design
    Separating conversations
    User Environment design
    Iteration with paper mockups

Making Work Visible, Lucy Suchman
BibTex: suchmanworkvisible

Rant against people who try to design systems w/o considering who will be using them. Makes the case for ethnographic methods as essential to the design process.

Extends that to the idea of a representation of work. Very fuzzy on what form representation should take. Posits that representations give a method to manipulate work and communicate about work. Similar to Star's "boundary object" work in that respect, I think.

If We Build It, They Will Come: Designing Information Systems that People Want to Use, M. Lynne Markus and Mark Keil
BibTex: markuspeoplewant

The company had problems delivering error-free systems when configuring systems for customers. It built a new system. No one used it. They thought, "bad HCI." Redesigned again and still no one used it. So they stepped back and took a bigger look: there was a disconnect between manufacturing and sales.

Why it failed:
The new system took longer to use.

There was no motivation for the sales reps to use it (it takes longer); it helped the manufacturing people rather than sales.

There was no reward structure (fewer mistakes == raise, or something).

Radical solutions:
Change the business model (fewer config options, "bundles").
Restructure the company's rewards system.

System solution:
Integrate with the price quote mechanism.

Take-away points:
Random application of design guidelines is not good.
User participation is necessary in the design process.

Designing with Ethnography: Make Work Visible, John Hughes, Ian Sommerville, Richard Bentley, and Dave Randall
BibTex: hughesworkvisible

Similar to the other Sommerville paper (in the ethnography section), but begins to detail air traffic control work: a visual representation of work in the form of paper strips with plane info. Highlights some of the problems of ethnographers/sociologists working with designers/system builders, but not as in-depth as the other article.

Distributed Cognition: Toward a New Foundation for Human-Computer Interaction Research, James Hollan, Edwin Hutchins, and David Kirsh
BibTex: dicoghollan

DiCog holds that people's actions cannot be separated from their surroundings. Artifacts play a crucial role in (and are partly responsible for) the way people work, play, and act.

This paper champions a methodology which includes the fundamentals of DiCog. Fig. 1 provides a diagram of an "integrated research methodology". Basically:
  1. Distributed cognition
  2. Experiment
  3. Ethnography
  4. Work materials
  5. Workplaces

Cognitive Walkthroughs: A Method for Theory-Based Evaluation of User Interfaces, Polson, Lewis, Rieman, and Wharton

Exhaustive resource paper on cognitive walkthroughs. A discount method in terms of time with users and development time, though the method itself is very time-consuming and repetitive for the evaluators.

Training Wheels in a User Interface, John Carroll and Caroline Carrithers
BibTex: trainingwheelscarroll

Ease people into the interface/system. Progressive.
  • When do the training wheels come off?
  • Small subject group.
  • Gives a mental model to project the task onto.
  • How to determine basic features vs. advanced?
  • Transition w/ warnings? Prevent vs. warn.
  • Is the constrained task really applicable?
  • The WP program was standard for the day - not necessarily overly complex.
  • Exploratory environment - users felt safe to try things.
  • Learning styles - plodder or brute force.

The Spreadsheet Interface: A Basis for End User Programming, Bonnie Nardi and James Miller
BibTex: spreadsheetnardi

Ethnography influence from Nardi.
  • Only cite 2 people. Not really a study; observations.
  • Allows users to be productive immediately.
  • Abstracts the programming concepts.
  • Users don't have to know all the functionality or low-level details to be productive.
  • User population not described well - what about non-techie people just calculating a mortgage, making a gradebook, etc.?
  • Most users start out with a general conceptual problem and use the spreadsheet as a tool to understand it.
  • First use of "end-user programming"?
  • Flexible in display but "rigid" in model.

Last modified 14 March 2005 at 3:06 pm by Valerie Henderson