CS 449/649: Human-Computer Interaction

Jian Zhao

Estimated study time: 1 hr 9 min

Sources and References

Primary textbook — Sharp, H., Preece, J., & Rogers, Y. Interaction Design: Beyond Human-Computer Interaction (5th ed., Wiley, 2019).

Supplementary texts — Norman, D. The Design of Everyday Things (Revised and expanded ed., Basic Books, 2013); Brown, T. Change by Design: How Design Thinking Transforms Organizations and Inspires Innovation (Revised ed., Harper Business, 2019); Cooper, A., Reimann, R., Cronin, D., & Noessel, C. About Face: The Essentials of Interaction Design (4th ed., Wiley, 2014).

Online resources — Nielsen Norman Group (nngroup.com) articles on usability, UX research, and design heuristics; Interaction Design Foundation (interaction-design.org) encyclopedia and method guides; Stanford d.school design thinking resources; Georgia Tech CS 6750 HCI course materials (Dr. David Joyner); MIT 6.831 User Interface Design and Implementation lecture notes; CMU 05-391 HCI course materials; UC Berkeley CS 160 User Interface Design resources; Erik Kennedy, “7 Rules for Creating Gorgeous UI” (learnui.design).


Chapter 1: Foundations of Human-Computer Interaction

1.1 What Is HCI?

Human-Computer Interaction is the study, design, and evaluation of the interfaces between people and computers. The field draws on computer science, cognitive psychology, design, and the social sciences to understand how people use technology and how to create systems that are effective, efficient, and satisfying. HCI emerged as a distinct discipline in the early 1980s when personal computing made interface design a practical concern for millions of users rather than a niche problem for trained operators.

The scope of HCI extends far beyond traditional desktop software. Modern HCI encompasses mobile devices, wearable technology, voice interfaces, augmented and virtual reality, tangible computing, and AI-powered systems. As computing becomes embedded in everyday objects and environments, the boundaries of HCI continue to expand.

1.2 UI versus UX

Two terms that frequently arise in HCI discussions are user interface (UI) and user experience (UX), and though they are related, they are not synonymous. The user interface is the point of contact between a person and a digital product — the screens, buttons, icons, typography, color schemes, and layouts that a user sees and manipulates. UI design focuses on the visual and interactive elements: making controls legible, arranging elements in a logical spatial hierarchy, and ensuring that the product looks polished and consistent.

User experience, by contrast, encompasses the entire journey a person has with a product, from first hearing about it to installing, learning, using, troubleshooting, and eventually abandoning it. UX includes emotional responses, expectations, the smoothness of onboarding, error recovery, and overall satisfaction. A product can have a visually stunning interface yet deliver a poor experience if the underlying workflow is confusing, the system is slow, or the information architecture forces users to hunt for features.

The relationship between the two is sometimes described with an analogy: UI is the saddle, stirrups, and reins, while UX is the entire feeling of riding the horse. Good UI is necessary for good UX, but it is not sufficient.

1.3 Good and Bad Design

Every designed artifact embodies assumptions about its users, and those assumptions can either support or obstruct the people who interact with it. Don Norman’s concept of a Norman door — a door whose design signals the wrong action, such as a flat plate that invites pushing when the door must be pulled — illustrates how even the simplest physical interfaces can fail when form contradicts function.

Good design, in Norman’s framework, rests on several pillars: discoverability (can users figure out what actions are possible?), feedback (does the system communicate the result of an action?), and conceptual models (does the user form an accurate mental picture of how the system works?). A well-designed product feels almost invisible; the user focuses on their task, not on the tool. A badly designed product forces the user to think about the tool itself, creating friction that ranges from minor annoyance to catastrophic error.

The study of HCI formalizes these intuitions into methods that can be applied systematically. Rather than relying on a designer’s taste alone, HCI provides structured processes for understanding users, generating design alternatives, building prototypes, and evaluating solutions with real people.

1.4 A Brief History of HCI

The history of HCI tracks the widening circle of who uses computers and for what purpose. In the 1940s and 1950s, interaction meant flipping physical switches and reading punched cards. The 1960s saw the advent of command-line interfaces, which required users to memorize arcane syntax but allowed more fluid dialogue with the machine. Doug Engelbart’s 1968 “Mother of All Demos” introduced the mouse, hypertext, video conferencing, and collaborative editing — ideas that would take decades to reach mainstream products.

The graphical user interface (GUI) emerged at Xerox PARC in the 1970s, was commercialized by Apple in 1984, and became ubiquitous with Microsoft Windows in the 1990s. GUIs shifted interaction from recall (typing commands from memory) to recognition (selecting from visible options), dramatically lowering the barrier to computer use.

The 2000s brought mobile and touch interfaces, catalyzed by the iPhone in 2007, which replaced the physical keyboard with a direct-manipulation touchscreen. The 2010s saw the rise of voice assistants (Siri, Alexa, Google Assistant), wearable computing (smartwatches, fitness trackers), and conversational AI. The 2020s have introduced generative AI interfaces — systems like ChatGPT, Copilot, and Midjourney that produce text, code, and images from natural-language prompts, raising new interaction design challenges around trust, control, and co-creation.


Chapter 2: Design Thinking and the Design Process

2.1 What Is Design Thinking?

Design thinking is a human-centered approach to innovation that integrates the needs of people, the possibilities of technology, and the requirements of business success. Tim Brown, CEO of IDEO, popularized the term in a 2008 Harvard Business Review article, describing it as a methodology that “uses the designer’s sensibility and methods to match people’s needs with what is technologically feasible and what a viable business strategy can convert into customer value.”

Design thinking is not a linear recipe but a set of overlapping spaces that teams revisit as understanding deepens. Brown identifies three spaces:

  1. Inspiration — the problem or opportunity that motivates the search for solutions. This involves empathizing with users and defining the core challenge.
  2. Ideation — the process of generating, developing, and testing ideas. This involves brainstorming, prototyping, and iterating.
  3. Implementation — the path from the project stage into people’s lives. This involves refining, producing, and deploying the solution.

The Stanford d.school articulates a similar five-stage model: Empathize, Define, Ideate, Prototype, Test. These stages are not sequential checkpoints but modes of activity that overlap, loop back, and repeat as the design team learns.

2.2 The Double Diamond

The British Design Council’s Double Diamond model visualizes the design process as two phases of divergent and convergent thinking. In the first diamond, the team discovers (diverging to explore the problem space broadly) and then defines (converging on a specific problem statement). In the second diamond, the team develops (diverging to generate many possible solutions) and then delivers (converging on a refined, tested solution).

The model captures an essential rhythm: opening up before narrowing down. Premature convergence — jumping to a solution before understanding the problem — is one of the most common design failures. The double diamond disciplines teams to invest in understanding before investing in building.

2.3 T-Shaped People and Interdisciplinary Teams

Brown advocates for T-shaped people: individuals who have deep expertise in one discipline (the vertical stroke of the T) and broad empathy and curiosity across many disciplines (the horizontal stroke). HCI teams typically combine skills in visual design, interaction design, user research, engineering, and domain expertise. The breadth of the T enables team members to understand each other’s contributions and collaborate effectively; the depth ensures that each contribution is rigorous.

2.4 Ethical Considerations in Design Research

Before engaging users in any research activity — interviews, observations, usability tests — designers must navigate ethical obligations. In Canada, the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans (TCPS 2) governs research ethics at universities. Key principles include:

  • Respect for persons: participants must give informed, voluntary consent and may withdraw at any time.
  • Concern for welfare: research must minimize risks and maximize benefits to participants.
  • Justice: the burdens and benefits of research should be distributed fairly.

In practice, this means obtaining ethics board approval, preparing consent forms, anonymizing data (using codes like P1, P2 rather than names), storing data securely, and being transparent about how findings will be used.


Chapter 3: Understanding Users

3.1 Why User Research Matters

The premise of human-centered design is that good products emerge from deep understanding of the people who will use them. Designers are not their users; what seems obvious or intuitive to someone immersed in a product’s internal logic may be baffling to an outsider. User research bridges this gap by grounding design decisions in evidence rather than assumption.

3.2 Personas

A persona is a fictional but data-informed character that represents a distinct segment of a product’s users. Pioneered by Alan Cooper in the late 1990s, personas distill research findings into a narrative form that design teams can empathize with and design for.

Cooper’s goal-directed design methodology distinguishes personas from demographic profiles. A persona is not a statistical average; it is an archetype defined primarily by goals — what the user is trying to accomplish — and behaviors — the patterns of action the user exhibits. A well-constructed persona has a name, a photograph (or illustration), a brief biography, a set of goals (both practical and emotional), and a description of the context in which they use the product.

Cooper identifies three types of goals that drive persona behavior:

  • Experience goals: how the user wants to feel (e.g., confident, in control, not stupid).
  • End goals: what the user wants to accomplish (e.g., manage all client projects in one place).
  • Life goals: who the user wants to be (e.g., a respected creative professional).

This hierarchy matters because experience goals are the most universal and stable — a frustrated user will abandon even a functionally correct tool.

Effective personas share several traits:

  • They are based on real research data (interviews, observations), not imagination.
  • They are specific enough to guide decisions. “Sarah, a 34-year-old freelance graphic designer who manages projects for 3–5 clients simultaneously” is more useful than “a professional user.”
  • They are distinct. If two personas would make the same design choices, they should be merged.
  • They include frustrations and pain points, not just aspirations.

A typical project creates three to five personas. One is designated the primary persona, whose needs take priority when design trade-offs arise. Secondary personas share most of the primary persona’s needs but have additional requirements. Negative personas — people the product is explicitly not designed for — can also be useful for bounding scope.

Cooper warns against the elastic user anti-pattern: using the vague phrase “the user” in design discussions, which allows each team member to project their own preferences and assumptions. Because “the user” can be stretched to justify any design decision, it produces incoherent products. Personas force specificity — when “Sarah” replaces “the user,” the team must confront concrete trade-offs.

A related insight from Cooper is the concept of perpetual intermediates. Most users never become experts; they pass through a brief beginner phase and then settle into a long intermediate plateau. Designing primarily for beginners produces hand-holding that annoys intermediates; designing for experts produces power tools that alienate everyone else. The sweet spot is designing for the perpetual intermediate while providing gentle onboarding and expert shortcuts.

3.3 Empathy Maps

An empathy map is a collaborative visualization that articulates what a team knows about a particular type of user. Developed at XPLANE and popularized by Dave Gray, the empathy map divides a user’s experience into four quadrants:

  1. Says — direct quotes or paraphrases from user interviews.
  2. Thinks — what the user might be thinking but not saying. Beliefs, concerns, aspirations.
  3. Does — observable actions and behaviors.
  4. Feels — emotional states — frustrated, confident, anxious, delighted.

The center of the map identifies the user (often a persona) and the task or situation being mapped. Some versions add two additional sections at the bottom: Pains (fears, frustrations, obstacles) and Gains (wants, needs, measures of success), borrowing from Osterwalder’s Value Proposition Canvas.

Empathy maps are most valuable when completed after user interviews, using real data rather than speculation. They serve as a shared reference point for the team, making abstract user needs tangible and visible on a wall or whiteboard.

3.4 Value Propositions

A value proposition articulates why a user should choose this product over alternatives. The Value Proposition Canvas, from Strategyzer, maps the relationship between what the user needs (customer profile: jobs, pains, gains) and what the product offers (value map: products/services, pain relievers, gain creators). A strong value proposition demonstrates clear fit between the two sides — the product’s features directly address the user’s most important pains and gains.

In a design studio setting, the value proposition is typically one of the first artifacts a team produces, because it forces the team to articulate the problem they are solving and for whom. It also provides a benchmark against which later design decisions can be evaluated: does this feature support the value proposition, or is it scope creep?


Chapter 4: User Research Methods

4.1 Interviews

User interviews are one of the most widely used methods in HCI research. They provide rich, qualitative data about users’ goals, behaviors, frustrations, and mental models. Interviews can be structured (fixed questions in a fixed order), semi-structured (a guide with room for follow-up), or unstructured (open conversation around a topic). Semi-structured interviews are the most common in design research because they balance consistency with flexibility.

Effective interview practice follows several guidelines:

  • Open-ended questions yield richer data than yes/no questions. “Tell me about a time when…” is more productive than “Do you like…?”
  • Follow-up probes — “Can you say more about that?” or “Why do you think that happened?” — uncover the reasoning behind surface-level answers.
  • Avoid leading questions that suggest a desired answer. “Don’t you find it frustrating when…?” biases the response.
  • Active listening means paraphrasing what the participant said to confirm understanding, maintaining eye contact, and allowing comfortable silences.

The logistics of interviews also matter. Sessions typically last 30–60 minutes. The interviewer should obtain informed consent, explain the purpose of the study, and assure confidentiality before beginning. Recording (audio or video) is valuable for later analysis but requires explicit permission. After the session, a thank-you message maintains goodwill for future contact.

Jakob Nielsen’s research suggests that five users are sufficient for qualitative usability testing, because the proportion of usability problems discovered follows a curve of diminishing returns: the first user reveals approximately one-third of all issues, and by the fifth user, roughly 85% of problems have surfaced. Beyond five, each additional participant reveals fewer new insights relative to the cost of recruiting and testing them. This finding applies to formative usability testing with a relatively homogeneous user group; larger samples are needed for quantitative benchmarking or when the user population is highly diverse.
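Nielsen and Landauer’s problem-discovery model underlies this curve: if each test user independently exposes a given usability problem with probability L (about 31% was their published average), the expected share of problems found after n users is 1 − (1 − L)^n. A quick sketch in Python:

```python
def proportion_found(n_users, p_find=0.31):
    """Expected share of usability problems surfaced after n_users sessions,
    per Nielsen & Landauer's model: 1 - (1 - L)^n.
    The default L of 0.31 is their published average, not a universal constant.
    """
    return 1 - (1 - p_find) ** n_users

for n in range(1, 8):
    print(f"{n} users: {proportion_found(n):.0%} of problems found")
```

At n = 5 the model predicts roughly 84–85% discovery, which is why several small five-user rounds across design iterations tend to beat one large study.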

4.2 Observations

Observation involves watching users in their natural environment as they perform tasks related to the product domain. Unlike interviews, which rely on self-report, observations capture actual behavior — including workarounds, errors, and habits that users may not think to mention.

Observation can be direct (the researcher is physically present, taking notes) or indirect (using video recordings, screen captures, or analytics). The researcher must decide on the degree of participation: a fly-on-the-wall approach minimizes interference but may miss context, while participant observation (joining the activity) provides deeper understanding but risks influencing behavior.

The AEIOU framework, developed by Rick Robinson and colleagues at Doblin Group, provides a structured lens for observation:

  • Activities — goal-directed sets of actions people take.
  • Environments — the physical and social settings in which activities occur.
  • Interactions — exchanges between people, or between people and objects.
  • Objects — the tools, devices, and artifacts people use.
  • Users — the people themselves — their roles, relationships, values, and biases.

By systematically noting each dimension, researchers avoid the tunnel vision that comes from watching only the most obvious actions.

4.3 Questionnaires and Surveys

Questionnaires complement interviews by gathering data from larger samples at lower cost. They are most useful for measuring attitudes, preferences, demographics, and frequency of behaviors. Questionnaire design involves several considerations:

  • Question types: closed-ended (Likert scales, multiple choice, ranking) for quantitative analysis; open-ended for qualitative richness.
  • Question wording: avoid jargon, double-barreled questions (“How easy and enjoyable was the task?”), and loaded language.
  • Response scales: a 5-point or 7-point Likert scale is standard. Odd-numbered scales include a neutral midpoint; even-numbered scales force a directional choice.
  • Order effects: place easy, non-threatening questions first; group related items; randomize where appropriate.
  • Pilot testing: always test the questionnaire with a small group before deployment to catch ambiguous wording and technical glitches.

Standardized instruments exist for common HCI constructs. The System Usability Scale (SUS) is a 10-item Likert questionnaire that produces a single score from 0 to 100, with 68 as the approximate average. The NASA Task Load Index (NASA-TLX) measures subjective workload across six dimensions: mental demand, physical demand, temporal demand, performance, effort, and frustration.
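Both instruments have mechanical scoring rules: SUS subtracts 1 from each odd-numbered (positively worded) item, subtracts each even-numbered (negatively worded) item from 5, and multiplies the sum by 2.5; the common raw (unweighted) variant of NASA-TLX simply averages the six subscale ratings. A minimal sketch in Python, not a validated survey pipeline:

```python
def sus_score(responses):
    """System Usability Scale score from ten 1-5 Likert responses.

    Odd-numbered items are positively worded: contribution = response - 1.
    Even-numbered items are negatively worded: contribution = 5 - response.
    The summed contributions (0-40) are scaled by 2.5 onto a 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

def raw_tlx(ratings):
    """Raw (unweighted) NASA-TLX: the mean of the six subscale ratings."""
    if len(ratings) != 6:
        raise ValueError("NASA-TLX has six subscales")
    return sum(ratings) / 6

# Strong agreement with every positive item and strong disagreement
# with every negative item yields a perfect score:
print(sus_score([5, 1] * 5))  # -> 100.0
```

Note that a SUS score is not a percentage: 68 is average, so a 70 is unremarkable rather than a C-minus.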

4.4 Contextual Inquiry

Contextual inquiry, developed by Hugh Beyer and Karen Holtzblatt as part of their Contextual Design methodology, is a field research method structured as a master-apprentice relationship. The researcher visits the user’s workplace or environment and observes them performing real tasks, asking questions as they arise. The user is the “master” who demonstrates their expertise; the researcher is the “apprentice” who seeks to understand.

Contextual inquiry follows four principles:

  • Context: conduct the interview in the user’s actual environment, during actual work, not in a conference room.
  • Partnership: alternate between observing and asking questions, maintaining a collaborative rather than interrogative tone.
  • Interpretation: share your interpretations with the user in real time and ask them to correct or refine your understanding.
  • Focus: maintain a clear research focus to avoid being overwhelmed by the richness of the context.

The output of contextual inquiry feeds directly into affinity diagrams and work models, making it a cornerstone of the Contextual Design process.

4.5 Field Studies

Field studies examine technology use in real-world settings over extended periods. Unlike controlled lab experiments, field studies sacrifice experimental control for ecological validity — the findings reflect how products actually function in the complexity of daily life. Methods include diary studies (participants log their experiences at regular intervals), experience sampling (random prompts to capture in-the-moment states), and longitudinal deployment studies (installing a prototype and observing adoption over weeks or months).


Chapter 5: Synthesizing Research Data

5.1 From Data to Insight

Raw research data — interview transcripts, observation notes, survey responses — is voluminous and unstructured. Synthesis is the process of organizing, interpreting, and distilling this data into actionable insights that can guide design. The goal is not merely to summarize what participants said, but to identify patterns, contradictions, and unmet needs that point toward design opportunities.

5.2 Affinity Diagrams

An affinity diagram (also called an affinity map or KJ method, after its inventor Jiro Kawakita) is a bottom-up clustering technique for organizing qualitative data. The process works as follows:

  1. Extract data points: Write individual observations, quotes, or insights on sticky notes — one idea per note.
  2. Cluster by similarity: Without pre-defined categories, team members silently group notes that seem related. The process is spatial and intuitive, relying on pattern recognition rather than logical classification.
  3. Label clusters: Once groups stabilize, the team gives each cluster a descriptive label that captures its theme.
  4. Identify relationships: Arrange clusters spatially to reveal connections, hierarchies, and gaps.

Affinity diagrams work well with teams because they democratize analysis — every team member’s observations carry equal weight, and the silent clustering phase prevents dominant voices from dictating the structure. The output is a visual map of the research landscape that can be photographed, referenced throughout the project, and updated as new data arrives.

A common color-coding scheme uses yellow for observations, blue for insights or interpretations, red for pain points, and green for opportunities or ideas.
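The clustering itself happens on a wall, but the note structure is simple to mirror digitally for tallying and later reference. A hypothetical sketch (note texts are invented for illustration) that groups notes by the color scheme above:

```python
from collections import defaultdict

# Each note: (color, text). Colors follow the scheme above:
# yellow = observation, blue = insight, red = pain point, green = opportunity.
notes = [
    ("yellow", "P1 keeps client deadlines in a paper notebook"),
    ("red",    "P2 missed a deliverable after switching tools mid-project"),
    ("blue",   "Participants seem to distrust app notifications"),
    ("green",  "A single cross-client timeline view"),
    ("red",    "P3 re-enters the same data in three different tools"),
]

by_color = defaultdict(list)
for color, text in notes:
    by_color[color].append(text)

for color, texts in sorted(by_color.items()):
    print(f"{color}: {len(texts)} note(s)")
```

A count like this helps spot imbalances, such as many pain points with few matching opportunities, but it is no substitute for the team's spatial clustering.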

5.3 Card Sorting

Card sorting is a technique for understanding how users organize information. Participants are given a set of cards, each labeled with a content item (a page title, feature name, or concept), and asked to group them in a way that makes sense to them.

In an open card sort, participants create their own group labels, revealing their mental model of the domain. In a closed card sort, the researcher provides predefined categories and participants assign cards to them, testing whether a proposed information architecture matches user expectations. Hybrid approaches allow participants to both use predefined categories and create new ones.

Card sorting is most valuable during the early stages of information architecture design, when the team needs to decide how to organize navigation menus, site maps, or feature hierarchies. For qualitative insights, 15 or more participants are recommended; for statistical confidence in quantitative analysis, 30–50 participants are needed. Keeping the card count under 50 prevents participant fatigue. Card sorting is often paired with tree testing — a complementary method where users navigate a text-only hierarchy to find specific items, validating whether the organization discovered through card sorting actually works in practice.
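Open card sort results are commonly aggregated into a co-occurrence (similarity) matrix: for each pair of cards, the share of participants who placed them in the same group. A minimal sketch, with hypothetical card names and sort data:

```python
from itertools import combinations

# Each participant's sort: a list of groups, each group a set of card names.
# The cards and groupings below are invented for illustration.
sorts = [
    [{"Invoices", "Payments"}, {"Projects", "Deadlines"}],
    [{"Invoices", "Payments", "Projects"}, {"Deadlines"}],
    [{"Invoices", "Payments"}, {"Projects"}, {"Deadlines"}],
]

def co_occurrence(sorts):
    """Fraction of participants who placed each card pair in the same group."""
    counts = {}
    for groups in sorts:
        for group in groups:
            for a, b in combinations(sorted(group), 2):
                counts[(a, b)] = counts.get((a, b), 0) + 1
    return {pair: n / len(sorts) for pair, n in counts.items()}

for (a, b), share in sorted(co_occurrence(sorts).items()):
    print(f"{a} + {b}: {share:.0%}")
```

Pairs with high co-occurrence are candidates for the same navigation category; disputed pairs flag items whose placement should be checked with tree testing.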

5.4 Point of View and How Might We

After synthesis, the team formulates a Point of View (POV) statement that captures the design challenge in human terms. A POV follows the template:

[User] needs [need] because [insight].

For example: “A busy freelance designer needs a way to track deadlines across multiple clients because juggling separate tools leads to missed deliverables and client frustration.”

The POV reframes the problem around user needs rather than business goals or technical constraints. From the POV, the team generates How Might We (HMW) questions — open-ended prompts that invite creative solutions without prescribing them:

  • “How might we help freelancers visualize their commitments across all clients at a glance?”
  • “How might we reduce the cognitive load of switching between project management tools?”

HMW questions are deliberately broad enough to invite divergent thinking but specific enough to be actionable. They serve as launch pads for brainstorming sessions.


Chapter 6: Ideation and the Creative Process

6.1 Divergent Thinking

Ideation is the generation of a large quantity of diverse ideas that might address the design challenge. The emphasis is on quantity and variety, not quality — evaluation comes later. The principle of divergent thinking, articulated by psychologist J.P. Guilford, holds that creative solutions emerge from exploring many directions rather than pursuing a single path.

Several brainstorming techniques support divergent thinking:

  • Classic brainstorming: a group generates ideas freely, building on each other’s suggestions. Rules: defer judgment, encourage wild ideas, build on others’ ideas, go for quantity.
  • Brainwriting: each person writes ideas on paper and passes them to the next person, who builds on them. This avoids the social dynamics that can suppress quieter voices in verbal brainstorming.
  • Crazy 8s: each participant folds a sheet of paper into eight panels and sketches eight different ideas in eight minutes (one minute per panel). The time pressure forces rapid, unpolished thinking and prevents over-commitment to a single concept.
  • SCAMPER: a checklist of idea-modification strategies — Substitute, Combine, Adapt, Modify, Put to another use, Eliminate, Reverse.

6.2 Design Fixation

Design fixation is the tendency for designers to become attached to a particular solution — especially an early one — and to have difficulty generating alternatives. Jansson and Smith’s 1991 study demonstrated this experimentally: when given an example solution alongside a design brief, engineering students produced solutions that closely resembled the example, even when instructed to be creative. The example anchored their thinking.

Fixation is dangerous because it narrows the solution space prematurely. Strategies to mitigate it include:

  • Generating multiple concepts before evaluating any.
  • Deliberately seeking analogies from distant domains (how would a chef / a pilot / a child solve this?).
  • Critiquing example solutions explicitly before generating new ones.
  • Using structured ideation techniques (like Crazy 8s) that enforce variety.

Adam Grant’s research on “original thinkers” reinforces this point: the most creative people are not those who have the best first idea, but those who produce many ideas and select wisely. Quantity breeds quality because it forces exploration beyond the obvious.

6.3 User Stories and Scenarios

A user story is a short, plain-language description of a feature from the user’s perspective:

As a [type of user], I want to [action] so that [benefit].

User stories originated in agile software development but are widely used in HCI to keep design grounded in user needs. Each story captures a unit of value, and the collection of stories for a product defines its feature scope.

An epic is a large user story that can be broken into smaller stories. For example, the epic “As a freelancer, I want to manage all my projects in one place” might decompose into stories about creating projects, setting deadlines, tracking time, and invoicing clients.

A user scenario elaborates a user story into a narrative that describes the context, motivation, and sequence of actions:

Sarah is working from a coffee shop on a Tuesday afternoon. She has three client deliverables due this week and realizes she hasn’t checked whether the Acme Corp logo revisions were approved. She opens the app, taps the Acme project, and sees a notification that the client approved the revisions yesterday…

Scenarios make abstract requirements concrete, helping designers and stakeholders envision how the product fits into real life.

6.4 Storyboards

A storyboard is a comic-strip-style sequence of sketches that illustrates a user’s journey through a scenario. Each frame shows a key moment: the setting, the user’s action, the system’s response, and the user’s emotional state. Storyboards are valuable because they communicate design intent quickly to diverse audiences — stakeholders who might not read a requirements document can understand a storyboard at a glance.

Effective storyboards include three elements:

  • Setting: where and when the interaction takes place, establishing context.
  • Sequence: the step-by-step progression of actions and responses.
  • Satisfaction: the resolution, showing how the product addressed the user’s need.


Chapter 7: Design Principles and Affordances

7.1 Norman’s Design Principles

Don Norman’s The Design of Everyday Things (originally published in 1988 as The Psychology of Everyday Things) articulates a set of principles that remain foundational to interaction design:

  • Affordance: The relationship between the properties of an object and the capabilities of the agent that determine how the object could possibly be used. A chair affords sitting; a button affords pressing. Affordances exist whether or not the user perceives them.
  • Signifier: A perceivable cue that communicates where an action should take place and what will happen. While affordances are relationships, signifiers are the signals that make affordances discoverable. A “Push” label on a door is a signifier; the door’s hinge placement (enabling pushing) is the affordance.
  • Mapping: The relationship between controls and their effects. Natural mappings exploit spatial analogies: a row of light switches arranged to mirror the spatial layout of the lights they control is easier to use than an arbitrary arrangement.
  • Feedback: Communicating the results of an action back to the user. Feedback must be immediate, informative, and not overwhelming. A button that visually depresses when clicked, a progress bar during a file upload, and a confirmation sound when a message sends are all forms of feedback.
  • Constraints: Limiting the possible actions to prevent errors. Physical constraints (a USB-C plug fits only one way), logical constraints (graying out unavailable menu items), and cultural constraints (red means stop/danger) all guide users toward correct behavior.
  • Conceptual Model: The user’s understanding of how a system works. A good conceptual model allows users to predict the effects of their actions. Desktop interfaces exploit the conceptual model of a physical desk — files, folders, a trash can — to make digital operations intuitive.

7.2 The Affordance Debate

Norman’s use of the term “affordance” has a complex history. The concept originates with ecological psychologist James J. Gibson, who defined affordances as action possibilities that exist in the environment relative to an animal’s capabilities — independent of perception. A cliff affords falling whether or not the animal perceives the danger.

When Norman introduced the concept to design in 1988, he emphasized perceived affordances — what users think they can do with an object. A flat glass panel on a touchscreen has no physical affordance for pressing, but its visual design (a raised-looking button graphic) creates a perceived affordance. This shift from Gibson’s objective affordances to Norman’s perceived affordances caused confusion, because designers began using “affordance” to mean “visual cue,” which is not what Gibson intended.

In a 1999 article, Norman acknowledged the confusion and distinguished real affordances from perceived affordances, urging designers to be precise about which they meant. In his 2008 essay “Signifiers, Not Affordances,” he went further, proposing the term signifier for what designers actually control: the perceptible signals that indicate where to act. The design community, he argued, should focus on signifiers rather than affordances, because signifiers are what designers deliberately craft, while affordances are properties of the world that exist regardless of design intent.

In the 2008 essay, Norman also introduces the concept of social signifiers — cues that arise not from deliberate design but from the behavior of other people. A worn path across a lawn is a social signifier indicating where people actually walk, regardless of where the architect placed the sidewalk. A crowd gathering at one end of a train platform signals something — perhaps that’s where the exit is. Digital examples include view counts, “most popular” badges, and the fact that a GitHub repository has thousands of stars. Social signifiers are powerful because they carry implicit social proof, but they can also mislead when the crowd is wrong.

Joanna McGrenere and Wayne Ho’s 2000 paper “Affordances: Clarifying and Evolving a Concept” traced this evolution systematically. They argued that Gibson’s and Norman’s definitions serve different purposes: Gibson’s is useful for understanding the relationship between agents and environments, while Norman’s (or rather, the concept of signifiers) is useful for guiding design practice. The two are complementary, not contradictory, as long as the terminology is used precisely.

7.3 Norman’s Gulf Model

Norman’s seven-stage model of action describes the cognitive cycle a user goes through when interacting with any system:

  1. Goal — forming an intention (what do I want?).
  2. Plan — deciding on an action strategy.
  3. Specify — translating the plan into a specific action sequence.
  4. Perform — executing the action.
  5. Perceive — observing the result.
  6. Interpret — making sense of the observation.
  7. Compare — evaluating whether the goal was achieved.

This cycle reveals two critical gaps where design can fail:

The Gulf of Execution is the gap between what the user wants to do and what the system allows them to do. A wide gulf means the user cannot figure out how to act — the controls are hidden, the labels are cryptic, or the action sequence is unintuitive. Good signifiers, natural mappings, and constraints narrow this gulf.

The Gulf of Evaluation is the gap between the system’s actual state and the user’s ability to perceive and interpret that state. A wide gulf means the user cannot tell what happened — did the action succeed? What changed? Good feedback, visible system status, and clear conceptual models narrow this gulf.

The two gulfs map directly to Nielsen’s heuristics: “visibility of system status” and “match between system and real world” address the Gulf of Evaluation, while “user control and freedom” and “recognition rather than recall” address the Gulf of Execution.

7.4 Gestalt Principles in Interface Design

The Gestalt principles of perceptual organization, developed by German psychologists in the early twentieth century, describe how humans naturally group visual elements. These principles are directly applicable to interface layout:

  • Proximity: elements placed close together are perceived as a group. In a form, labels positioned near their input fields are understood as belonging together.
  • Similarity: elements that share visual attributes (color, shape, size) are perceived as related. Consistent button styling signals that all buttons behave similarly.
  • Continuity: the eye follows smooth paths. Aligned elements (text in columns, items along a grid) feel organized even without explicit borders.
  • Closure: the mind completes incomplete shapes. A progress indicator showing 75% completion is perceived as a nearly-full whole rather than a partial arc.
  • Figure/Ground: the eye distinguishes foreground objects from background. Modal dialogs exploit this by dimming the background to draw attention to the foreground panel.
  • Common Region: elements enclosed within a boundary are perceived as a group. Card-based layouts (as on Pinterest or Trello) use this principle to separate distinct items.
  • Common Fate: elements that move together are perceived as a group. Animations that slide related elements in unison reinforce their connection.

7.5 Quantitative Laws: Hick’s Law and Fitts’s Law

Two quantitative laws from experimental psychology provide predictive power in interface design:

Hick’s Law (also Hick-Hyman Law) states that the time to make a decision increases logarithmically with the number of choices:

\[ T = a + b \cdot \log_2(n + 1) \]

where \( n \) is the number of equally probable alternatives. The design implication is that menus and option sets should be structured to minimize the number of choices presented at each level. Hierarchical menus, progressive disclosure, and sensible defaults all reduce decision time.

Fitts’s Law states that the time to move a pointer to a target is a function of the distance to the target and the target’s size:

\[ T = a + b \cdot \log_2\left(\frac{D}{W} + 1\right) \]

where \( D \) is the distance to the target and \( W \) is the target width. The design implication is clear: make frequently used targets large and position them close to where the cursor is likely to be. This explains why mobile interfaces use large touch targets (Apple recommends at least 44 × 44 points) and why menus attached to screen edges (which have effectively infinite width in one dimension) are fast to access.
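Both laws are simple enough to apply numerically. The sketch below computes predicted times from the two formulas; the constants a and b are hypothetical values chosen purely for illustration, since real coefficients must be fit empirically for a given device, task, and user population.

```python
import math

def hick_time(n, a=0.2, b=0.15):
    """Hick's Law: decision time for n equally probable choices.
    a, b are illustrative constants (seconds), not empirical fits."""
    return a + b * math.log2(n + 1)

def fitts_time(d, w, a=0.1, b=0.1):
    """Fitts's Law (Shannon formulation): time to acquire a target of
    width w at distance d. log2(d/w + 1) is the index of difficulty (bits)."""
    return a + b * math.log2(d / w + 1)

# Doubling a menu from 4 to 8 items adds under one "bit" of decision time:
print(round(hick_time(8) - hick_time(4), 2))  # 0.13 (seconds, with these constants)

# Halving target width at a fixed distance raises the index of difficulty,
# so the smaller target is predicted to take longer to hit:
print(fitts_time(400, 20) > fitts_time(400, 40))  # True
```

This is why logarithmic growth matters in practice: going from 4 to 8 menu items costs far less than doubling the raw item count might suggest, while shrinking a touch target hurts acquisition time at every distance.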

7.6 GOMS and the Model Human Processor

GOMS (Goals, Operators, Methods, Selection rules) is a family of models for predicting how long skilled users will take to complete tasks with a given interface. Developed by Card, Moran, and Newell in 1983, GOMS decomposes any task into:

  • Goals: what the user wants to accomplish (e.g., “delete a word”).
  • Operators: the elementary actions required (e.g., move cursor, select text, press Delete).
  • Methods: sequences of operators that achieve a goal (e.g., double-click to select word, then press Delete).
  • Selection rules: criteria for choosing between alternative methods (e.g., “if text is short, use backspace; if text is long, use select-all and delete”).

The Keystroke-Level Model (KLM), the simplest GOMS variant, assigns fixed time estimates to basic operators: keypress (~0.2s), pointing (~1.1s), mental preparation (~1.35s), and homing between keyboard and mouse (~0.4s). By summing these estimates, a designer can predict task completion time before building any prototype.
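The KLM arithmetic can be sketched directly. The operator times below are the approximate values quoted above; the example task encoding is a made-up illustration (a full KLM analysis also has rules for where to place mental-preparation operators, which this sketch ignores).

```python
# Approximate KLM operator times (seconds), as quoted above.
KLM_TIMES = {
    "K": 0.2,   # keypress
    "P": 1.1,   # point with the mouse
    "M": 1.35,  # mental preparation
    "H": 0.4,   # home hands between keyboard and mouse
}

def klm_estimate(ops):
    """Sum the operator times for a sequence of KLM operator codes."""
    return sum(KLM_TIMES[op] for op in ops)

# Hypothetical task: delete a word by double-clicking it, then pressing Delete.
# H (reach for mouse), M (locate word), P + K + K (point and double-click),
# H (return to keyboard), K (press Delete).
task = ["H", "M", "P", "K", "K", "H", "K"]
print(f"Predicted time: {klm_estimate(task):.2f} s")  # Predicted time: 3.85 s
```

Summing fixed estimates like this is exactly how a designer can compare, say, a mouse-driven method against a keyboard-shortcut method before building either one.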

Underlying GOMS is the Model Human Processor (MHP), which models human cognition as three interacting subsystems, each with a characteristic cycle time: the perceptual processor (~100ms), the cognitive processor (~70ms), and the motor processor (~70ms). These cycle times explain why users cannot respond to events faster than about 200–300ms and why animations shorter than ~100ms are perceived as instantaneous.

GOMS is most useful for comparing interface alternatives for well-defined, routine tasks. It is less useful for creative, exploratory, or error-prone tasks where skilled performance is not the primary concern.


Chapter 8: Information Architecture and Interaction Design

8.1 Information Architecture

Information architecture (IA) is the structural design of shared information environments — the organization, labeling, navigation, and search systems that help people find and understand information. Richard Saul Wurman coined the term in 1975, and Peter Morville and Louis Rosenfeld’s Information Architecture for the Web and Beyond is the field’s standard reference.

IA operates at a level above visual design: before deciding what a page looks like, the architect must decide what pages exist, how they relate to each other, and how users move between them. The core components of IA are:

  • Organization systems: how information is structured — alphabetical, chronological, topical, task-based, or audience-based.
  • Labeling systems: how information is represented — the names of categories, links, and headings.
  • Navigation systems: how users move through information — global navigation, local navigation, breadcrumbs, site maps.
  • Search systems: how users look for information when browsing fails — search bars, filters, faceted search.

8.2 User Flows

A user flow is a diagram that maps the path a user takes through a product to accomplish a specific task. Each node represents a screen or decision point, and each edge represents a transition triggered by a user action (tap, swipe, submit) or system event (notification, error). User flows help designers ensure that every task has a clear, efficient path and that no dead ends exist.

User flows are distinct from wireflows, a hybrid deliverable described by Page Laubheimer at Nielsen Norman Group. Wireflows combine the screen layouts of wireframes with the sequential logic of flowcharts, showing what each screen looks like alongside how users move between them. Each step displays a screen mockup, and arrows originate from specific UI elements (hotspots) — not from the screen as a whole — pointing to the resulting screen state. This specificity reduces ambiguity about which element triggers which transition.

Wireflows are particularly valuable for mobile apps and dynamic web applications where screens change content based on interaction (AJAX updates, filter results, modal dialogs). For desktop interfaces, Laubheimer recommends showing only the changed portion of the screen rather than redrawing the full page. Wireflows are less useful for static multi-page websites where standard page-to-page navigation is straightforward.

8.3 Navigation Patterns

Mobile interfaces have converged on several standard navigation patterns:

  • Tab bar (bottom navigation): a persistent row of 3–5 icons at the bottom of the screen, providing one-tap access to top-level sections. Suitable when the app has a small number of equally important destinations.
  • Hamburger menu (drawer navigation): a hidden menu accessed by tapping a three-line icon, typically in the top-left corner. Saves screen space but reduces discoverability — users may not explore features they cannot see.
  • Hub-and-spoke: a home screen (hub) with links to independent sections (spokes). Users always return to the hub before navigating to another section. Common in utility apps and settings screens.
  • Full-screen navigation: the entire screen serves as a navigation menu before content is displayed. Used in apps with a small number of distinct modes (e.g., a music app showing library, search, and radio as full-screen options).

Research consistently finds that visible, persistent navigation outperforms hidden navigation in terms of discoverability and task completion. The hamburger menu, while space-efficient, should be used with caution: items buried behind it receive significantly fewer clicks than items visible in a tab bar.


Chapter 9: Prototyping

9.1 Why Prototype?

A prototype is a tangible representation of a design idea, created to explore, communicate, and test concepts before committing to full implementation. Prototyping is central to design thinking because it transforms abstract ideas into something people can interact with, critique, and improve.

The mantra “fail early and fail cheaply” captures the economic logic of prototyping. Discovering a fundamental flaw in a paper sketch costs minutes; discovering the same flaw in a shipped product costs months and millions. Prototypes are expendable by design — their purpose is to generate learning, not to become the final product.

9.2 Fidelity Spectrum

Prototypes exist on a spectrum of fidelity — how closely they resemble the final product in terms of visual design, interactivity, and content:

Low-fidelity prototypes are rough, quick, and inexpensive. They include:

  • Paper prototypes: hand-drawn screens on paper or index cards. A team member plays the “human computer,” swapping pages in response to the user’s simulated taps. Paper prototypes are excellent for testing navigation flow, screen layout, and task structure without any software tools.
  • Wireframes: simplified digital layouts showing the placement of elements (text blocks, images, buttons) without color, typography, or real content. Wireframes communicate structure and hierarchy.

Medium-fidelity prototypes add some interactive behavior:

  • Clickable wireframes: wireframe screens linked together so that tapping a button transitions to the next screen. Tools like Figma, Balsamiq, and Axure support this.
  • Wizard of Oz prototypes: a human behind the scenes simulates system intelligence (e.g., typing responses in a “chatbot” demo while the user believes they are talking to AI).

High-fidelity prototypes closely resemble the final product:

  • Interactive mockups: pixel-perfect screens with realistic content, full color, typography, and animations, linked into a clickable flow. Figma, Sketch, and Adobe XD are common tools.
  • Functional prototypes: coded implementations with real (or realistic) data and behavior, used for late-stage testing.

The appropriate fidelity depends on the question being asked. Early-stage questions (“Is the navigation structure intuitive?”) are best answered with low-fidelity prototypes, because high-fidelity polish distracts participants from structural issues and discourages honest criticism (people are reluctant to criticize something that looks finished). Late-stage questions (“Does the animation timing feel right?”) require high-fidelity prototypes.

9.3 The Sketch-Prototype Continuum

Bill Buxton, in Sketching User Experiences, argues that designers should distinguish between sketches and prototypes based on their purpose, not their fidelity. A sketch is quick, disposable, ambiguous, and suggestive — it invites exploration and says “what if?” A prototype is more refined, specific, and testable — it asks “does this work?” The two occupy a continuum, and the designer’s job is to match the artifact’s fidelity to the current level of uncertainty.

Buxton’s central maxim — “get the right design before getting the design right” — captures this philosophy. Early in a project, when uncertainty is high, the team should produce many sketches to explore the problem space broadly. Investing in high-fidelity prototypes too early is wasteful because it anchors the team to a specific solution before the problem is well understood. As the design converges, fidelity should increase progressively, with each jump in fidelity driven by specific questions that lower-fidelity artifacts cannot answer.

This continuum also explains the role of Wizard of Oz prototyping, where a human operator behind the scenes simulates system intelligence while the user believes they are interacting with a real system. A Wizard of Oz prototype has the external fidelity of a working system but requires no actual implementation. It is particularly useful for testing AI-powered features, voice interfaces, and recommendation systems before committing to the engineering effort of building them.

9.4 Parallel Prototyping

Steven Dow and colleagues’ CHI 2010 study demonstrated experimentally that parallel prototyping — creating and testing multiple design alternatives simultaneously — produces better outcomes than serial prototyping (iterating on a single design). In their study, participants designed web advertisements under identical time constraints. The parallel group created multiple different designs simultaneously before receiving feedback, while the serial group iterated on one design at a time. Expert judges rated the parallel group’s final outputs significantly higher, and the designs received higher click-through rates. Crucially, parallel prototypers also reported greater self-efficacy — they felt more confident in their design abilities — while serial prototypers became emotionally attached to their single concept and were more defensive about criticism.

The mechanism is related to design fixation: serial iteration tends to converge on incremental improvements to the first idea, while parallel exploration forces the designer to consider fundamentally different approaches. Even under time pressure, parallel prototyping outperforms serial iteration, because the cost of creating an additional rough prototype is low and the benefit of exploring the design space is high.

Tohidi and colleagues’ CHI 2006 study “Getting the Right Design and the Design Right” reinforced this finding from the evaluation side: showing users multiple design alternatives in a test session elicits richer, more critical, and more actionable feedback than showing a single design. When presented with one prototype, participants tended to be polite and uncritical; when presented with several, they engaged in comparative analysis and were more comfortable expressing preferences. The implication is that even rough alternatives are more valuable for evaluation than a single polished prototype.

9.5 Design Systems

A design system is a collection of reusable components and standards that ensure visual and behavioral consistency across a product. A design system typically includes:

  • Color palette: primary, secondary, and neutral colors with defined usage rules.
  • Typography: font families, sizes, weights, and line heights for headings, body text, captions, and buttons.
  • Grid and spacing: a layout grid (e.g., 8-point grid) that governs margins, padding, and alignment.
  • Components: buttons, input fields, cards, navigation bars, modals, and other building blocks, each with defined states (default, hover, active, disabled, error).
  • Icons and imagery: a consistent icon set and guidelines for photography or illustration style.

Design systems scale design quality: once established, any team member can assemble new screens that look and behave consistently without consulting the original designer. Major examples include Google’s Material Design, Apple’s Human Interface Guidelines, and IBM’s Carbon Design System.
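In practice, design systems are increasingly encoded as machine-readable “design tokens” that both design tools and production code consume. The sketch below uses a hypothetical token set (all names and values invented for illustration) plus a check that spacing tokens respect an 8-point grid.

```python
# Hypothetical design tokens; names and values are illustrative only.
TOKENS = {
    "color": {"primary": "#1A73E8", "danger": "#D93025", "text": "#202124"},
    "font_size": {"caption": 12, "body": 16, "heading": 24},
    "spacing": {"xs": 8, "sm": 16, "md": 24, "lg": 32, "xl": 48},
}

def on_grid(tokens, base=8):
    """Verify that every spacing token is a multiple of the base grid unit."""
    return all(v % base == 0 for v in tokens["spacing"].values())

print(on_grid(TOKENS))  # True
```

Automating checks like this is one way a design system enforces consistency at scale: a token that drifts off the grid is caught mechanically rather than by a reviewer's eye.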


Chapter 10: Usability Evaluation Methods

10.1 Overview

Usability evaluation assesses how well a product supports users in achieving their goals effectively, efficiently, and satisfactorily. Evaluation methods fall into two broad categories:

  • Inspection methods: experts examine the interface against established principles (no users required). These include heuristic evaluation and cognitive walkthrough.
  • User testing methods: real users attempt tasks while researchers observe. These include think-aloud testing, A/B testing, and remote usability testing.

Inspection methods are faster and cheaper; user testing methods reveal problems that experts miss (and vice versa). A thorough evaluation strategy employs both.

10.2 Heuristic Evaluation

Heuristic evaluation, developed by Jakob Nielsen and Rolf Molich in 1990, involves having a small number of evaluators independently examine an interface for violations of recognized usability principles (heuristics). Nielsen’s ten heuristics are:

  1. Visibility of system status: the system keeps users informed about what is happening through timely feedback.
  2. Match between system and the real world: the system uses language, concepts, and conventions familiar to the user rather than system-oriented jargon.
  3. User control and freedom: users need clearly marked emergency exits (undo, redo, cancel) to recover from mistakes.
  4. Consistency and standards: the same words, actions, and situations should mean the same things throughout.
  5. Error prevention: even better than good error messages is preventing errors from occurring in the first place through constraints and confirmations.
  6. Recognition rather than recall: make objects, actions, and options visible so users do not have to remember information from one step to another.
  7. Flexibility and efficiency of use: shortcuts and customization allow experienced users to work faster without burdening novices.
  8. Aesthetic and minimalist design: interfaces should not contain irrelevant or rarely needed information; every extra element competes with relevant information.
  9. Help users recognize, diagnose, and recover from errors: error messages should be expressed in plain language, precisely indicate the problem, and suggest a solution.
  10. Help and documentation: while it is better if the system is usable without documentation, it may be necessary to provide help that is easy to search, focused on tasks, and concise.

The evaluation process works as follows: 3–5 evaluators independently walk through the interface, noting each heuristic violation they find. They rate each violation on a severity scale from 0 (not a usability problem) to 4 (usability catastrophe). After independent review, evaluators compare findings. Using multiple evaluators is essential because no single evaluator finds all problems — Nielsen’s research shows that one evaluator typically finds only 35% of usability issues, while five evaluators collectively find approximately 75%.
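Nielsen and Landauer modeled cumulative problem discovery with a simple saturation curve, Found(n) = 1 − (1 − λ)^n, where λ is the average proportion of problems a single evaluator finds. The sketch below plugs in λ = 0.35 from the figure above; note that the idealized curve predicts about 88% for five evaluators, slightly above the empirical ~75%, because real problems differ in how easy they are to discover.

```python
def proportion_found(n_evaluators, lam=0.35):
    """Nielsen–Landauer model: proportion of usability problems found by
    n independent evaluators, each finding a fraction lam on average."""
    return 1 - (1 - lam) ** n_evaluators

for n in (1, 3, 5):
    print(n, round(proportion_found(n), 2))
# 1 0.35
# 3 0.73
# 5 0.88  (idealized model; empirical aggregates are closer to 0.75)
```

The curve's diminishing returns are the practical argument for stopping at 3–5 evaluators: each additional evaluator rediscovers mostly known problems.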

10.3 Cognitive Walkthrough

A cognitive walkthrough evaluates the learnability of an interface by simulating a new user’s thought process as they attempt a task for the first time. Unlike heuristic evaluation, which examines the entire interface broadly, cognitive walkthrough focuses on specific task flows step by step.

For each action in a task sequence, the evaluator asks four questions:

  1. Will the user try to achieve the right effect? — Does the user understand what they need to do at this step?
  2. Will the user notice that the correct action is available? — Is the right control visible and recognizable?
  3. Will the user associate the correct action with the desired effect? — Does the label or icon clearly communicate what will happen?
  4. If the correct action is performed, will the user see that progress is being made? — Does the system provide appropriate feedback?

A “no” answer to any question identifies a potential learnability problem. Cognitive walkthroughs are particularly valuable for evaluating walk-up-and-use systems (kiosks, consumer apps) where users are unlikely to read manuals and must figure things out by exploration.

10.4 Usability Testing

Usability testing involves observing real users as they attempt to complete tasks using the product (or prototype). The standard setup includes:

  • Facilitator: guides the session, presents tasks, and asks follow-up questions without leading the participant.
  • Participant: a representative user who has not been involved in the design process.
  • Observer(s): team members who watch and take notes, either in the same room or via screen sharing.

Task scenarios are the backbone of usability testing. A good task scenario is realistic, specific, and does not reveal the solution:

  • Bad: “Find the Settings page.” (tells the user what to click)
  • Good: “You want to change your notification preferences. How would you do that?” (describes a goal without prescribing the path)

During the session, the think-aloud protocol asks participants to verbalize their thoughts as they work: “I’m looking for something that says notifications… I see this gear icon, maybe that’s settings… I’ll tap that.” Think-aloud data reveals not just what users do but why they do it, exposing confusions, expectations, and mental models that observation alone would miss.

After testing, the team compiles findings, prioritizes issues by severity and frequency, and iterates on the design. Even with paper prototypes, usability testing yields valuable insights. The Mozilla case study, documented by Susan Farrell at Nielsen Norman Group, illustrates this vividly. Designer Crystal Beasley was tasked with redesigning the Firefox Support homepage, which had devolved into a wall of 30+ confusing links that drove frustrated users to the forums. Rather than coding alternatives, Beasley printed OmniGraffle mockups on tabloid paper and ran seven design iterations in just two weeks with real Firefox users. The paper testing revealed that users loved large icons but found certain labels confusing, and that mixing high-fidelity elements (polished icons) with low-fidelity backgrounds created visual hierarchy problems that distracted users from the navigation structure being tested. The key lesson: frequent small iterations beat infrequent large overhauls, and consistent fidelity across a prototype prevents artifacts from distorting the findings.

10.5 Experimental and Survey Methods in HCI Research

Beyond formative evaluation, HCI researchers employ rigorous experimental and survey methodologies for summative evaluation and academic research.

Experimental research in HCI typically involves manipulating one or more independent variables (e.g., interface design, input method, feedback type) and measuring their effect on dependent variables (e.g., task completion time, error rate, user satisfaction). Experiments can use within-subjects designs (every participant experiences all conditions, controlling for individual differences but introducing order effects) or between-subjects designs (each participant experiences only one condition, avoiding order effects but requiring more participants to achieve statistical power).
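Order effects in within-subjects designs are typically controlled by counterbalancing the order of conditions across participants. A standard tool is the balanced Latin square, in which every condition appears in every serial position equally often. A minimal construction (a common textbook recipe, shown here for an even number of conditions) might look like this:

```python
def balanced_latin_square(n):
    """Return n condition orderings (one per participant group) such that
    each of n conditions appears exactly once in each serial position.
    For even n, immediate carryover effects are also balanced."""
    rows = []
    for p in range(n):
        row = []
        for i in range(n):
            if i % 2 == 1:
                row.append((p + (i + 1) // 2) % n)  # step forward
            else:
                row.append((p - i // 2) % n)        # step backward
        rows.append(row)
    return rows

# Four interface conditions (0-3) across four participant groups:
for order in balanced_latin_square(4):
    print(order)
# [0, 1, 3, 2]
# [1, 2, 0, 3]
# [2, 3, 1, 0]
# [3, 0, 2, 1]
```

Reading down any column, each condition occupies each position exactly once, so no condition systematically benefits from being seen first (practice) or last (fatigue).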

Survey research in HCI gathers self-reported data about user attitudes, experiences, and demographics at scale. Müller, Sedley, and Ferrall-Nunge describe survey response as a four-step cognitive process, following Tourangeau’s model: (1) comprehend the question, (2) retrieve relevant information from memory, (3) judge the retrieved information for relevance and accuracy, and (4) map the judgment onto the response scale. Poorly worded questions break at step 1; questions about infrequent events break at step 2; social desirability bias enters at step 3; ambiguous scales break at step 4.

For populations larger than 20,000, a sample of approximately 384 respondents achieves 95% confidence with a 5% margin of error. However, survey research is strongest when paired with qualitative methods: surveys quantify the “what” and “how much,” while interviews and observations explain the “why.” Müller and colleagues emphasize that HCI surveys should be validated (ensuring questions measure what they claim to measure) and reliable (producing consistent results across administrations). Probability sampling — where every member of the population has a known chance of selection — remains the gold standard, though convenience samples are common in practice.
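The 384 figure falls out of the standard sample-size formula for estimating a proportion, n = z² · p(1 − p) / e², with z = 1.96 (95% confidence), the conservative worst case p = 0.5, and margin of error e = 0.05. A quick check:

```python
import math

def sample_size(z=1.96, p=0.5, margin=0.05):
    """Sample size needed to estimate a proportion p within +/- margin
    at the confidence level implied by z (1.96 for 95% confidence)."""
    return (z ** 2) * p * (1 - p) / margin ** 2

n = sample_size()
print(round(n, 2), math.ceil(n))  # 384.16 385
```

The raw value is 384.16; practitioners quote it as 384 or round up to 385. Because the formula is insensitive to population size once the population is large, the same sample suffices for 20,000 users or 20 million.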


Chapter 11: Visual Design Principles

11.1 The Role of Visual Design in HCI

Visual design is not decoration — it is a communication system. Every visual choice (color, typography, spacing, alignment) conveys meaning, establishes hierarchy, and guides the user’s eye. A visually well-designed interface reduces cognitive load by making structure and relationships immediately apparent; a visually cluttered or inconsistent interface forces users to work harder to parse information.

11.2 Typographic Hierarchy

Typography is the most information-dense visual channel in most interfaces, because the majority of content is text. Establishing a clear typographic hierarchy — visually distinguishing headings, subheadings, body text, captions, and labels — allows users to scan a page and grasp its structure at a glance.

Key principles of typographic hierarchy:

  • Size contrast: larger text draws the eye first. A heading should be noticeably larger than body text (a ratio of at least 1.5× is common).
  • Weight contrast: bold text stands out from regular weight.
  • Color contrast: darker or more saturated text against a lighter background signals importance. Gray text signals secondary content.
  • Spacing: generous whitespace above a heading signals the start of a new section. Tighter spacing between a heading and its body text signals that they belong together (proximity principle).

11.3 Color

Color serves multiple functions in interface design: establishing brand identity, creating visual hierarchy, communicating state (error, success, warning), and grouping related elements. Effective color use follows several guidelines:

  • Start with a neutral palette (grays) for the majority of the interface, and reserve color for elements that need to stand out: primary actions, links, alerts, and data visualizations.
  • Use one primary color consistently for interactive elements (buttons, links, toggles). This teaches users that “blue means clickable” (or whatever the primary color is).
  • Ensure sufficient contrast between text and background. The Web Content Accessibility Guidelines (WCAG) require a contrast ratio of at least 4.5:1 for normal text and 3:1 for large text.
  • Do not rely on color alone to convey information. Approximately 8% of men have some form of color vision deficiency. Use icons, labels, or patterns as redundant signals.
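WCAG defines the contrast ratio as (L1 + 0.05) / (L2 + 0.05), where L1 and L2 are the relative luminances of the lighter and darker colors. The sketch below implements the WCAG 2 relative-luminance formula for 8-bit sRGB colors:

```python
def relative_luminance(rgb):
    """WCAG 2 relative luminance for an 8-bit sRGB color (r, g, b)."""
    def linearize(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05), from 1 to 21."""
    hi, lo = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0 (the maximum)
print(contrast_ratio((118, 118, 118), (255, 255, 255)) >= 4.5)  # True: #767676 on white just passes AA
```

Running a check like this over a palette is how accessibility tooling flags failing text/background pairs before they ship.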

11.4 Layout and the Grid

A grid system provides a structural framework for arranging elements on a page. Grids enforce alignment and consistency, making layouts feel orderly and professional. The 8-point grid (all spacing and sizing values are multiples of 8 pixels) is widely used in digital design because it produces clean layouts that scale well across screen densities.

Grid-based layouts exploit the Gestalt principle of continuity: elements aligned along the same axis are perceived as related. Breaking the grid deliberately — making one element larger, misaligned, or differently colored — draws attention to that element, creating visual emphasis.

11.5 Erik Kennedy’s Rules for UI Design

Erik Kennedy’s “7 Rules for Creating Gorgeous UI” distills visual design into practical guidelines for developers and designers without formal design training:

  1. Light comes from the sky: shadows fall downward, and elements that appear raised have lighter tops and darker bottoms. Inset elements (text inputs, well panels) should be darker at the top; raised elements (buttons, cards) should be lighter at the top. Consistent virtual lighting makes interfaces feel physically plausible.
  2. Black and white first: design in grayscale before adding color. This forces you to establish hierarchy through spacing, size, and contrast rather than relying on color as a crutch. When adding color, use the HSB (hue, saturation, brightness) model and introduce one accent color thoughtfully.
  3. Double your whitespace: most novice designs use too little whitespace. Generous padding between elements signals polish and professionalism. Whitespace is the default, not filler — when in doubt, add more.
  4. Learn the methods of overlaying text on images: when placing text over photographs, use techniques like translucent overlays, text-in-a-box, background blur, floor fade gradients, or scrim gradients to ensure legibility.
  5. Make text pop and un-pop: combine “up-pop” techniques (larger size, bold weight, higher contrast) with “down-pop” techniques (smaller size, lighter weight, lower contrast) to create hierarchy. Only page titles should go full up-pop; everything else should be calibrated relative to its importance.
  6. Use only good fonts: stick to clean, well-crafted typefaces — system fonts (San Francisco, Segoe UI, Roboto) or curated selections from Google Fonts and FontShare. A small number of well-paired typefaces is better than many fonts used inconsistently.
  7. Steal like an artist: study interfaces on Dribbble, Mobbin, and Layers; analyze what makes them work; and adapt those patterns to your own designs. Originality in UI design is less important than appropriateness and consistency.

Chapter 12: Collaborative and Ubiquitous Computing

12.1 Computer-Supported Cooperative Work

Computer-Supported Cooperative Work (CSCW) studies how people work together with the support of computing technology. The field examines groupware, workflow systems, social media platforms, and any technology that mediates collaboration.

A foundational framework in CSCW is the time-space matrix, which classifies collaborative technologies along two dimensions:

  • Same time, same place: face-to-face interaction (shared displays, electronic whiteboards)
  • Same time, different place: synchronous distributed (video conferencing, shared editors)
  • Different time, same place: asynchronous co-located (shared physical bulletin boards, shift handoff logs)
  • Different time, different place: asynchronous distributed (email, wikis, version control)

Each quadrant poses different design challenges. Synchronous tools, especially distributed ones, must support awareness (the ability to monitor what collaborators are doing) without disrupting individual work; at a distance, the natural awareness of a shared room is lost and must be designed back in. Asynchronous distributed tools must handle coordination (managing dependencies between tasks) and communication (conveying context that is lost without real-time interaction).

Key CSCW concepts include:

  • Awareness: knowing who is present, what they are doing, and what they have done. Shared cursors in Google Docs and presence indicators in Slack are awareness mechanisms. Research in CSCW has shifted from surveillance-style activity monitoring toward reciprocal, consensual awareness — systems that let collaborators make themselves visible rather than being tracked.
  • Common ground: the shared knowledge, beliefs, and assumptions that enable effective communication. Technology-mediated communication often disrupts common ground by removing nonverbal cues, leading to misunderstandings.
  • Social translucence: Erickson and Kellogg’s principle that collaborative systems should make social information visible (who is present, who did what) to enable social norms and accountability.

A critical finding from CSCW ethnographic research is Lucy Suchman’s work at Xerox PARC, which showed that formal workflow systems often fail because real group work is largely exception-handling and improvisation, not the rational sequential processes that workflow models assume. Suchman’s concept of situated action holds that plans are resources for action, not deterministic scripts — people adapt moment-by-moment to the evolving situation rather than following pre-determined sequences. This challenged the dominant rationalist paradigm in system design and remains influential in how HCI researchers think about the gap between designed workflows and actual practice.

Two related theoretical frameworks from CSCW enrich the understanding of collaborative systems:

  • Distributed cognition, developed by Edwin Hutchins, argues that cognition spans people, artifacts, and environments — not just individual minds. In Hutchins’ study of ship navigation, no single crew member held enough information to navigate the ship; cognition was distributed across instruments, charts, and the social coordination of the crew. For CSCW design, this means that groupware should support the entire cognitive system, not just individual users.
  • Activity theory models human activity as mediated by tools within a social and cultural context. Rather than reducing interaction to stimulus-response pairs, activity theory examines the hierarchy of activities (driven by motives), actions (driven by goals), and operations (driven by conditions). This richer framework helps designers understand why users behave in unexpected ways — their motives and cultural context shape their interaction with tools.

12.2 Ubiquitous Computing

Ubiquitous computing (ubicomp), a vision articulated by Mark Weiser at Xerox PARC in 1991, imagines a world where computing is seamlessly integrated into everyday environments rather than confined to dedicated devices. Weiser’s famous dictum — “The most profound technologies are those that disappear” — envisions computation woven into walls, furniture, clothing, and infrastructure, serving people without demanding attention.

Bill Buxton’s work on the “emerging digital eco-system” extends this vision to multi-device interaction. Buxton uses a heating analogy: early computing was like having a dedicated furnace room — you went to the computer to compute. Ubiquitous computing should be like modern climate control — the service (heating, cooling) is everywhere, but the delivery mechanism is invisible and responds to context. Rather than designing each device in isolation, Buxton argues for designing the ecosystem — the collection of devices, services, and environments that a person moves through during a day. A user might start a task on a phone, continue on a laptop, and finish on a large display; the system should support this fluid movement without forcing the user to manage synchronization manually.

Buxton introduces the term “whereable” computing (as opposed to “wearable”) to emphasize that ubiquity is fundamentally about where and when technology is available, not just that it is available. A camera on a desk serves collaboration; a camera by a door serves check-ins; the same technology in different places serves different purposes. The accumulated complexity of many individually simple devices still overwhelms users — what Buxton calls the “society of appliances” — and the design challenge is ensuring seamless machine-to-machine coordination so that users experience transparency rather than friction.

Design challenges in ubiquitous computing include:

  • Context awareness: systems must sense and respond to the user’s situation (location, activity, social context) without requiring explicit input.
  • Calm technology: Weiser and Brown’s concept of technology that informs without demanding attention — moving between the center and periphery of awareness as needed.
  • Privacy: pervasive sensing raises concerns about surveillance and data collection. Designing for privacy in ubicomp requires transparency, user control, and data minimization.

Chapter 13: Accessible and Inclusive Design

13.1 Universal Design

Universal design is the design of products and environments that are usable by all people, to the greatest extent possible, without the need for adaptation or specialized design. The seven principles of universal design, developed by Ronald Mace and colleagues at North Carolina State University, include equitable use, flexibility in use, simple and intuitive use, perceptible information, tolerance for error, low physical effort, and appropriate size and space for approach and use.

In digital contexts, universal design manifests as accessibility — ensuring that people with disabilities can perceive, understand, navigate, and interact with digital products. The Web Content Accessibility Guidelines (WCAG), maintained by the W3C, provide a comprehensive set of criteria organized under four principles (POUR):

  • Perceivable: information and UI components must be presentable in ways that users can perceive (text alternatives for images, captions for video, sufficient contrast).
  • Operable: UI components and navigation must be operable (keyboard accessibility, sufficient time, no seizure-inducing content).
  • Understandable: information and UI operation must be understandable (readable text, predictable behavior, input assistance).
  • Robust: content must be robust enough to be interpreted by assistive technologies (valid markup, ARIA attributes).
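The "sufficient contrast" criterion under Perceivable can be checked mechanically. Below is a minimal sketch of the WCAG 2.x contrast-ratio computation (relative luminance of each sRGB color, then the ratio of the lighter to the darker); the function names are illustrative, not part of any standard API.

```python
def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an sRGB color with 0-255 channels."""
    def linearize(c):
        c = c / 255
        # Piecewise sRGB-to-linear conversion from the WCAG definition.
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Contrast ratio (L1 + 0.05) / (L2 + 0.05), lighter over darker."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)),
                    reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

# Black on white yields the maximum possible ratio of 21:1; WCAG AA
# requires at least 4.5:1 for normal-size text.
ratio = contrast_ratio((0, 0, 0), (255, 255, 255))  # 21.0
```

Because the ratio always divides the lighter luminance by the darker, swapping foreground and background gives the same result.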

13.2 Designing for Disability as Innovation

Haben Girma, the first deafblind graduate of Harvard Law School, argues that designing for disability drives innovation that benefits everyone. Curb cuts, originally designed for wheelchair users, are used by parents with strollers, travelers with rolling luggage, and delivery workers with hand trucks. Similarly, closed captions, designed for deaf viewers, are used in noisy gyms, quiet libraries, and by language learners.

This principle — that accessibility constraints breed creative solutions with broad applicability — is a recurring theme in HCI. Voice interfaces, developed partly for users with motor disabilities, became mainstream with Siri and Alexa. Autocomplete, developed to reduce typing for users with limited dexterity, saves time for everyone.

The implication for design practice is that accessibility should not be treated as a compliance checklist applied after the “real” design is done, but as a generative constraint that improves design from the outset. Including people with diverse abilities in user research reveals needs and opportunities that able-bodied designers overlook.


Chapter 14: Human-AI Interaction

14.1 The Rise of AI-Generated Content

The Spring 2024 offering of CS 449 focused its project theme on interacting with AIGC (Artificial Intelligence Generated Content) — systems like ChatGPT, GitHub Copilot, DALL-E, and Midjourney that generate text, code, and images from natural-language prompts. These systems have created a new category of interaction design challenges, because the user is no longer merely consuming or manipulating content but co-creating it with an AI partner.

AIGC systems blur the traditional boundary between tool and collaborator. A word processor is a tool — it does exactly what the user commands. A generative AI system is something more: it introduces novelty, surprise, and sometimes error in ways that the user must interpret, evaluate, and refine. This shifts the user’s role from commander to editor, curator, and critic.

14.2 Design Challenges for Human-AI Interaction

Designing effective human-AI interfaces requires addressing several challenges that do not arise in traditional software:

Transparency and explainability: users need to understand, at least approximately, what the AI can and cannot do, and why it produces particular outputs. A chatbot that occasionally fabricates information (“hallucinating”) is dangerous if users trust it uncritically; designing for appropriate trust requires making uncertainty visible and providing mechanisms for verification.

Control and agency: users must feel that they are directing the interaction, not being directed by the AI. This means providing easy ways to steer, constrain, undo, and override AI outputs. Prompt engineering — the art of crafting effective natural-language instructions for AI systems — is itself an interaction design problem: the “interface” is a text box, but the skill required to use it effectively varies enormously.

Mental models: users form mental models of how AI systems work, and these models are often inaccurate. A user who believes ChatGPT “understands” language in the way a human does will have different expectations (and different frustrations) than one who understands it as a statistical pattern matcher. Designing interfaces that foster accurate mental models — without requiring users to understand machine learning — is an open challenge.

Bias and fairness: AI systems trained on historical data may reproduce and amplify biases present in that data. Interface design can mitigate this by making outputs auditable, providing diverse examples, and allowing users to flag problematic outputs.

Creativity support: when AI is used as a creative partner (in writing, visual art, music, or code), the interface must balance inspiration with control. Showing too many AI suggestions can overwhelm; showing too few can underwhelm. The timing, presentation, and granularity of suggestions all affect the user’s creative flow.

14.3 Guidelines for Human-AI Interaction

Microsoft Research’s “Guidelines for Human-AI Interaction,” published by Amershi et al. at CHI 2019, provide 18 design guidelines organized by when they apply during interaction:

Initially: make clear what the system can do; make clear how well the system can do what it can do.

During interaction: time services based on context; show contextually relevant information; match relevant social norms; mitigate social biases.

When wrong: support efficient invocation; support efficient dismissal; support efficient correction; scope services when in doubt; make clear why the system did what it did.

Over time: remember recent interactions; learn from user behavior; update and adapt cautiously; encourage granular feedback; convey the consequences of user actions; provide global controls; notify users about changes.

These guidelines bridge the gap between traditional usability heuristics (designed for deterministic software) and the probabilistic, adaptive nature of AI systems.


Chapter 15: Designing for Trust

15.1 Trust in Digital Products

Trust is the willingness to be vulnerable to the actions of another party based on the expectation that the other will perform actions important to the trustor. In digital products, trust operates at multiple levels: trust in the technology (will it work?), trust in the organization (will they protect my data?), and trust in other users (will they behave honestly?).

Joe Gebbia, co-founder of Airbnb, describes how Airbnb faced a fundamental trust problem: convincing strangers to stay in each other’s homes. Their design solutions included a reputation system (reviews and ratings), verified identity, professional photography, and a secure payment system that held funds in escrow. Each feature addressed a specific dimension of trust — competence, integrity, and benevolence — reducing the perceived risk of the transaction.

The design lesson is that trust is not a binary state but a spectrum that can be influenced by design choices. Progressive disclosure of personal information, clear privacy controls, transparent policies, and social proof (showing that others have successfully used the system) all build trust incrementally.

15.2 The Paradox of Choice

Barry Schwartz’s paradox of choice holds that while some choice is better than none, too much choice leads to decision paralysis, anxiety, and dissatisfaction. In a famous experiment by Iyengar and Lepper, shoppers confronted with 24 jam varieties were far less likely to purchase than those offered 6 varieties — even though the large display attracted more initial interest. Schwartz identifies three damaging effects of excess choice: decision paralysis (people avoid choosing at all), buyer’s regret (more alternatives mean more “what if” scenarios), and reduced satisfaction (imagined better options diminish enjoyment of the chosen one).

Schwartz distinguishes between maximizers — people who exhaustively compare all options seeking the objectively best choice — and satisficers — people who set a “good enough” threshold and select the first option that meets it. Research consistently finds that satisficers are happier with their choices, even though maximizers often make objectively better selections. For interface designers, the lesson is that enabling maximizer behavior (showing all options, facilitating exhaustive comparison) may paradoxically reduce user satisfaction.

For interface design, the paradox of choice argues for:

  • Sensible defaults: pre-selecting the most common option reduces the number of decisions users must make.
  • Progressive disclosure: showing only essential options initially, with advanced options available on demand.
  • Categorization: organizing large option sets into meaningful groups reduces the perceived complexity.
  • Recommendations: highlighting a “recommended” option provides an anchor that simplifies comparison.

The tension between choice and simplicity is a recurring design challenge, particularly in settings screens, e-commerce catalogs, and configuration wizards.


Appendix: Key Frameworks and Methods Reference

A.1 Design Thinking (Stanford d.school Five Stages)

  • Empathize: understand users through observation and engagement. Key activities: interviews, observations, empathy maps.
  • Define: synthesize research into a clear problem statement. Key activities: POV statements, HMW questions, affinity diagrams.
  • Ideate: generate a broad range of creative solutions. Key activities: brainstorming, Crazy 8s, SCAMPER, user stories.
  • Prototype: build tangible representations to explore solutions. Key activities: paper prototypes, wireframes, clickable mockups.
  • Test: gather feedback from users to refine solutions. Key activities: usability testing, heuristic evaluation, cognitive walkthrough.

A.2 Nielsen’s 10 Usability Heuristics

  1. Visibility of system status
  2. Match between system and the real world
  3. User control and freedom
  4. Consistency and standards
  5. Error prevention
  6. Recognition rather than recall
  7. Flexibility and efficiency of use
  8. Aesthetic and minimalist design
  9. Help users recognize, diagnose, and recover from errors
  10. Help and documentation

A.3 Severity Rating Scale for Heuristic Evaluation

  • 0: not a usability problem
  • 1: cosmetic problem — fix only if time permits
  • 2: minor usability problem — low priority
  • 3: major usability problem — high priority
  • 4: usability catastrophe — must be fixed before release

A.4 Cognitive Walkthrough Questions

  1. Will the user try to achieve the right effect?
  2. Will the user notice that the correct action is available?
  3. Will the user associate the correct action with the desired effect?
  4. If the correct action is performed, will the user see that progress is being made?

A.5 AEIOU Observation Framework

  • A (Activities): goal-directed sets of actions
  • E (Environments): physical and social settings
  • I (Interactions): exchanges between people or between people and objects
  • O (Objects): tools, devices, and artifacts
  • U (Users): people, their roles, relationships, and values

A.6 CSCW Time-Space Matrix

  • Same time, same place: face-to-face (electronic whiteboards, shared displays)
  • Same time, different place: synchronous distributed (video conferencing, shared editors)
  • Different time, same place: asynchronous co-located (shift logs, physical bulletin boards)
  • Different time, different place: asynchronous distributed (email, wikis, version control)

A.7 Fitts’s Law

\[ T = a + b \cdot \log_2\left(\frac{D}{W} + 1\right) \]

Where \( T \) is movement time, \( D \) is distance to target, \( W \) is target width, and \( a \), \( b \) are empirically determined constants. Larger and closer targets are faster to select.
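The formula can be evaluated directly as a worked example. The constants a and b below are hypothetical illustrative values; in practice they are fit empirically for a given input device and user population.

```python
import math

def fitts_time(distance, width, a=0.2, b=0.1):
    """Predicted movement time in seconds from Fitts's law
    (Shannon formulation). a and b are hypothetical constants."""
    return a + b * math.log2(distance / width + 1)

# A large, near target has a low index of difficulty and a short
# predicted movement time; a small, far target takes longer.
near_large = fitts_time(distance=100, width=100)  # ID = 1 bit, ~0.3 s
far_small = fitts_time(distance=800, width=20)    # ID ~ 5.4 bits
```

This is why edge- and corner-anchored controls (with effectively infinite width along one axis) and large click targets are fast to acquire.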

A.8 Hick’s Law

\[ T = a + b \cdot \log_2(n + 1) \]

Where \( T \) is decision time and \( n \) is the number of equally probable alternatives. Fewer choices lead to faster decisions.
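As with Fitts's law, the prediction can be computed directly; the constants a and b below are hypothetical and would need to be fit empirically for a real task.

```python
import math

def hick_time(n, a=0.2, b=0.15):
    """Predicted decision time in seconds among n equally likely
    alternatives. a and b are hypothetical constants."""
    return a + b * math.log2(n + 1)

# Trimming a menu from 15 items to 7 cuts the predicted decision time:
t_long = hick_time(15)   # log2(16) = 4, so ~0.8 s
t_short = hick_time(7)   # log2(8) = 3, so ~0.65 s
```

Note the logarithm: decision time grows slowly with the number of options, so halving a menu saves less time than intuition suggests, but the effect compounds across repeated decisions.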

A.9 WCAG POUR Principles

  • Perceivable: information must be presentable in ways users can perceive
  • Operable: UI components must be operable via various input methods
  • Understandable: information and operation must be comprehensible
  • Robust: content must work with current and future assistive technologies

A.10 Norman’s Seven-Stage Action Cycle

  1. Goal (execution): form an intention
  2. Plan (execution): decide on an action strategy
  3. Specify (execution): translate the plan into specific actions
  4. Perform (execution): execute the action
  5. Perceive (evaluation): observe the result
  6. Interpret (evaluation): make sense of the observation
  7. Compare (evaluation): evaluate whether the goal was achieved

Stages 1–4 span the Gulf of Execution; stages 5–7 span the Gulf of Evaluation.

A.11 GOMS Components

  • Goals: what the user wants to accomplish (e.g., delete a word)
  • Operators: elementary actions with standard KLM times (Keypress ~0.2 s, Point ~1.1 s, Mental prep ~1.35 s, Home ~0.4 s)
  • Methods: sequences of operators that accomplish a goal (e.g., double-click to select, then press Delete)
  • Selection rules: criteria for choosing between methods (e.g., if the text is short, use Backspace; if long, select all and press Delete)
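The KLM operator times can be turned into a quick back-of-the-envelope estimate. The sketch below models the delete-a-word method; approximating the two clicks of a double-click as keypress (K) operators is a simplification, and the operator sequence is an assumption, not a measured protocol.

```python
# Standard KLM operator times in seconds, as listed above.
KLM_TIMES = {
    "K": 0.2,   # keypress (used here to approximate a mouse click too)
    "P": 1.1,   # point at a target with the mouse
    "M": 1.35,  # mental preparation
    "H": 0.4,   # home hands between keyboard and mouse
}

def klm_estimate(ops):
    """Sum the KLM operator times for a sequence of operator codes."""
    return sum(KLM_TIMES[op] for op in ops)

# Delete a word via double-click-to-select, then Delete: home to the
# mouse, mentally prepare, point at the word, double-click (K K),
# home back to the keyboard, press Delete.
estimate = klm_estimate(["H", "M", "P", "K", "K", "H", "K"])  # 3.85 s
```

Comparing such estimates across candidate methods is exactly what GOMS selection rules formalize: the model predicts which method is faster before any user testing.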

A.12 Microsoft’s Guidelines for Human-AI Interaction (Amershi et al., 2019)

Eighteen guidelines organized by interaction phase: Initially (set expectations), During (be contextually relevant), When Wrong (support correction), Over Time (learn and adapt). These guidelines extend traditional usability heuristics to accommodate the probabilistic nature of AI systems.
