DAC 305: Digital Game Design and User Experience

Lennart Nacke

Estimated study time: 36 minutes

Sources and References

Primary textbook: Anders Drachen, Pejman Mirza-Babaei, and Lennart E. Nacke (eds.), Games User Research (Oxford University Press, 2018).
Supplementary texts: Jesse Schell, The Art of Game Design; Tracy Fullerton, Game Design Workshop; Steve Krug, Don’t Make Me Think; Mihaly Csikszentmihalyi, Flow.
Online resources: Unity Learn (learn.unity.com); Blender Foundation documentation; GDC Vault; IGDA Games User Research SIG; Jakob Nielsen, “10 Usability Heuristics”.

Chapter 1 — Digital Game Design as a Discipline

Digital game design is the craft of shaping interactive systems so that a human player, moving through a sequence of choices, experiences something meaningful: a challenge, a story, an aesthetic pleasure, a social connection, sometimes all four at once. Unlike film or prose, where the author can fully anticipate the sequence of images a reader will meet, a game designer builds a possibility space. The player’s path is negotiated in real time between the rules the designer wrote and the goals the player brought with them. The object of study in this course is that negotiation and the methods we use to improve it.

Jesse Schell, in The Art of Game Design, famously defines a game as “a problem-solving activity approached with a playful attitude.” Tracy Fullerton, in Game Design Workshop, emphasizes the formal elements that any game must contain: players, objectives, procedures, rules, resources, conflict, boundaries, and outcome. Raph Koster adds that games are, at their core, engines for teaching patterns; fun is the feedback loop of mastery. Each of these definitions is useful because it points to a different lever the designer can pull. When a playtest report says “this level feels boring,” the design answer might be to add conflict (Fullerton), to tighten the reward schedule (Koster), or to reframe the objective so the player sees it as a problem worth solving (Schell).

Digital games layer onto this general theory a set of affordances that paper games could not offer. A computer can calculate physics, hide information, remember state, generate content, measure player behaviour down to the millisecond, and present stimuli synchronized to sub-frame precision. It can also get things wrong in ways a board game cannot: frame drops, input lag, save corruption, broken quests. The designer of a digital game therefore inherits both the expressive freedom of software and the fragility of software, and the discipline of games user research (GUR) emerged in part because that fragility matters for the player’s felt experience.

This course treats digital game design as an iterative, evidence-based discipline rather than a purely authorial one. The designer begins with a hypothesis about what will be fun, prototypes the smallest testable version of it, puts it in front of humans, watches what happens, and revises. The authored voice remains — games are still creative works — but it is tempered by observation. You will spend as much time learning how to watch a player as how to script an enemy.

Chapter 2 — The 2D Game: Mechanics, Loops, Feedback

A two-dimensional game is the right starting point for a design course because it forces every decision to be visible. There is no camera angle to hide behind, no particle system to mask ambiguity. If the jump feels bad, the jump is bad, and you must fix the numbers. Flappy Bird, which the course uses as its reference case, is instructive precisely because it is so small. One button, one hazard, one scoring rule, and yet it held a global audience for weeks because the moment-to-moment interaction was tuned to the threshold of frustration and mastery.

The vocabulary for discussing that tuning starts with the word mechanic. A mechanic is a single rule plus the verb it gives the player: tap to flap, hold to charge, release to fire. Mechanics compose into systems when they interact: flapping gains altitude, gravity removes it, the pipes scroll, and the interaction of those three simple rules creates a difficulty curve nobody explicitly authored. Designers speak of emergent gameplay when system interactions produce situations the designer never scripted; a good 2D game often has more emergence than its scale suggests.
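
The way these three rules compose can be sketched in a few lines of Python; all constants here are illustrative placeholders, not Flappy Bird’s actual tuning:

```python
# Minimal sketch of three interacting rules: flap impulse, gravity,
# and scrolling pipes. All numbers are illustrative placeholders.
GRAVITY = -0.5      # downward velocity gained per tick
FLAP = 4.0          # upward velocity set by a tap
SCROLL = 1.0        # horizontal pipe speed per tick

def step(state, tapped):
    """Advance the game by one tick; returns the new state."""
    vy = state["vy"] + GRAVITY
    if tapped:
        vy = FLAP                    # a tap replaces, not adds, velocity
    y = state["y"] + vy
    pipe_x = state["pipe_x"] - SCROLL
    score = state["score"]
    if pipe_x < 0:                   # pipe cleared: score it, recycle it
        score += 1
        pipe_x = 30.0
    return {"y": y, "vy": vy, "pipe_x": pipe_x, "score": score}

state = {"y": 10.0, "vy": 0.0, "pipe_x": 30.0, "score": 0}
for tick in range(60):
    state = step(state, tapped=(tick % 8 == 0))  # tap every 8 ticks
```

Even this toy version shows the emergence the paragraph describes: how often the player must tap is decided by the interaction of FLAP and GRAVITY, neither of which encodes difficulty directly.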

Around the mechanic sits the core loop, the smallest cycle of action and reward the player repeats. In Flappy Bird the loop is flap, clear pipe, see score increment. In a platformer it might be see hazard, plan route, execute, land safely. In a roguelike it is run, die, unlock, run again. The designer’s job is to make this loop intrinsically satisfying — each iteration should feel complete — and also to stack secondary loops on top of it that pay out on longer time scales: a level’s worth of runs, a session’s worth of levels, a week’s worth of sessions. Schell calls this the problem of nested rewards. Fullerton frames it as balancing short-term and long-term goals.

Feedback is what holds loops together. When the player taps the button, the bird must respond within one or two frames, with sound, animation, and motion that together assert “you did that.” Steve Swink’s Game Feel analyzes this total sensory return on an input; designers informally call it the game’s “juice.” Too little juice and the game feels sluggish or uncertain; too much and the player is bludgeoned. The designer learns to tune input latency, animation anticipation (the wind-up before an action), follow-through (the decay after), camera shake, and sound envelope as a single gestalt. In playtests, the single most common complaint about early prototypes is simply that “something is off about the movement,” which almost always decomposes into one of these variables.

A useful taxonomy of 2D game feel comes from asking four questions of every action. Is the input read at the right moment? Is the response immediate? Is the response legible — can the player see it? Is the response proportional — does a bigger input give a bigger result? If any of these four fail, the mechanic will feel broken even if it is technically correct. This is the scaffold you will bring to Checkpoint 1 of the course, where you analyze a published game’s core mechanic.

Chapter 3 — Game Engines and the Unity Ecosystem

A modern game engine is a collection of runtime services — rendering, physics, audio, input, asset streaming, scripting — bundled with an editor that lets a designer arrange scenes visually. Unity, Unreal, Godot, and GameMaker are the engines most students encounter. Unity is this course’s platform because its 2D tooling is mature, its scripting language (C#) is approachable, and its Asset Store and Learn portal make it easy to assemble a working prototype in hours rather than weeks.

A Unity project is organized around three mental models you must internalize early. First, the scene: a container of objects representing a level, menu, or self-contained chunk of the game. Second, the GameObject: a node in the scene that has no behaviour of its own but can carry components that give it behaviour — a Transform for position, a SpriteRenderer for a 2D image, a Rigidbody2D for physics, a Collider2D for collision volume, an AudioSource for sound, and custom MonoBehaviour scripts for anything you write yourself. Third, the prefab: a reusable template of a GameObject plus components, which you can instance many times and update centrally. When a designer says “the enemy prefab,” they mean the canonical template, changes to which propagate to every enemy in the game.
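
The GameObject-plus-components model can be mimicked in a few lines of Python; the class names echo Unity’s, but this is a conceptual sketch, not Unity code:

```python
class Transform:
    """Position component (Unity's Transform holds much more)."""
    def __init__(self, x=0.0, y=0.0):
        self.x, self.y = x, y

class SpriteRenderer:
    """Component that points at a 2D image."""
    def __init__(self, sprite):
        self.sprite = sprite

class GameObject:
    """A bare node: no behaviour of its own, only its components."""
    def __init__(self, name):
        self.name = name
        self._components = {}

    def add(self, component):
        self._components[type(component)] = component
        return component

    def get(self, component_type):
        return self._components.get(component_type)

def enemy_prefab():
    """A 'prefab' here is just a factory for the canonical template."""
    enemy = GameObject("Enemy")
    enemy.add(Transform())
    enemy.add(SpriteRenderer("enemy.png"))
    return enemy

wave = [enemy_prefab() for _ in range(3)]   # three instances of the template
```

The point of the pattern is that behaviour lives in components, so a prefab reduces to a single place where the template is defined and every instance can be stamped out from it.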

The Unity game loop is important conceptually even if you never write engine code. Each rendered frame, Unity first runs the fixed-timestep FixedUpdate for physics (zero, one, or several times, depending on how long the previous frame took), then calls Update on every active script, then LateUpdate for things that should happen after everything else has moved (like cameras following a player). Understanding this ordering prevents a whole class of bugs: if you move a character in Update but check collisions in FixedUpdate, you can get jitter; if you move a camera before the player moves, the camera will lag by a frame. The Unity Learn tutorials walk through these concepts in small playable examples, and Mike Geig’s Sams Teach Yourself Unity in 24 Hours gives a similarly paced tour.
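
The ordering can be illustrated with a generic fixed-timestep loop in Python; this is the standard game-loop pattern, not Unity’s actual scheduler:

```python
FIXED_DT = 0.02   # Unity's default physics timestep is 20 ms

def run_frame(world, frame_dt):
    """One rendered frame: fixed physics steps, then Update, then LateUpdate."""
    # Zero or more fixed steps, depending on how long the frame took.
    world["accumulator"] += frame_dt
    while world["accumulator"] >= FIXED_DT:
        world["physics_steps"] += 1          # FixedUpdate: physics tick
        world["accumulator"] -= FIXED_DT
    world["player_x"] += 1.0                 # Update: move the player
    world["camera_x"] = world["player_x"]    # LateUpdate: camera follows
    world["frames"] += 1

world = {"accumulator": 0.0, "physics_steps": 0,
         "player_x": 0.0, "camera_x": 0.0, "frames": 0}
for _ in range(3):
    run_frame(world, frame_dt=0.05)   # a slow 50 ms frame: 2-3 physics steps
```

Because the camera copies the player’s position after the player has moved, it never lags; swap the two lines and the one-frame camera lag described above appears immediately.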

For 2D work specifically, Unity offers a Tilemap system for grid-based levels, a Sprite Renderer with layer ordering and sorting groups, a 2D Physics module with its own Rigidbody2D and Collider2D family, and an Animator controller for state-machine-driven sprite animation. Student teams typically underestimate how much of a 2D game can be built without a single hand-written line of movement code, simply by configuring these components. That is deliberate: Unity rewards the designer who treats code as a last resort, used for the irreducible logic, and configuration as the default.

A mature Unity workflow also includes source control from day one. The default choice is git with the Unity-specific .gitignore and Git LFS for binary assets. Teams that skip this in week one spend week ten recovering corrupted scenes. The IGDA and Unity Learn both emphasize version control as a non-negotiable pillar of any shared project.

Chapter 4 — 3D Assets in Blender: A Pipeline for Unity

Even a 2D game often benefits from a 3D asset pipeline, because 3D lets you render sprites from multiple angles, bake lighting into textures, and iterate on a character’s silhouette without redrawing every frame by hand. Blender is the course’s 3D tool of choice because it is free, open source, exceptionally well documented, and because its exporters to Unity are robust. The Blender Foundation’s official manual and the LinkedIn Learning Blender Essential Training courses are the right places to learn the tool at depth; what follows is the mental model you need to make good decisions about how to prepare assets for a game.

A usable Blender-to-Unity pipeline has five stages. The first is modeling, where you block out the shape of the object in low polygon counts, focusing on silhouette and proportion before detail. Schell’s lens of the silhouette is relevant here even in 3D: if the player cannot recognize the object from its outline alone, add nothing until that is fixed. The second stage is UV unwrapping, where you flatten the 3D surface into a 2D layout so a texture can be painted onto it. Good unwraps waste little texture space and avoid seams in visible places. The third stage is texturing, usually via image textures in Blender’s shader editor or via Substance-style procedural workflows. For game use, you want textures at power-of-two resolutions (256, 512, 1024) and in formats Unity can compress well.
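
Both texture constraints are easy to sanity-check in an asset-validation script; these helper names are hypothetical, not part of Blender or Unity:

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... using the classic bit trick."""
    return n > 0 and (n & (n - 1)) == 0

def texture_bytes(width, height, bytes_per_pixel=4):
    """Uncompressed RGBA footprint; runtime compression will shrink this,
    but the uncompressed size is a useful upper bound when budgeting."""
    return width * height * bytes_per_pixel

# A 1024x1024 RGBA texture costs 4 MiB before compression.
budget_ok = is_power_of_two(1024) and texture_bytes(1024, 1024) <= 8 * 2**20
```

A check like this belongs in the export step, so a non-power-of-two texture is caught before it silently degrades compression in Unity’s import settings.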

The fourth stage is rigging and animation, where a skeleton of bones is bound to the mesh so the character can move. Game rigs are simpler than film rigs: fewer bones, no complex muscle systems, and a strict eye toward keyframe count and blend shapes because every animation will be played by the runtime. The fifth stage is export, typically via FBX or the newer glTF 2.0 format. Unity imports both, but FBX is still the industry default for skinned meshes with animation. Before exporting, you apply transforms (so the object’s local origin is sensible), check the scale (Unity’s unit is 1 metre), and verify that the normals face outward. When the asset arrives in Unity it should drop into a scene and simply work.

The discipline to learn in this chapter is pipeline thinking: a decision made in Blender will ripple into Unity’s import settings, the runtime’s draw calls, and ultimately the frame rate on a player’s device. A high-poly hero model is not a virtue if it costs the player ten frames per second. A designer who understands the whole pipeline asks, at each stage, what the minimum asset is that still reads correctly on screen. This restraint is more important for playability than any amount of visual polish.

Chapter 5 — Iterative Design and Prototyping

Fullerton, in Game Design Workshop, organizes the whole craft of game design around a single loop: set goals, generate ideas, formalize, test, evaluate, revise. The loop is deliberately small. You are not supposed to run it once a semester; you are supposed to run it every few days, or every few hours once a prototype exists. Each iteration answers one or two specific questions about the design, and the questions sharpen as the prototype matures.

Prototypes come in tiers. A paper prototype uses index cards and dice to rehearse rules before any code is written; it is cheap, fast, and devastatingly effective for catching broken resource economies and ambiguous rules. A digital greybox is a Unity scene with blocky placeholder art that implements one mechanic end-to-end; its job is to answer “does this feel good to do?” A vertical slice is a single polished level that looks and plays the way the final game will, used to align the team and convince external stakeholders. A horizontal prototype is a wide, shallow sketch of all the features at once, useful for checking whether the whole game fits together. Different questions want different prototype tiers, and a common student error is to over-polish a greybox when a paper prototype would have sufficed.

The governing ideas behind iteration are the cost of change and the value of information. Every design decision gets more expensive to revise as the project matures, so you want the big decisions — core mechanic, core loop, target audience — to be stress-tested with cheap prototypes before they get locked in. Information is most valuable early, when it can still redirect the project. Mirza-Babaei and colleagues in Games User Research call this the “early and often” principle; Schell talks about the danger of falling in love with your first idea; Fullerton frames it as protecting the player’s experience from the designer’s ego. All three are making the same point: the iteration loop exists to surface evidence that might contradict you, and it only works if you run it before you have committed too much to hear the answer.

Chapter 6 — The Player Experience: Flow, Immersion, Engagement

The design work of earlier chapters aims at something — a state of mind in the player. The most influential frame for that state is Mihaly Csikszentmihalyi’s flow, introduced in his 1990 book Flow: The Psychology of Optimal Experience. Flow is the experience of being fully absorbed in an activity that matches your skill to its challenge: time distorts, self-consciousness recedes, and action feels effortless. Csikszentmihalyi identified several conditions — clear goals, immediate feedback, a balance between challenge and skill, a sense of control, and the merging of action and awareness — which map almost one-to-one onto good game design principles. When a game tunes its difficulty curve, what it is doing in flow terms is keeping the player inside the narrow channel where the challenge rises just fast enough to meet their growing skill.

Flow is not the whole story. Jenova Chen’s master’s thesis on dynamic difficulty, and Sweetser and Wyeth’s GameFlow model, adapted Csikszentmihalyi’s criteria specifically for video games and added player-centric factors like social interaction and the need for clear, unambiguous controls. Other experiential frames matter too. Immersion, as theorized by Brown and Cairns, unfolds in three stages — engagement, engrossment, total immersion — distinguished by how much external awareness the player retains. Presence, a construct borrowed from VR research, measures the feeling of “being there” in the game’s world. Engagement, a looser umbrella term, covers sustained attention across sessions and is often operationalized via retention metrics in industry.

The practical upshot is that a designer should not ask “is my game fun” but rather a stack of more specific questions. Does the moment-to-moment control give immediate feedback? Does the challenge curve respect the player’s growing skill? Do the goals at every scale — action, encounter, level, campaign — feel clear and worth pursuing? Is the player pulled in by the world’s fiction, the mechanics’ depth, or the social frame, and is the game giving each of those channels enough support? Games User Research calls the composite answer the player experience (PX), and the rest of this course is about measuring it.

Chapter 7 — What Is Games User Research?

Games user research is the systematic study of how players interact with games, undertaken to inform design and development decisions. Drachen, Mirza-Babaei, and Nacke, in the opening chapter of Games User Research, frame the field as sitting at the intersection of human-computer interaction (HCI), user experience research, psychology, and game design. GUR is not QA: a QA tester asks whether the game crashes, while a GUR researcher asks whether the player understood the tutorial. GUR is not analytics alone either: dashboards full of telemetry are one input, but they are almost useless without qualitative context to explain why the numbers moved.

The core commitment of GUR is that a game’s quality is not an intrinsic property of the code but an emergent property of the encounter between code and player. You therefore cannot judge a game by looking at it; you must watch it being played, by people who represent the intended audience, under conditions resembling real play. The closer the study conditions come to real play, the more the observations generalize; the more controlled the conditions, the more precisely you can isolate a single variable. Every GUR method is a trade-off along this ecological-validity versus experimental-control axis.

A mature GUR practice distinguishes between formative and summative evaluation. Formative studies happen during development and aim to improve the game. Their output is a prioritized list of issues the team can fix before ship. Summative studies happen near or after release and aim to judge the game, often against benchmarks, business goals, or academic research questions. The methods can overlap, but the questions you ask and the stakeholders you report to differ considerably. A student project in this course will almost always be doing formative GUR: you are trying to make a specific mechanic better, not deliver a final verdict on its merit.

The field is professionally organized through the IGDA Games User Research Special Interest Group and the annual GUR Summit that precedes GDC. Its methodological standards draw heavily from HCI (Nielsen, Dumas and Redish) and experimental psychology, while its domain knowledge comes from the game industry. A student entering GUR is entering a community with real norms: pre-registration of hypotheses where appropriate, informed consent, ethical review, transparent reporting, and the humility to say “we don’t know yet” when the data does not support a strong conclusion.

Chapter 8 — Quantitative Methods in GUR: Telemetry and Metrics

Quantitative GUR treats the game as an instrument that records its own use. Every button press, state change, death, purchase, and quit becomes a row in a database; analysis extracts patterns from millions of such rows. Drachen and colleagues devote several chapters of Games User Research to this subfield, which they call game analytics or telemetry-based research.

The workflow begins with instrumentation: deciding what events the game will log. A good event schema captures enough to reconstruct the important parts of a session — which level, when, for how long, with what outcome, along with any relevant state — without drowning the pipeline in noise. Common event types include session start and stop, level enter and exit, death and respawn, item pickup, economy transactions, tutorial step completion, and user-initiated settings changes. For each event you record a timestamp, a player identifier, and a small payload of context. The payload design is where quantitative studies succeed or fail: a well-designed payload lets you slice the data later in ways you did not anticipate.
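
A minimal event record along these lines might look as follows in Python; the field names and event types are illustrative, not a standard schema:

```python
import json
import time
from dataclasses import dataclass, field, asdict

@dataclass
class TelemetryEvent:
    """One row of telemetry: what happened, to whom, when, with what context."""
    event_type: str            # e.g. "level_enter", "death", "item_pickup"
    player_id: str
    timestamp: float = field(default_factory=time.time)
    payload: dict = field(default_factory=dict)

def log_event(sink, event):
    """Append one event as a JSON line; a real pipeline batches and uploads."""
    sink.append(json.dumps(asdict(event)))

sink = []
log_event(sink, TelemetryEvent("level_enter", "p42", payload={"level": 3}))
log_event(sink, TelemetryEvent("death", "p42",
                               payload={"level": 3, "cause": "spikes"}))
```

Note that the payload carries the context ("which level, with what outcome") while the fixed fields stay identical for every event type; that split is what lets you slice the data later in ways you did not anticipate.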

Once instrumentation is in place the data is aggregated into metrics. Industry uses a standard vocabulary — daily active users, session length, retention (day-1, day-7, day-30), average revenue per user, completion rate, churn rate — and these are fine for monitoring a live game, but for research you usually want more specific metrics tied to a hypothesis. For a single mechanic under study, good metrics might be median time-to-first-success, success rate per attempt, attempts before abandonment, and the distribution of failure reasons. The key methodological point from Drachen is that metrics answer what and how often but almost never why; to answer why, you need qualitative follow-up.
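
Given a list of attempt records, the mechanic-level metrics reduce to a few lines; the record shape and data here are assumptions for illustration:

```python
from statistics import median

# Each record is one attempt at the mechanic under study (illustrative data).
attempts = [
    {"player": "p1", "duration": 12.0, "success": False},
    {"player": "p1", "duration": 9.5,  "success": True},
    {"player": "p2", "duration": 30.0, "success": True},
    {"player": "p3", "duration": 14.0, "success": False},
    {"player": "p3", "duration": 11.0, "success": False},
]

def success_rate(rows):
    """Share of attempts that succeeded."""
    return sum(r["success"] for r in rows) / len(rows)

def time_to_first_success(rows, player):
    """Total play time up to and including the first successful attempt."""
    elapsed = 0.0
    for r in rows:
        if r["player"] != player:
            continue
        elapsed += r["duration"]
        if r["success"]:
            return elapsed
    return None   # never succeeded: report these players separately

times = [t for p in ("p1", "p2", "p3")
         if (t := time_to_first_success(attempts, p)) is not None]
median_ttfs = median(times)   # players who never succeed are excluded here
```

The `None` branch matters: players who abandoned without ever succeeding are exactly the survivorship-bias cases, so they must be counted and reported rather than silently dropped from the median.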

Heatmaps and spatial analytics deserve a special mention because they are one of the few quantitative tools that are also immediately legible to designers. Plotting every death in a level on top of the level’s map will instantly show you the spikes that are killing too many players. Plotting every camera trajectory will show you where players looked. Plotting every interaction with an NPC will show you whose dialogue is being skipped. Spatial views turn rivers of numbers into an image a designer can act on in minutes.
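
A death heatmap is, at bottom, a 2D histogram. A minimal ASCII version, with invented positions and an illustrative bin size:

```python
from collections import Counter

# Death positions in level coordinates (illustrative data).
deaths = [(3, 1), (3, 1), (3, 1), (7, 2), (3, 1), (7, 2), (1, 0)]

CELL = 2   # bin size in level units

def heatmap(points, width, height):
    """Bin points into a coarse grid and render counts as digit characters."""
    counts = Counter((x // CELL, y // CELL) for x, y in points)
    rows = []
    for gy in range(height):
        row = "".join(str(min(counts.get((gx, gy), 0), 9))
                      for gx in range(width))
        rows.append(row)
    return rows

for line in heatmap(deaths, width=5, height=2):
    print(line)
```

In a real pipeline the same counts would be colour-mapped and composited over the level image, but even this digit grid makes the hot cell (four deaths at one pipe) visible at a glance.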

The main risks of quantitative GUR are threefold. First, vanity metrics: numbers that are easy to collect, impressive to report, and unrelated to player experience. Second, survivorship bias: the telemetry only sees players who are still playing, so your data systematically underrepresents the people you most need to hear from. Third, statistical overreach: with millions of events, even trivial differences become statistically significant, and reports can confuse significance with importance. The antidote to all three is to tie every metric back to a design question and to pair it with qualitative evidence.
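
The significance-versus-importance trap is easy to demonstrate with a two-sample z-test sketch: a difference of a tenth of a second, trivial as an experience, becomes wildly "significant" at telemetry scale (the numbers below are invented):

```python
import math

def z_test(mean_a, mean_b, sd, n):
    """Two-sample z-test with equal group sizes and a known common sd.
    Returns the z statistic and the two-sided p-value."""
    z = (mean_a - mean_b) / math.sqrt(2 * sd * sd / n)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# A 0.1-second difference in mean session length, sd of 10 seconds:
# imperceptible to any player, but with a million players per group
# the p-value collapses far below any conventional threshold.
z, p = z_test(mean_a=300.1, mean_b=300.0, sd=10.0, n=1_000_000)
```

The cure is to report effect sizes alongside p-values and to ask, for every difference, whether a player could possibly feel it.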

Chapter 9 — Qualitative Methods in GUR: Interviews, Think-Aloud, Surveys

Qualitative methods are how GUR answers why. They are slower, smaller, and irreplaceable. Games User Research dedicates multiple chapters to these methods because in student-scale projects, where you cannot collect millions of telemetry rows, qualitative methods are often the only practical source of evidence.

The think-aloud protocol asks a player to play the game while narrating their thoughts continuously. It originated in cognitive psychology via Ericsson and Simon’s 1984 book Protocol Analysis and migrated to HCI through Nielsen’s usability work. There are two variants. Concurrent think-aloud captures reactions in the moment, at the cost of interfering slightly with the play experience (it is hard to narrate and execute a difficult action at the same time). Retrospective think-aloud records the play session and then replays it with the player, who narrates afterwards, preserving the quality of play at the cost of memory fidelity. Many GUR labs use a hybrid, letting the player narrate when calm and replaying the hard parts afterwards. The researcher’s role is to prompt minimally — “what are you thinking?” — and to avoid leading questions that bias the response.

Semi-structured interviews follow a prepared guide of open-ended questions but allow the researcher to follow interesting threads. Good interview questions are neutral (“walk me through what happened when you tried to defeat the boss”), specific (“what did you expect to happen when you pressed the jump button near the ledge?”), and non-leading (“tell me about the moment you stopped playing” rather than “was it frustrating?”). The interview typically happens immediately after a play session, while the experience is fresh. Transcription and thematic coding, following Braun and Clarke’s widely used method, turn dozens of hours of interview data into a handful of themes the team can act on.

Surveys and questionnaires offer the reach of quantitative instruments with the subjectivity of qualitative ones. Standardized PX questionnaires — the Game Experience Questionnaire (GEQ) from IJsselsteijn and colleagues, the Player Experience of Need Satisfaction (PENS) scale rooted in self-determination theory, the System Usability Scale (SUS) for interface usability — let you compare your game’s scores against published benchmarks. Custom Likert-scale items let you ask targeted questions your hypothesis demands. Two rules from the GUR literature: first, validated scales beat custom items whenever the construct you care about is already covered; second, long questionnaires produce worse data than short ones because tired participants satisfice.
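
SUS scoring is mechanical enough to automate. The ten items alternate positive and negative wording, and the standard formula rescales the summed contributions to a 0-100 range:

```python
def sus_score(responses):
    """Score one participant's SUS form (10 items, each rated 1-5).

    Odd-numbered items are positively worded (contribute rating - 1);
    even-numbered items are negatively worded (contribute 5 - rating).
    The sum of contributions is multiplied by 2.5 to give 0-100.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten ratings on a 1-5 scale")
    total = 0
    for i, rating in enumerate(responses, start=1):
        total += (rating - 1) if i % 2 == 1 else (5 - rating)
    return total * 2.5

# A middling participant: agrees mildly with the positive items,
# disagrees mildly with the negative ones.
middling = sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2])
```

A SUS score is only meaningful against benchmarks; published datasets put the average interface around the high 60s, which is why a single raw score should never be reported without that context.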

Field studies and diary studies take the research out of the lab and into the player’s home. A diary study asks players to log short reflections after each session over days or weeks, capturing how the experience evolves as novelty fades. This is hard to do with a student timeline, but it is the gold standard for understanding engagement over time.

A mature project triangulates. Telemetry shows that level three has an 80 per cent abandonment rate; think-aloud sessions show that players cannot find the door; a post-session interview reveals that they thought the door was background scenery; a survey confirms the confusion generalizes across demographics. Each method alone is ambiguous; the combination is conclusive.

Chapter 10 — Playability Heuristics and Heuristic Evaluation

Jakob Nielsen’s 1994 essay “10 Usability Heuristics for User Interface Design” is the foundation stone of heuristic evaluation. The ten heuristics — visibility of system status, match between system and real world, user control and freedom, consistency and standards, error prevention, recognition rather than recall, flexibility and efficiency of use, aesthetic and minimalist design, help users recognize and recover from errors, help and documentation — are phrased so generally that they apply to almost any interactive system. In a heuristic evaluation, three to five expert evaluators independently walk through an interface, noting any violations of the ten principles, and the union of their findings is merged into a prioritized list. Nielsen showed that three to five evaluators typically catch about 75 per cent of the issues a full user study would surface, at a fraction of the cost.
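
Nielsen’s evaluator estimate comes from a simple discovery model: if each evaluator independently finds a proportion of the problems (about 0.31 in Nielsen and Landauer’s data), then n evaluators together find 1 - (1 - 0.31)^n of them. A quick check in Python:

```python
def proportion_found(n_evaluators, find_rate=0.31):
    """Expected share of problems found by n independent evaluators,
    assuming each finds `find_rate` of them (Nielsen & Landauer's model)."""
    return 1 - (1 - find_rate) ** n_evaluators

for n in (1, 3, 5, 10):
    print(n, round(proportion_found(n), 2))
```

Three evaluators land around two thirds and five around 84 per cent, which is the quantitative basis for the "three to five" rule; the curve also shows the diminishing returns that make a tenth evaluator rarely worth the cost.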

Nielsen’s heuristics were designed for productivity software and do not fully cover games, whose experiential goals go beyond “complete the task efficiently.” Heather Desurvire and collaborators introduced Heuristic Evaluation for Playability (HEP) in 2004, and the later PLAY (Principles of Game Playability) and HEP 2.0 frameworks refined it. HEP adds heuristics organized under four categories: game play (the problems and challenges the player must solve), game story (the narrative and character development), game mechanics (the programming that provides the structure), and game usability (the interface and controls). Example HEP items include “players should be given unblocked views appropriate for their current task,” “the game should be enjoyable to replay,” and “the game should react in a consistent, challenging, and exciting way to the player’s actions.”

A playability heuristic evaluation is a cheap, fast, formative method that belongs early in any design process. You do not need participants, scheduling, or consent forms: you need experienced evaluators and a prototype. The outputs are lists of suspected problems, classified by severity and by which heuristic they violate, and they are most useful as a filter before you bring in actual players. You should not confuse a heuristic evaluation with a user test — the evaluators are not the audience, and their intuitions can be wrong — but you can use it to clean up obvious problems so that precious user testing time is spent on subtler questions.

Steve Krug’s Don’t Make Me Think is not a games book, but its central insight — that the designer’s job is to eliminate every unnecessary cognitive cost at the interface — is directly applicable. A menu that confuses the player, an icon that means two things, an input that sometimes works and sometimes does not: each of these adds a small tax to every action, and the taxes compound into frustration long before the player consciously notices them.

Chapter 11 — Planning a User Study

A user study is not a casual “come play my game and tell me what you think” session. It is a small experiment, and it needs the same rigor as any experiment even when the scale is modest. Planning a study has six main activities, each of which Games User Research covers in its methodological chapters.

First, define the research question. A good question is specific, actionable, and tied to a design decision. “Is my game fun?” is not a research question. “Do players understand that the green crystal refills health within their first ten seconds of seeing it?” is. A research question should almost always imply a metric or an observable behaviour that would count as an answer.

Second, choose the method that fits the question. Open-ended “why” questions call for think-aloud and interviews. Comparison questions (“is version A easier than version B?”) call for controlled experiments. Frequency questions (“how often do players try the sword before the bow?”) call for telemetry. Mixed-method designs combining two or three of these are the norm.

Third, define the participants. Who is the target audience? A game for ten-year-olds should not be tested on graduate students. Sample size depends on the method: five to eight participants per condition is a common rule of thumb for formative qualitative studies, while quantitative experiments need power calculations. Recruitment should draw from the intended audience as closely as logistics allow, and should avoid the designer’s friends, who will be too kind.

Fourth, design the protocol. Write the script: what you will say when the participant arrives, how you will brief them on the think-aloud, which tasks you will ask them to complete, how you will intervene or not when they get stuck, and what questions you will ask in the post-session interview. Write the consent form explicitly describing what data will be collected, how it will be stored, how it will be used, the participant’s right to withdraw, and the procedures for handling any distress. For studies at universities this usually involves an ethics review; for commercial studies there are industry codes through the IGDA GUR SIG.

Fifth, pilot the protocol. Run the whole study once or twice with colleagues before you bring in real participants, because you will always find broken questions, unclear instructions, recording failures, and tasks that take three times longer than you planned. A single pilot session saves more real sessions than any other investment.

Sixth, plan the analysis before collection. If you are going to code transcripts for themes, decide the coding categories in advance where possible. If you are going to compare two conditions, decide the primary metric before you see the data. This prevents the researcher from chasing whatever happens to look significant in the collected data, a practice Andrew Gelman has called “the garden of forking paths.”

A small, well-planned study beats a large, improvised one every time. The student Checkpoint 2 in this course asks exactly this: write a plan for testing your mechanic that a skeptical reviewer would find defensible before you have collected any data.

Chapter 12 — Running and Analyzing a Playtest

On the day of the playtest, the priority shifts from design to craft. You are now a researcher, and every participant deserves a consistent, respectful, well-run session.

Before the participant arrives, prepare the room. In a lab this means checking that the recording software captures the screen, the face, and the microphone; that the game is at the correct version; that the mouse and keyboard are plugged in and clean; that the consent forms are printed; and that the task list and interview guide are in the researcher’s hand, not on the same screen the participant will use. In a remote study — which became common after 2020 — the same preparation applies but through whatever screen-sharing and recording platform you are using, plus contingency plans for network failures.

When the participant arrives, greet them, thank them, and read the brief verbatim rather than paraphrasing. The brief should explain the study’s purpose in plain language, reassure the participant that you are testing the game and not them, describe the think-aloud procedure, set expectations for session length, and invite questions. Obtain consent before any data is captured. During the session the researcher’s job is to observe quietly, note events with timestamps, and intervene only when a participant is clearly stuck in a way that is outside the study’s scope. Resist the urge to explain or defend the design. Your silence is data: when a player does not understand something and you do not help them, you learn exactly how badly the design fails to teach itself.

After the session, debrief. Ask the prepared interview questions, follow interesting threads, and end with a thank-you and any compensation you promised. Make notes immediately, while the session is fresh — memory decays quickly and even a two-hour delay will lose nuances. When all sessions are complete, transcribe any recorded interviews or at least create timestamped notes of the important moments.

Analysis proceeds from data to findings to recommendations. For qualitative data, thematic analysis following Braun and Clarke is the dominant approach: read the material once for immersion, generate initial codes on a second pass, group codes into candidate themes, review the themes against the raw data, and finalize a small set of themes that explains the data economically. For quantitative data, compute the pre-registered metrics, plot distributions before testing any hypotheses, and use appropriate statistical tests (non-parametric tests are usually safer given the small samples typical of student GUR). For mixed-method studies, bring the quantitative and qualitative findings into conversation: a heatmap that matches an interview theme is far more persuasive than either alone.
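As a concrete instance of the non-parametric route, the Mann-Whitney U statistic can be computed in a few lines. In practice you would likely use a library implementation such as `scipy.stats.mannwhitneyu`, which also supplies a p-value; the self-contained sketch below (pure Python, illustrative only) just makes the mechanics visible and adds a rank-biserial effect size, which is often more informative than a p-value at student-GUR sample sizes:

```python
def ranks(values):
    """Midranks of values (tied values share their average rank), 1-based."""
    indexed = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over the run of tied values starting at i
        while j + 1 < len(values) and values[indexed[j + 1]] == values[indexed[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            result[indexed[k]] = midrank
        i = j + 1
    return result

def mann_whitney_u(a, b):
    """U statistic for group a versus group b, plus rank-biserial effect size
    (ranges from -1, a entirely below b, to +1, a entirely above b)."""
    r = ranks(list(a) + list(b))
    rank_sum_a = sum(r[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    effect = 2 * u_a / (len(a) * len(b)) - 1
    return u_a, effect
```

For example, if completion times under one condition all exceed those under the other, the effect size comes out at an extreme of the scale — a pattern worth showing alongside any interview theme that points the same way.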

Severity ratings help turn findings into prioritized recommendations. Nielsen’s scale — cosmetic, minor, major, catastrophic — is adequate for most student work. A catastrophic issue blocks a significant fraction of players from completing a core task; a cosmetic issue mildly annoys a minority. Recommendations should be specific and actionable: “add a highlight to the door in level three when the player looks at it for more than two seconds” is a recommendation a developer can implement; “improve level three” is not.

Chapter 13 — Reporting and Acting on User Research

A GUR report exists for one reason: to change a decision. If the report is filed and forgotten, the research was wasted no matter how methodologically elegant it was. Drachen, Mirza-Babaei, and Nacke devote their closing chapters to the communication side of GUR, and the central message is that reports must be designed for their readers as carefully as games are designed for their players.

The conventional report structure has seven sections. An executive summary on the first page lists the top three to five findings and recommendations, because busy producers will read nothing else. A background and research questions section frames the study: what was the design in question, what did we want to learn, how does this connect to decisions on the table. A method section describes what was done in enough detail that a skeptical reader could evaluate whether to trust the results — participants, procedure, materials, analysis. A findings section presents each finding with evidence, severity rating, and illustrative quotes or screenshots. A recommendations section translates findings into specific, actionable changes, each tied to the finding that motivates it. A limitations section acknowledges what the study could not see — small sample, narrow demographic, prototype bugs that may have confounded results. An appendix holds raw materials: interview guides, consent forms, coding schemes, full statistics.
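The seven-section structure above lends itself to a trivial completeness check when drafting. The sketch below is one possible encoding, assuming a draft represented as a dictionary of section texts; the function and key names are illustrative, not a standard:

```python
# The seven conventional report sections, in reading order.
REPORT_SECTIONS = [
    "executive_summary",
    "background_and_research_questions",
    "method",
    "findings",
    "recommendations",
    "limitations",
    "appendix",
]

def missing_sections(draft: dict) -> list:
    """Return the conventional sections a draft report has not yet filled in."""
    return [s for s in REPORT_SECTIONS if not draft.get(s)]
```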

Within this structure, certain craft choices separate useful reports from ignored ones. Prefer specific findings over general impressions: “seven of eight participants failed to notice the health crystal on their first encounter” lands harder than “the tutorial was confusing.” Quote participants to give readers a human voice to attach to the numbers. Use images — heatmaps, screenshots with annotations, key-moment photographs — because game developers are visual thinkers and a well-chosen image can do the work of a paragraph. Rank recommendations by a combination of severity and implementation cost, so the team can start with the quick wins.
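The severity-by-cost ranking can be operationalized in several ways; the sketch below uses a simple severity-to-cost ratio so that high-severity, low-cost "quick wins" sort first. The numeric severity weights and the ratio itself are illustrative assumptions, not from the text:

```python
from dataclasses import dataclass

# Nielsen-style severity scale mapped to illustrative numeric weights.
SEVERITY = {"cosmetic": 1, "minor": 2, "major": 3, "catastrophic": 4}

@dataclass
class Recommendation:
    text: str
    severity: str      # one of the SEVERITY keys
    cost_days: float   # estimated implementation cost

def prioritize(recs):
    """Sort recommendations so quick wins come first: higher severity
    and lower implementation cost both raise the score."""
    return sorted(recs, key=lambda r: SEVERITY[r.severity] / r.cost_days,
                  reverse=True)
```

Under this weighting a major issue fixable in half a day outranks a catastrophic issue that needs a two-day rewrite — a judgment call a team may well override, which is why the ranking belongs in the report as a suggestion, not a verdict.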

The hardest part of GUR is not the research but the handoff. Developers are defensive about their work for understandable reasons — they love it and they have been living with it for months — and a report that reads as criticism will be rejected even when its findings are correct. Experienced GUR researchers learn to frame findings as shared problems rather than designer failures, to present alongside the team rather than across from them, to acknowledge what the game already does well, and to keep the player at the centre of the conversation. When a developer hears a direct quote from a confused player, they almost always want to help that player, and the recommendation follows naturally from there. The report’s goal is to put developer and player on the same side.

Finally, research findings must feed back into the iteration loop of Chapter 5. A report that is merely read is a dead artifact; a report that changes the next sprint’s backlog has justified every hour of study work. The final deliverable of this course — a user research report on a game mechanic you prototyped and tested — is not a certificate of completion but an instrument for a hypothetical next iteration. A good report ends not with a verdict but with a set of better questions the team could ask if they had another two weeks.

Closing Note

The thirteen chapters above trace a single arc. A designer begins with an idea about a mechanic, builds the smallest testable version of it, puts it in front of players, watches what happens, and revises. Every tool in the course — Unity for prototyping, Blender for assets, telemetry for quantitative observation, think-aloud and interviews for qualitative observation, heuristic evaluation for cheap formative checks, planned user studies for rigorous ones, reports for acting on findings — is in service of that single loop. The theories of flow, immersion, and engagement give the loop its target. The ethical and methodological commitments of games user research keep the loop honest. What makes a designer is not the authorial voice alone but the willingness to run the loop one more time when the evidence says the game could still be better.

Key takeaways.
  • Games are negotiations between the designer's rules and the player's goals; the job of design is to shape the possibility space, not to script the experience.
  • Core loops and feedback are the atoms of felt quality in a 2D game. Juice, latency, legibility, and proportionality are the four tuning dials for any mechanic.
  • Unity rewards the designer who treats code as a last resort and configuration as the default; pipeline thinking from Blender through Unity determines runtime performance.
  • Iteration works only when prototypes are cheap enough to risk, and only when the designer actually listens to the evidence surfaced by early tests.
  • Player experience is not the same as fun. Flow, immersion, presence, and engagement each name a different construct and each calls for different measurements.
  • Games user research triangulates quantitative telemetry, qualitative think-aloud and interviews, standardized questionnaires, and heuristic evaluation. No single method is sufficient.
  • Heuristic evaluation (Nielsen, Desurvire HEP) is the cheapest first pass and should be run before any user test, not after.
  • Planned studies beat improvised ones. Pre-register questions and analysis, pilot the protocol, respect participants, and resist the urge to defend the design during sessions.
  • Reports exist to change decisions. Specific findings, direct quotes, annotated images, and ranked recommendations are the craft elements that separate a report that is read from a report that is acted on.
  • The loop never stops. A good final report ends not with a verdict but with the next set of questions worth asking.