When 20,000 People Play Together

User Research for Multiplayer Games at Sphere

Company

The Sphere

My Role

Lead User Researcher

Overview

At Sphere, I led user research to turn 20,000-person audiences into active players—using phones and massive screens for real-time games. With no precedent, I built a research system from scratch: defining design theses, running large-scale playtests, and prioritizing high-impact unknowns to guide development at scale.

Context

In 2021, I joined Sphere’s R&D team to explore how technology could transform passive viewers into active participants. We set out to design live multiplayer games for crowds of up to 20,000, exploring a range of formats:

Competitive asteroid shootouts

Collaborative world-building

Choreographed stadium waves

While the game mechanics varied, my research process remained consistent.

What follows isn’t a detailed case study (to respect confidentiality), but a breakdown of my research approach—one built for navigating ambiguity and applicable to any product with high stakes and unknowns.

1

Define a Design Thesis

When starting from a blank canvas, our first move was alignment. I gathered key stakeholders (the VP of Interactive, Creative Director, and Technical Director) for focused workshops to get clear on purpose, scope, and success. We asked:

What are we trying to create?
What signals will tell us we’re on track?
What does success look like?
What experience should it deliver?

Each project followed a thesis-driven approach. Here’s what that looked like in practice:

Thesis

How might we design a quick, pre-show game that’s playful, social, and effortless to join?

Success Criteria

Visible participation, crowd reactions, and players talking about it after the game.

Signals

80%+ of players onboarded in under 60 seconds, audible crowd spikes during reveal moments, and high replay intent.

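A signal like the 60-second onboarding target is easy to audit after a playtest if join and ready times are logged. Below is a minimal Python sketch of that check; the field names and sample data are illustrative assumptions, not Sphere’s actual instrumentation.

from datetime import datetime

def onboarding_rate(sessions, threshold_s=60):
    # Share of players who finished onboarding within `threshold_s` seconds.
    # Each session is a dict with ISO timestamps: joined_at (scanned in) and
    # ready_at (finished onboarding); ready_at is None if they never finished.
    durations = [
        (datetime.fromisoformat(s["ready_at"]) - datetime.fromisoformat(s["joined_at"])).total_seconds()
        for s in sessions
        if s.get("ready_at")
    ]
    if not sessions:
        return 0.0
    return len([d for d in durations if d <= threshold_s]) / len(sessions)

# Made-up example input, for illustration only.
playtest = [
    {"joined_at": "2023-05-01T19:02:10", "ready_at": "2023-05-01T19:02:55"},
    {"joined_at": "2023-05-01T19:02:12", "ready_at": "2023-05-01T19:03:40"},
    {"joined_at": "2023-05-01T19:02:20", "ready_at": None},
]
print(f"Onboarded in under 60s: {onboarding_rate(playtest):.0%}")  # -> 33%

Dividing by all sessions rather than only completed ones means players who never finish onboarding count against the 80% bar.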

2

Build a Shared Vocabulary

From our early brainstorming, we defined what a great experience should feel like. I turned these into clear benchmarks—giving teams a shared language to design, test, and measure against as we built.

Simplified Onboarding

Quick, intuitive entry—regardless of age or gaming experience.

Split Attention

Guide focus between phone and Sphere without overload.

Player Agency

Make every player feel their input matters in a crowd of 20,000.

Socialization

Create moments that break the silence between strangers.

3

Prioritize What Matters

To focus our efforts, I created a Research Impact Matrix, a simple tool to align on what to tackle first. It helped us prioritize high-impact unknowns, deprioritize low-value tasks, and guide sprint planning and milestone decisions. Every open question was plotted on two axes (a rough scoring sketch follows the quadrant breakdown below):

Confidence (X): How certain are we already about the answer?
Impact (Y): How much could the insight shift design or strategy?

Example research questions mapped onto the matrix (impact vs. confidence):

Can players split attention between their phone and the Sphere's screen?

Do rewards like drink points improve motivation to join?

Would players prefer team colors or randomized avatars?

Do players understand how to scan a QR code to join?

1ST PRIORITY

Low Confidence, High Impact

Investigate first: these unknowns could make or break the experience.

2ND PRIORITY

High Impact, High Confidence

Quick wins: validate lightly and move forward.

3RD PRIORITY

Low Confidence, Low Impact

Explore only if extra time or resources are available.

LAST PRIORITY

Low Impact, High Confidence

Document and move on; not worth active focus.

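To make that prioritization concrete, here is a rough sketch of the matrix as a scoring exercise in Python. The 1–5 scales and the example scores are illustrative assumptions rather than the actual workshop artifact; the quadrant rules mirror the priorities above.

from dataclasses import dataclass

@dataclass
class ResearchQuestion:
    text: str
    impact: int      # 1 (low) to 5 (high): how much the answer could shift design or strategy
    confidence: int  # 1 (low) to 5 (high): how certain we already are about the answer

def quadrant(q: ResearchQuestion) -> str:
    high_impact = q.impact >= 3
    low_confidence = q.confidence < 3
    if high_impact and low_confidence:
        return "1st priority: investigate first"
    if high_impact:
        return "2nd priority: validate lightly"
    if low_confidence:
        return "3rd priority: explore if time allows"
    return "Last priority: document and move on"

# Example backlog; the scores are guesses for illustration.
backlog = [
    ResearchQuestion("Split attention between phone and Sphere screen?", impact=5, confidence=1),
    ResearchQuestion("Do drink-point rewards improve motivation to join?", impact=4, confidence=4),
    ResearchQuestion("Team colors or randomized avatars?", impact=2, confidence=2),
    ResearchQuestion("Do players understand scanning a QR code to join?", impact=2, confidence=5),
]

# Highest-impact, least-understood questions surface first.
for q in sorted(backlog, key=lambda q: (-q.impact, q.confidence)):
    print(f"{quadrant(q):38} {q.text}")

Sorting by impact first and confidence second puts the make-or-break unknowns ahead of the quick wins, matching the priority order above.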

4

Identify Target Audience

To design effectively, we segmented our audience into key groups. This not only grounded our approach in real player behavior, but also helped us validate assumptions through targeted testing.

Bullseye Group

Tech-savvy, social, fast-paced players. If it didn’t work for them, it wouldn’t work at all.

Edge Group

Easily distracted or new to games. If they struggled, we flagged usability gaps.

Core Group

Casual, curious players. If it worked for them, we were ready to scale.

Testing at Scale

Unlike typical digital experiences, Sphere’s audience wasn’t hypothetical—it filled a stadium. Replicating this during testing was a core challenge. We broke the research into three key stages:

STAGE 1

Usability (4–5 participants)

Lab tests to validate onboarding, clarity, and core interactions.

STAGE 2

Group Behavior (20–40 people)

Studied how small groups reacted in sensory-rich environments—tracking timing, social influence, and attention shifts.

STAGE 3

Simulated Live Crowd (300 people)

Ran large-scale venue tests to capture real-time reactions in chaotic, show-like settings. We analyzed how emotions like surprise and laughter spread, and designed cues to enhance those shared moments.

Getting Buy-In

To turn insights into action, I combined storytelling with evidence—mixing quotes, data, and visuals. This helped leadership spot patterns fast and make confident decisions.

Surveys

to gather broad sentiment

Post-playtest interviews

to explore deeper context

Live behavioral tagging

to capture in-the-moment reactions (a minimal tagging sketch follows this list)

Playtesting clips and quotes

to humanize the data

Pattern analysis

to surface recurring themes

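As referenced above, here is a minimal sketch of what live behavioral tagging can look like in practice: an observer logs predefined tags with timestamps during a session, then tallies them afterwards. The tag vocabulary, file name, and session label are illustrative assumptions rather than the actual toolkit.

import csv
import time
from collections import Counter

# Hypothetical tag vocabulary; real tags would come from the study plan.
TAGS = {"laughter", "cheer", "confusion", "looks-at-neighbor", "phone-down"}

def log_tag(writer, session, tag, note=""):
    # Append one timestamped observation; unknown tags fail fast mid-session.
    if tag not in TAGS:
        raise ValueError(f"Unknown tag: {tag}")
    writer.writerow({"t": round(time.time(), 1), "session": session, "tag": tag, "note": note})

with open("playtest_tags.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["t", "session", "tag", "note"])
    writer.writeheader()
    log_tag(writer, "stage2-group-03", "confusion", "missed the QR prompt")
    log_tag(writer, "stage2-group-03", "laughter", "reveal moment")

# Afterwards, count how often each behavior showed up to spot patterns.
with open("playtest_tags.csv") as f:
    counts = Counter(row["tag"] for row in csv.DictReader(f))
print(counts.most_common())

A small, fixed vocabulary is what keeps tagging fast enough to use in a loud, dark venue.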

Behavioral Personas

Another powerful way of humanizing data was creating behavioral personas, which helped the team design from a player-first perspective rather than from assumptions.

Dizzy Dory

The Social Butterfly

Here to vibe with friends, cocktail in hand. Easily distracted and often misses key onboarding cues.


Design Tip: Prioritize clarity, tutorial reinforcement, and forgiving UX.

Turbo Tim

The Power Player

Quick to learn, competitive, loves pushing limits (sometimes breaks things on purpose).


Design Tip: Design for mastery and edge cases, without breaking the experience for others.

Low-Key Linda

The Passive Plus-One

Chaperoning her kids. Observes more than interacts, joins only if prompted.


Design Tip: Use spectacle, social momentum, and low-effort engagement paths.

Moving Forward: A Triple-Check System

Success is not binary. A strong playtest doesn’t always mean long-term success—and early confusion doesn’t always spell failure. To make better calls, I used a triple-check framework to validate decisions at each milestone.

This system ensured that decisions were grounded in research, supported by the people building the experience, and aligned with strategic goals, not just one person’s opinion.