Any scenario can be completed by a person or by an artificial intelligence (AI) agent in the operator seat. This page covers the person side. There are two reasons to put people through a scenario, and they happen at different points in a scenario’s life:

Capture expertise

Run experienced people through a scenario to record how they handle it, and to stabilize the scenario design itself.

Measure at scale

Assign stable scenarios to a group of operators, track every session, and score them all against the same criteria.
A scenario moves through three lifecycle stages: draft (still being designed), extracting (expert sessions are surfacing what the scenario should test), and stable (the design has settled). Only stable scenarios go into cohorts. Expert reviews are how a scenario earns that status.
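
The lifecycle rule above can be sketched as a small data model. This is only an illustration: the names `Stage`, `cohort_eligible`, and `needs_expert_review` are hypothetical and not part of the product.

```python
from enum import Enum


class Stage(Enum):
    """The three lifecycle stages named on this page (identifiers are hypothetical)."""
    DRAFT = "draft"            # still being designed
    EXTRACTING = "extracting"  # expert sessions are surfacing what to test
    STABLE = "stable"          # the design has settled


def cohort_eligible(stage: Stage) -> bool:
    # Only stable scenarios go into cohorts.
    return stage is Stage.STABLE


def needs_expert_review(stage: Stage) -> bool:
    # Draft and extracting scenarios are the ones that benefit from expert sessions.
    return stage is not Stage.STABLE
```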

Capture expertise: expert reviews

An expert review is a session where an experienced person runs a scenario to demonstrate strong handling. The session records the questions they ask, the decisions they make, and the order they do it in. At the same time, the session tests the scenario itself: if an expert gets confused, the problem is usually the scenario.

Before the session

1. Pick a scenario that is not yet stable

Draft and extracting scenarios are the ones that benefit from expert sessions. Stable scenarios are done with this phase.

2. Brief the expert

Explain what the scenario is about and what kind of session to expect. Do not reveal the hidden state (the information the persona holds back until it is uncovered). If the expert knows it in advance, the session shows nothing about how they would discover it. The scenario’s briefing covers what the expert is allowed to know.

3. Get ready to observe

Your job during the session is to take notes, not to participate.

During the session

The expert works through the scenario in a conversational interface, by text or voice. They ask questions, give guidance, and make decisions; the persona responds from its defined state. Watch for these signals:

| What you see | What it means |
| --- | --- |
| The expert asks questions that uncover hidden state | The scenario rewards good technique, as designed |
| The expert seems confused | The scenario setup is unclear or unrealistic |
| The conversation feels natural | The persona and state are well designed |
| The expert walks past key information | The hidden state may be too hard to reach |

Do not coach during the session. Resist the urge to hint or steer, and save all discussion for afterwards. The point is to see the expert’s real approach, and a hint contaminates exactly the behavior you came to record.

After the session: the guided reflection

When the session ends, an AI interviewer walks the expert through a short reflection. It is adaptive rather than a fixed questionnaire: it reads the session transcript, asks one question at a time, follows up on the answers, mixes question formats (yes/no, ratings, open text, multi-select), and wraps up after 3 to 4 questions. The reflection produces structured insights:
  • Identified gaps - places where the scenario’s persona, context, or hidden state fell short
  • A realism score - the expert’s rating of how true to life the scenario felt
  • Key takeaways - what to change before the next review, and whether the scenario is ready to advance
Reflections can be skipped, but the insights are where most of the review’s value comes from. The sidebar flags sessions that still need one.

Know when to stop

A scenario is stable when additional reviews stop producing new insights. Two or three reviews that surface the same patterns and no new gaps are a good signal that the design has settled. At that point, stop reviewing and move the scenario into cohorts.
Run reviews with more than one expert when you can. Different experts take different valid approaches, and comparing them separates the common patterns from individual habits.
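
One way to make the stopping signal concrete is to compare the gaps each reflection identified. The sketch below is a heuristic under stated assumptions, not the product's logic: `looks_settled` is a hypothetical helper, gaps are assumed to be short labels you assign yourself, and the default window of two reviews mirrors the "two or three reviews" guidance above.

```python
def looks_settled(review_gaps: list[set[str]], window: int = 2) -> bool:
    """Return True when the last `window` reviews surfaced no gap
    that earlier reviews had not already identified.

    `review_gaps` holds one set of gap labels per review, oldest first.
    """
    if len(review_gaps) <= window:
        # Not enough earlier reviews to compare against.
        return False
    earlier = set().union(*review_gaps[:-window])
    recent = review_gaps[-window:]
    return all(gaps <= earlier for gaps in recent)


# Hypothetical gap labels from three reviews of one scenario.
history = [{"vague briefing", "persona too terse"}, {"vague briefing"}, {"persona too terse"}]
settled = looks_settled(history)
```

A result of `True` is a prompt to consider advancing the scenario, not a substitute for reading the takeaways.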

Measure at scale: capture cohorts

A capture cohort is a batch of scenario assignments: a named group of stable scenarios assigned to a named group of operators, with progress tracking and automated email built in.

Creating a cohort

A 3-step wizard sets up the whole cohort in one action:

1. Name the cohort

Give it a name and an optional description.

2. Select stable scenarios

Only stable scenarios are eligible. Each scenario’s current version is snapshotted at this moment, so later edits to the scenario never change what an in-flight cohort is measuring.

3. Assign operators

Select operators from your roster. Assignment is all-to-all: every selected operator gets every selected scenario. Welcome emails go out automatically when the cohort is created.
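
All-to-all assignment with version snapshots amounts to a cross product of operators and scenarios, taken once at creation time. The following is a minimal sketch; `Assignment`, `build_assignments`, and the integer version numbers are assumptions made for illustration, not the product's data model.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Assignment:
    operator: str
    scenario_id: str
    scenario_version: int  # version captured when the cohort was created
    status: str = "pending"


def build_assignments(operators: list[str], scenario_versions: dict[str, int]) -> list[Assignment]:
    # Copy the versions so later edits to the scenarios cannot change the cohort.
    snapshot = dict(scenario_versions)
    return [
        Assignment(operator, scenario_id, version)
        for operator, (scenario_id, version) in product(operators, snapshot.items())
    ]


# Two operators and two scenarios yield four assignments.
assignments = build_assignments(["ana", "ben"], {"s1": 3, "s2": 1})
```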

Tracking progress

Each assignment moves through four statuses as the operator progresses: pending (created, welcome email sent), link clicked (the operator opened the email link), in progress (the session has started), and completed. Assignments also count attempts when an operator runs a scenario more than once. The cohort detail page rolls sessions up into a funnel (invited, opened, in progress, completed) so you can see at a glance where operators are dropping off. Per-operator cards break this down to individual assignments, sessions, and email delivery history.
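
The funnel can be read as a cumulative count over the four assignment statuses. The sketch below assumes each funnel step counts every assignment that has reached at least the matching status; that counting rule and the names `STATUS_ORDER`, `FUNNEL_STEPS`, and `funnel` are assumptions, since this page does not specify how the dashboard computes the rollup.

```python
# Assignment statuses in the order an operator moves through them.
STATUS_ORDER = ["pending", "link_clicked", "in_progress", "completed"]
# Funnel steps, aligned index-for-index with STATUS_ORDER.
FUNNEL_STEPS = ["invited", "opened", "in_progress", "completed"]


def funnel(statuses: list[str]) -> dict[str, int]:
    """Count, for each funnel step, the assignments at or beyond that step."""
    ranks = [STATUS_ORDER.index(status) for status in statuses]
    return {
        step: sum(1 for rank in ranks if rank >= i)
        for i, step in enumerate(FUNNEL_STEPS)
    }


counts = funnel(["pending", "link_clicked", "in_progress", "completed", "completed"])
# counts == {"invited": 5, "opened": 4, "in_progress": 3, "completed": 2}
```

The largest drop between two adjacent steps shows where operators are stalling.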

Cohort statuses

| Status | Meaning |
| --- | --- |
| Draft | Setup. You can still add or remove scenarios and operators. A cohort can only be deleted in this status. |
| Active | The cohort is live and in use. |
| Completed | An admin has marked it done. |
| Archived | Hidden from the active cohorts view. |

All transitions are manual. A cohort does not mark itself completed when the last session finishes; an admin makes that call.

Automated emails

Cohorts send four email types, each with delivery tracking (sent, delivered, opened, clicked, bounced):
  • Welcome - sent automatically on cohort creation, with links to the operator’s assigned sessions
  • Session started - sent when an admin starts a session on an operator’s behalf
  • Reminder - sent manually by an admin to nudge an incomplete session
  • Completion - sent automatically when an operator finishes their sessions
If operators seem unresponsive, check delivery status before assuming they are ignoring you. A bounced welcome email looks identical to silence from your side of the dashboard.
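
If you work from exported delivery data, the check can be expressed in a few lines. This sketch assumes you have each operator's welcome-email events as a list of the tracking states named above; the function `undelivered_operators` and the input shape are hypothetical, and treating "never delivered" the same as "bounced" is an added assumption.

```python
def undelivered_operators(welcome_events: dict[str, list[str]]) -> list[str]:
    """Return operators whose welcome email bounced or never reached 'delivered'."""
    return sorted(
        operator
        for operator, events in welcome_events.items()
        if "bounced" in events or "delivered" not in events
    )


# Hypothetical delivery history for three operators.
events = {
    "ana": ["sent", "delivered", "opened"],
    "ben": ["sent", "bounced"],
    "cy": ["sent"],
}
needs_address_check = undelivered_operators(events)
```

Operators on this list need a corrected address or a resend rather than a reminder.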

Triage alerts

The dashboard flags three risk conditions so you do not have to hunt for them:

| Alert | Threshold |
| --- | --- |
| Stale cohorts | Active cohorts with no session completions for 7+ days |
| Stuck operators | Sessions opened but not started for 5+ days |
| At-risk sessions | Sessions in progress for 14+ days |

A quick reminder email recovers most stuck operators. Check these alerts regularly while a cohort is active.
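
The three thresholds can be written as simple date comparisons. The sketch below uses the day counts from the table; the function names, the status strings, and the choice to measure a stale cohort from its last completion timestamp are assumptions for illustration, not the dashboard's implementation.

```python
from datetime import datetime, timedelta

# Thresholds taken from the alert table.
STALE_COHORT_DAYS = 7
STUCK_OPERATOR_DAYS = 5
AT_RISK_SESSION_DAYS = 14


def is_stale_cohort(cohort_status: str, last_completion: datetime, now: datetime) -> bool:
    # Active cohort with no session completions for 7+ days.
    return cohort_status == "active" and now - last_completion >= timedelta(days=STALE_COHORT_DAYS)


def is_stuck_operator(assignment_status: str, opened_at: datetime, now: datetime) -> bool:
    # Link opened but session not started for 5+ days.
    return assignment_status == "link_clicked" and now - opened_at >= timedelta(days=STUCK_OPERATOR_DAYS)


def is_at_risk_session(assignment_status: str, started_at: datetime, now: datetime) -> bool:
    # Session in progress for 14+ days.
    return assignment_status == "in_progress" and now - started_at >= timedelta(days=AT_RISK_SESSION_DAYS)
```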

What operators see

Operators complete their work in the field app, a focused interface separate from the admin dashboard. An operator clicks the link in their email, authenticates, and lands on a dashboard showing their assigned scenarios grouped by cohort. They click into a scenario to start the session, and progress is tracked automatically from there. No setup or training on the tool is required on their end.

What happens to the sessions

Every completed session is scored against the criteria the scenario defines. The criteria are the same for every operator and for any AI agent that runs the same scenario. Reports show which criteria were met and why. See Defining good for how scenarios specify criteria, and Results for reading the scores.

Run scenarios with your AI

Put an AI agent in the same operator seat and compare directly.

Scenario types

Decision tasks, conversation scenarios, and journey simulations.