Capture expertise
Run experienced people through a scenario to record how they handle it, and to stabilize the scenario design itself.
Measure at scale
Assign stable scenarios to a group of operators, track every session, and score them all against the same criteria.
A scenario moves through three lifecycle stages: draft (still being designed), extracting (expert sessions are surfacing what the scenario should test), and stable (the design has settled). Only stable scenarios go into cohorts. Expert reviews are how a scenario earns that status.
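The eligibility rule above can be sketched as a small model. This is a minimal illustration, not the product's implementation: the `Stage` enum, the `Scenario` class, and the `eligible_for_cohort` helper are hypothetical names. Only the three stage names and the stable-only rule come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """The three lifecycle stages named in the text."""
    DRAFT = "draft"            # still being designed
    EXTRACTING = "extracting"  # expert sessions are surfacing what to test
    STABLE = "stable"          # the design has settled


@dataclass
class Scenario:
    name: str    # hypothetical field, for illustration only
    stage: Stage


def eligible_for_cohort(scenario: Scenario) -> bool:
    """Only stable scenarios go into cohorts."""
    return scenario.stage is Stage.STABLE
```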
Capture expertise: expert reviews
An expert review is a session where an experienced person runs a scenario to demonstrate strong handling. The session records the questions they ask, the decisions they make, and the order they do it in. At the same time, the session tests the scenario itself: if an expert gets confused, the problem is usually the scenario.
Before the session
Pick a scenario that is not yet stable
Draft and extracting scenarios are the ones that benefit from expert sessions. Stable scenarios are done with this phase.
Brief the expert
Explain what the scenario is about and what kind of session to expect. Do not reveal the hidden state (the information the persona holds back until the expert uncovers it). If the expert knows it in advance, the session shows nothing about how they would discover it. The scenario’s briefing covers what the expert is allowed to know.
During the session
The expert works through the scenario in a conversational interface, by text or voice. They ask questions, give guidance, and make decisions; the persona responds from its defined state. Watch for these signals:
| What you see | What it means |
|---|---|
| The expert asks questions that uncover hidden state | The scenario rewards good technique, as designed |
| The expert seems confused | The scenario setup is unclear or unrealistic |
| The conversation feels natural | The persona and state are well designed |
| The expert walks past key information | The hidden state may be too hard to reach |
After the session: the guided reflection
When the session ends, an AI interviewer walks the expert through a short reflection. It is adaptive rather than a fixed questionnaire: it reads the session transcript, asks one question at a time, follows up on the answers, mixes question formats (yes/no, ratings, open text, multi-select), and wraps up after 3 to 4 questions. The reflection produces structured insights:
- Identified gaps - places where the scenario’s persona, context, or hidden state fell short
- A realism score - the expert’s rating of how true to life the scenario felt
- Key takeaways - what to change before the next review, and whether the scenario is ready to advance
Know when to stop
A scenario is stable when additional reviews stop producing new insights. Two or three reviews that surface the same patterns and no new gaps are a good signal that the design has settled. At that point, stop reviewing and move the scenario into cohorts.
Measure at scale: capture cohorts
A capture cohort is a batch of scenario assignments: a named group of stable scenarios assigned to a named group of operators, with progress tracking and automated email built in.
Creating a cohort
A 3-step wizard sets up the whole cohort in one action:
Select stable scenarios
Only stable scenarios are eligible. Each scenario’s current version is snapshotted at this moment, so later edits to the scenario never change what an in-flight cohort is measuring.
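To illustrate why snapshotting protects an in-flight cohort, here is a minimal sketch. All names (`Scenario`, `ScenarioSnapshot`, `create_cohort`, the `outage-call` ID) are hypothetical, and the integer version counter is an assumption. The text only states that each scenario's current version is captured when the cohort is created.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    scenario_id: str
    version: int          # assumed: a counter that increments on each edit
    criteria: list[str]   # the criteria the scenario defines


@dataclass(frozen=True)
class ScenarioSnapshot:
    """Immutable copy of a scenario as it stood at cohort creation."""
    scenario_id: str
    version: int
    criteria: tuple[str, ...]


def snapshot(s: Scenario) -> ScenarioSnapshot:
    # Copy the mutable list into a tuple so later edits cannot leak in.
    return ScenarioSnapshot(s.scenario_id, s.version, tuple(s.criteria))


def create_cohort(scenarios: list[Scenario]) -> list[ScenarioSnapshot]:
    # The cohort stores snapshots, not references to the live scenarios.
    return [snapshot(s) for s in scenarios]


scenario = Scenario("outage-call", version=3, criteria=["asks about impact"])
cohort = create_cohort([scenario])

# A later edit to the live scenario...
scenario.version += 1
scenario.criteria.append("confirms rollback plan")

# ...does not change what the cohort captured.
print(cohort[0].version, cohort[0].criteria)
```

Storing a frozen copy rather than a reference is what keeps every operator in the cohort measured against the same scenario definition.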
Tracking progress
Each assignment moves through four statuses as the operator progresses: pending (created, welcome email sent), link clicked (the operator opened the email link), in progress (the session has started), and completed. Assignments also count attempts when an operator runs a scenario more than once. The cohort detail page rolls sessions up into a funnel (invited, opened, in progress, completed) so you can see at a glance where operators are dropping off. Per-operator cards break this down to individual assignments, sessions, and email delivery history.
Cohort statuses
| Status | Meaning |
|---|---|
| Draft | Setup. You can still add or remove scenarios and operators. A cohort can only be deleted in this status. |
| Active | The cohort is live and in use. |
| Completed | An admin has marked it done. |
| Archived | Hidden from the active cohorts view. |
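The rules in the table can be read as simple guards on the cohort status. The sketch below is illustrative only: `CohortStatus`, `can_edit_membership`, and `can_delete` are hypothetical names, and treating membership edits as Draft-only is an inference from the word "still" in the table, not a documented rule.

```python
from enum import Enum


class CohortStatus(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    COMPLETED = "completed"
    ARCHIVED = "archived"


def can_edit_membership(status: CohortStatus) -> bool:
    """Adding or removing scenarios and operators (assumed Draft-only)."""
    return status is CohortStatus.DRAFT


def can_delete(status: CohortStatus) -> bool:
    """A cohort can only be deleted while it is a draft."""
    return status is CohortStatus.DRAFT
```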
Automated emails
Cohorts send four email types, each with delivery tracking (sent, delivered, opened, clicked, bounced):
- Welcome - sent automatically on cohort creation, with links to the operator’s assigned sessions
- Session started - sent when an admin starts a session on an operator’s behalf
- Reminder - sent manually by an admin to nudge an incomplete session
- Completion - sent automatically when an operator finishes their sessions
Triage alerts
The dashboard flags three risk conditions so you do not have to hunt for them:
| Alert | Threshold |
|---|---|
| Stale cohorts | Active cohorts with no session completions for 7+ days |
| Stuck operators | Sessions opened but not started for 5+ days |
| At-risk sessions | Sessions in progress for 14+ days |
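The three thresholds can be expressed as date comparisons. This is a sketch under stated assumptions, not the dashboard's actual logic: the function and parameter names are hypothetical, "N+ days" is read as "at least N full days", and for a cohort with no completions yet the clock is assumed to start from some reference time such as activation.

```python
from datetime import datetime, timedelta

# Thresholds from the table above, in days.
STALE_COHORT_DAYS = 7
STUCK_OPERATOR_DAYS = 5
AT_RISK_SESSION_DAYS = 14


def _at_least(days: int, since: datetime, now: datetime) -> bool:
    """True when at least `days` full days have passed since `since`."""
    return now - since >= timedelta(days=days)


def is_stale_cohort(status: str, since: datetime, now: datetime) -> bool:
    # `since` is assumed to be the last session completion, or the
    # cohort's activation time if nothing has completed yet.
    return status == "active" and _at_least(STALE_COHORT_DAYS, since, now)


def is_stuck_operator(opened_at: datetime, started: bool, now: datetime) -> bool:
    # Session opened but not started for 5+ days.
    return not started and _at_least(STUCK_OPERATOR_DAYS, opened_at, now)


def is_at_risk_session(started_at: datetime, completed: bool, now: datetime) -> bool:
    # Session in progress (started, not completed) for 14+ days.
    return not completed and _at_least(AT_RISK_SESSION_DAYS, started_at, now)
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside each function, keeps the checks deterministic and easy to test.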
What operators see
Operators complete their work in the field app, a focused interface separate from the admin dashboard. An operator clicks the link in their email, authenticates, and lands on a dashboard showing their assigned scenarios grouped by cohort. They click into a scenario to start the session, and progress is tracked automatically from there. No setup or training on the tool is required on their end.
What happens to the sessions
Every completed session is scored against the criteria the scenario defines, the same criteria for every operator and for any AI agent that runs the same scenario. Reports show which criteria were met and why. See defining good for how scenarios specify criteria, and Results for reading the scores.
Run scenarios with your AI
Put an AI agent in the same operator seat and compare directly.
Scenario types
Decision tasks, conversation scenarios, and journey simulations.