Journey Simulations

A journey simulation takes a conversation scenario and adds the dimension single sessions cannot reach: time. Instead of one session, the operator works with the same persona across a sequence of sessions spread over simulated weeks or months. Between sessions, the situation moves, and it moves in response to what the operator did. This is the shape for work where quality only shows up longitudinally. A return-to-work plan, a debt arrangement, a treatment pathway, a customer onboarding: each looks fine at the first meeting. Whether it was actually good is visible at week six.

What time adds

Two things change relative to a single conversation:
  1. State evolves. State items in a journey can carry a recording frequency: daily, weekly, bi-weekly, monthly, or on-request. As simulated time advances, those facts update. Pain levels shift, payments are made or missed, the situation at home changes.
  2. The evolution is conditional. The state at session three is not a script; it depends on what the operator did at sessions one and two. A plan the client never agreed to gets quietly abandoned. A referral made early changes what the client reports later. The persona remembers, the situation compounds, and so do mistakes.
The result is that two operators finish the same journey in genuinely different places, which is exactly what the scoring needs to see.
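As a minimal sketch of how recording frequencies could drive state updates, the snippet below works out which state items are due for a new reading once simulated time has advanced. The names `Frequency`, `StateItem`, and `items_due` are hypothetical and not part of the platform. The day counts are assumptions too: "bi-weekly" is read as every 14 days and "monthly" is approximated as 30 days.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Frequency(Enum):
    DAILY = "daily"
    WEEKLY = "weekly"
    BI_WEEKLY = "bi-weekly"
    MONTHLY = "monthly"
    ON_REQUEST = "on-request"


# Assumed interval in simulated days; None means "never on a schedule".
INTERVAL_DAYS: Dict[Frequency, Optional[int]] = {
    Frequency.DAILY: 1,
    Frequency.WEEKLY: 7,
    Frequency.BI_WEEKLY: 14,   # assumption: every two weeks
    Frequency.MONTHLY: 30,     # assumption: a month is 30 days
    Frequency.ON_REQUEST: None,
}


@dataclass
class StateItem:
    name: str
    frequency: Frequency
    last_recorded_day: int = 0


def items_due(items: List[StateItem], current_day: int) -> List[str]:
    """Return names of items whose recording interval has elapsed."""
    due = []
    for item in items:
        interval = INTERVAL_DAYS[item.frequency]
        if interval is None:
            continue  # on-request items only update when asked for
        if current_day - item.last_recorded_day >= interval:
            due.append(item.name)
    return due


items = [
    StateItem("pain level", Frequency.DAILY),
    StateItem("payments", Frequency.MONTHLY),
    StateItem("situation at home", Frequency.ON_REQUEST),
]
due_after_two_weeks = items_due(items, current_day=14)
due_after_a_month = items_due(items, current_day=30)
```

Under these assumptions, only the daily item is due after two simulated weeks, and the monthly item joins it at day 30. The sketch covers scheduling only. What the new value is depends on what the operator did, so that part is left out.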

Time progression

You control how simulated time advances between sessions:
| Mode | Behavior |
| --- | --- |
| Manual | An admin advances time explicitly |
| Prompted | The operator is offered the advance and confirms it |
| Autopilot | Time advances on its own schedule |
A journey also defines its overall span and the default amount of time each advance covers, so a “12 weeks of fortnightly check-ins” structure is configuration, not convention.
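To make "configuration, not convention" concrete, here is a rough sketch of a temporal configuration and the session schedule it implies. `ProgressionMode`, `TemporalConfig`, and its field names are illustrative assumptions rather than the platform's actual schema. The sketch also assumes the first session takes place at week 0.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ProgressionMode(Enum):
    MANUAL = "manual"        # an admin advances time explicitly
    PROMPTED = "prompted"    # the operator is offered the advance and confirms
    AUTOPILOT = "autopilot"  # time advances on its own schedule


@dataclass(frozen=True)
class TemporalConfig:
    mode: ProgressionMode
    weeks_per_advance: int  # default amount of time each advance covers
    total_weeks: int        # overall span of the journey

    def session_count(self) -> int:
        """Number of sessions that fit in the span at the default advance."""
        return self.total_weeks // self.weeks_per_advance

    def session_start_weeks(self) -> List[int]:
        """Simulated week of each session, assuming session one is week 0."""
        return [i * self.weeks_per_advance for i in range(self.session_count())]


# "12 weeks of fortnightly check-ins"
config = TemporalConfig(ProgressionMode.PROMPTED, weeks_per_advance=2, total_weeks=12)
schedule = config.session_start_weeks()
```

With these values the schedule comes out as six sessions at weeks 0, 2, 4, 6, 8, and 10.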

What you configure

Everything a conversation scenario has, plus:
| Component | Role in a journey |
| --- | --- |
| Temporal configuration | Progression mode, default time per advance, total duration |
| State with recording frequencies | Which facts move with time, and how often |
| Per-session outputs | Artifacts and decisions expected at specific points: an initial plan, a mid-journey review, a closing summary |
| Criteria | Including the longitudinal ones: was the plan followed up, were early warning signs acted on before they compounded |
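Per-session outputs can be thought of as a map from session number to the artifacts expected there. The sketch below is one hypothetical way to check a run against that map. The session numbers, the output names as dictionary values, and the `missing_outputs` helper are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Assumed placement of outputs: which session each artifact is expected at.
EXPECTED_OUTPUTS: Dict[int, List[str]] = {
    1: ["initial plan"],
    3: ["mid-journey review"],
    6: ["closing summary"],
}


def missing_outputs(
    expected: Dict[int, List[str]],
    submitted: Dict[int, Set[str]],
) -> Dict[int, List[str]]:
    """Return, per session, the expected outputs the operator did not produce."""
    missing: Dict[int, List[str]] = {}
    for session in sorted(expected):
        produced = submitted.get(session, set())
        absent = [name for name in expected[session] if name not in produced]
        if absent:
            missing[session] = absent
    return missing


# An operator who wrote the initial plan but skipped the review and summary.
submitted = {1: {"initial plan"}, 6: set()}
gaps = missing_outputs(EXPECTED_OUTPUTS, submitted)
```

A check like this only confirms that an output exists at the right point. Judging whether the plan was followed up belongs to the criteria.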

Worked example

An income support agency wants to know whether case managers keep clients engaged across a 12-week program, not just at intake.
  • Persona: a client recently out of work, cooperative at intake, with motivation that decays unless engagement is maintained
  • Journey: six fortnightly check-ins over 12 simulated weeks
  • Evolving state: job-search activity (weekly recording), mood and motivation (per session), a housing pressure that emerges around week 6 only if earlier sessions missed the financial strain behind it
  • Outputs: an initial action plan at session one, a revised plan at session four, a program summary at session six
  • Success metric: the week-6 housing pressure is anticipated, because the operator surfaced the financial strain by week 4
  • Failure metric: a check-in that repeats the previous session’s plan without acknowledging what changed since
An operator who treats every check-in as a fresh conversation scores well on politeness and poorly on the journey. The one who carries the thread, notices the decay early, and adapts the plan is distinguishable in the report, which is the entire reason to simulate the weeks instead of asking about them.
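The conditional parts of this example can be sketched as two small rules. Both functions are hypothetical illustrations, not how the platform evaluates a journey. The first follows the evolving-state bullet and treats "around week 6" and "by week 4" as hard thresholds. The second is a deliberately crude stand-in for the failure metric that compares plans by exact equality.

```python
from typing import Optional


def housing_pressure_emerges(
    current_week: int,
    strain_surfaced_week: Optional[int],
) -> bool:
    """Housing pressure appears from week 6 unless the operator surfaced
    the financial strain by week 4 (None means it was never surfaced)."""
    if current_week < 6:
        return False
    return strain_surfaced_week is None or strain_surfaced_week > 4


def repeats_plan_unchanged(
    previous_plan: str,
    current_plan: str,
    state_changed: bool,
) -> bool:
    """Flag a check-in that reuses the last plan although the state moved."""
    return state_changed and previous_plan.strip() == current_plan.strip()


# The operator who missed the strain meets the housing pressure at week 6.
missed = housing_pressure_emerges(current_week=6, strain_surfaced_week=None)
# The operator who surfaced it at week 4 does not.
caught = housing_pressure_emerges(current_week=6, strain_surfaced_week=4)
```

The design point is that the week-6 state is a function of the operator's earlier sessions, which is why two operators can finish the same journey in different places.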

Running it

Like any other scenario, a journey can be run by your people or by AI agents. For agents, journeys are the most demanding benchmark the platform offers: they test whether an AI can hold context, follow through, and manage a relationship rather than win a single exchange.