Defining good

A scenario is only as useful as its definition of a good performance. Tacit gives you six families of criteria, each answering a different question about the session. You do not need all six on every scenario; pick the ones that match what you want to measure.

All three scenario shapes (decision tasks, conversation scenarios, and journey simulations) use these criteria. Decision tasks lean on success and failure metrics around the judgment itself. Conversation scenarios and journey simulations tend to use the full range, because dialogue gives scope boundaries, terminology, and best practices more room to show up.

Success metrics

What the operator should accomplish. Each success metric names one concrete outcome the session should produce.

“Determine whether the claim is valid”
“Identify a return-to-work timeline”
“Uncover the side work the claimant has been doing”

Keep each metric to a single accomplishment. “Gather history and assess risk and document the plan” is three metrics wearing one label, and the report cannot tell you which of the three was missed.

Failure metrics

Critical errors the operator must avoid. Failure metrics name the moves that sink a session regardless of everything else done well.

“Approves return to work without reviewing the imaging”
“Quotes a payout figure before eligibility is confirmed”
“Dismisses a safety concern the persona raises”

A session can hit every success metric and still fail on one of these. That is the point: failure metrics encode the lines your organization does not let anyone cross.
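To make the override concrete, here is a minimal Python sketch. It assumes a hypothetical `SessionOutcome` container and a `session_failed` helper; neither is part of Tacit, and the real report may represent results differently.

```python
from dataclasses import dataclass, field


@dataclass
class SessionOutcome:
    # Hypothetical container for illustration; not Tacit's data model.
    success_met: dict[str, bool] = field(default_factory=dict)
    failures_triggered: list[str] = field(default_factory=list)


def session_failed(outcome: SessionOutcome) -> bool:
    """One triggered failure metric fails the session, whatever else went well."""
    return len(outcome.failures_triggered) > 0


outcome = SessionOutcome(
    success_met={
        "Determine whether the claim is valid": True,
        "Identify a return-to-work timeline": True,
    },
    failures_triggered=["Quotes a payout figure before eligibility is confirmed"],
)

# Every success metric is met, yet the session still counts as failed.
all_success = all(outcome.success_met.values())
failed = session_failed(outcome)
```

The check never looks at the success metrics, which is the behavior described above: a crossed line is not offset by good work elsewhere.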

Rubrics

Graded quality, not pass or fail. A rubric measures how well something was done on a scale, whereas metrics ask whether it was done at all. Each rubric has a three-level hierarchy:

Rubric: a named evaluation framework, such as “Claims Assessment Quality”
Criteria: the individual dimensions evaluated, such as “Information Gathering” or “Decision Quality”
Levels: the scoring levels for each criterion, each with a label such as “Exceptional” or “Proficient”, a score value, a description, and observable indicators that mark the level in a real session

Rubrics live at the organization level and link to scenarios, so one rubric can score many scenarios consistently. A scenario can have several rubrics attached, each covering a different dimension of the work.

Write observable indicators as things you could point to in a transcript. “Asked about home duties before recommending restrictions” is observable. “Showed good judgment” is not.
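The rubric hierarchy can be pictured as nested records. The sketch below is an illustration only: the class names, field names, score values, descriptions, and the second indicator are assumptions made for the example, not Tacit's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    label: str                   # e.g. "Exceptional" or "Proficient"
    score: int                   # score value for this level
    description: str
    indicators: tuple[str, ...]  # things you could point to in a transcript


@dataclass(frozen=True)
class Criterion:
    name: str
    levels: tuple[Level, ...]


@dataclass(frozen=True)
class Rubric:
    name: str
    criteria: tuple[Criterion, ...]


rubric = Rubric(
    name="Claims Assessment Quality",
    criteria=(
        Criterion(
            name="Information Gathering",
            levels=(
                Level(
                    label="Exceptional",
                    score=4,  # illustrative value
                    description="Illustrative: gathers the full picture before deciding",
                    indicators=("Asked about home duties before recommending restrictions",),
                ),
                Level(
                    label="Proficient",
                    score=3,  # illustrative value
                    description="Illustrative: gathers most of the relevant history",
                    indicators=("Asked about current work duties",),
                ),
            ),
        ),
    ),
)

# Walk the hierarchy: rubric -> criterion -> levels.
labels = [level.label for level in rubric.criteria[0].levels]
```

Each level carries its own indicators, so a reviewer can see what separates one level from the next.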

Scope boundaries

What this role is allowed to do. Scope boundaries draw three lines:

Boundary	Meaning	Example
Can do	Within the operator’s authority	Adjust a return-to-work date
Must refer or escalate	Allowed to recognize, not to resolve	A request for a lump-sum settlement
Cannot do	Outside the role entirely	Give a medical diagnosis

Staying inside scope is itself a measured skill. An operator who answers a question they should have escalated has made an error, even if the answer happened to be right.
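As a rough illustration of that rule, the sketch below maps the three example requests from the table to their boundaries and flags a scope error whenever the operator resolves something outside "can do". The `Boundary` enum, the `SCOPE` mapping, and `is_scope_error` are hypothetical names, not part of Tacit.

```python
from enum import Enum


class Boundary(Enum):
    CAN_DO = "can do"
    MUST_ESCALATE = "must refer or escalate"
    CANNOT_DO = "cannot do"


# Hypothetical mapping built from the example column of the table.
SCOPE = {
    "Adjust a return-to-work date": Boundary.CAN_DO,
    "A request for a lump-sum settlement": Boundary.MUST_ESCALATE,
    "Give a medical diagnosis": Boundary.CANNOT_DO,
}


def is_scope_error(request: str, operator_resolved_it: bool) -> bool:
    """Resolving anything outside 'can do' is an error, even if the answer was right."""
    return operator_resolved_it and SCOPE[request] is not Boundary.CAN_DO


answered_settlement = is_scope_error("A request for a lump-sum settlement", True)
escalated_settlement = is_scope_error("A request for a lump-sum settlement", False)
adjusted_date = is_scope_error("Adjust a return-to-work date", True)
```

The function takes no argument for whether the answer was correct, because correctness does not change the outcome.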

Terminology

The domain language the operator should use correctly. Terminology entries define the words and phrases that carry precise meaning in your domain, so the session can be checked for whether the operator used them properly. If “incapacity” and “impairment” mean different things in your domain, an operator who swaps them is making a real error, and terminology entries make that error visible.

Best practice sets

Curated patterns of competent behavior. A best practice set collects the moves your strongest people make: confirm understanding before moving on, check medication history before discussing treatment, summarize agreed actions at the close. Each practice can carry optional exemplars, real excerpts showing the pattern done well, so the standard is concrete rather than aspirational.

How criteria become results

Every session is scored against the criteria the scenario defines, and the report shows which criteria were met and why.

Custom scoring

Most scenarios never need this. By default, Tacit compiles the criteria above into a scoring function for you; this is Compiled scoring. When you want to control exactly how the pieces combine into a score, use the Compiled / Custom toggle on the scenario editor’s Scoring tab.

Switch a scenario to Custom to build the scoring function visually. You name the channels that make up the overall score, then compose each channel from parts: a weighted average of sub-scores, a safety gate that drops the score to zero when a condition fails, a single fact scored on a scale, and so on. You assemble every piece by clicking, so there is nothing to hand-write.

The Formula panel at the top of the tab renders the whole scoring function as one equation, one line per channel. It is a read-only view that you can copy from, and it stays in sync with the visual builder, which is where every edit happens.

Custom scoring is opt-in per scenario, and only organization admins can change it; every other scenario keeps Compiled scoring. Saving records a new immutable version of the scenario, so a previous scoring function is never overwritten. You can switch back to Compiled at any time, and your custom version stays in the scenario’s history.
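To make the arithmetic behind a channel concrete, here is a minimal Python sketch of two of the parts named above: a weighted average of sub-scores and a safety gate. The function names, the 0 to 1 scale, and the example weights are assumptions for illustration. In Tacit you build the scoring function in the visual builder, not in code.

```python
def weighted_average(parts: list[tuple[float, float]]) -> float:
    """parts holds (sub_score, weight) pairs; weights need not sum to 1."""
    total_weight = sum(weight for _, weight in parts)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(score * weight for score, weight in parts) / total_weight


def safety_gate(score: float, condition_holds: bool) -> float:
    """Drop the score to zero when the gating condition fails."""
    return score if condition_holds else 0.0


# Hypothetical channel: two sub-scores on a 0-1 scale, weighted 3 to 1.
quality = weighted_average([(0.75, 3.0), (0.25, 1.0)])  # (2.25 + 0.25) / 4 = 0.625

gated_ok = safety_gate(quality, condition_holds=True)       # stays 0.625
gated_failed = safety_gate(quality, condition_holds=False)  # becomes 0.0
```

The gate works like a failure metric: once its condition fails, the quality of the other sub-scores no longer matters.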

Next steps

Outputs

Results