Defining good
A scenario is only as useful as its definition of a good performance. Tacit gives you six families of criteria, each answering a different question about the session. You do not need all six on every scenario; pick the ones that match what you want to measure. All three scenario shapes use these criteria. Decision tasks lean on success and failure metrics around the judgment itself. Conversation scenarios and journey simulations tend to use the full range, because dialogue gives scope boundaries, terminology, and best practices more room to show up.
Success metrics
What the operator should accomplish. Each success metric names one concrete outcome the session should produce.
- “Determine whether the claim is valid”
- “Identify a return-to-work timeline”
- “Uncover the side work the claimant has been doing”
Failure metrics
Critical errors the operator must avoid. Failure metrics name the moves that sink a session regardless of everything else done well.
- “Approves return to work without reviewing the imaging”
- “Quotes a payout figure before eligibility is confirmed”
- “Dismisses a safety concern the persona raises”
Rubrics
Graded quality, not pass or fail. A rubric measures how well something was done on a scale, whereas metrics ask whether it was done at all. Each rubric has a three-level hierarchy:
- Rubric - a named evaluation framework, such as “Claims Assessment Quality”
- Criteria - the individual dimensions evaluated, such as “Information Gathering” or “Decision Quality”
- Levels - the scoring levels for each criterion, each with a label such as “Exceptional” or “Proficient”, a score value, a description, and observable indicators that mark the level in a real session
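The three-level hierarchy can be pictured as nested records. The sketch below is illustrative only and is not Tacit's actual schema: the class names, field names, score values, descriptions, and indicators are all assumptions chosen to mirror the wording above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Level:
    """One scoring level for a criterion (field names are hypothetical)."""
    label: str
    score: int
    description: str
    indicators: List[str] = field(default_factory=list)


@dataclass
class Criterion:
    """One dimension the rubric evaluates."""
    name: str
    levels: List[Level]


@dataclass
class Rubric:
    """A named evaluation framework made up of criteria."""
    name: str
    criteria: List[Criterion]

    def max_score(self) -> int:
        # Highest score reachable if every criterion hits its top level.
        return sum(max(lv.score for lv in c.levels) for c in self.criteria)


# Score values, descriptions, and indicators are placeholders, not from Tacit.
rubric = Rubric(
    name="Claims Assessment Quality",
    criteria=[
        Criterion(
            name="Information Gathering",
            levels=[
                Level("Exceptional", 4, "Placeholder description",
                      ["Placeholder indicator"]),
                Level("Proficient", 3, "Placeholder description"),
            ],
        ),
        Criterion(
            name="Decision Quality",
            levels=[
                Level("Exceptional", 4, "Placeholder description"),
                Level("Proficient", 3, "Placeholder description"),
            ],
        ),
    ],
)
```

Under these assumed scores, `rubric.max_score()` adds the top level of each criterion, which gives a sense of how a graded result differs from a pass-or-fail metric.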
Scope boundaries
What this role is allowed to do. Scope boundaries draw three lines:

| Boundary | Meaning | Example |
|---|---|---|
| Can do | Within the operator’s authority | Adjust a return-to-work date |
| Must refer or escalate | Allowed to recognize, not to resolve | A request for a lump-sum settlement |
| Cannot do | Outside the role entirely | Give a medical diagnosis |
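One way to think about the three lines is as a lookup from an action to a boundary. The following is a minimal sketch under stated assumptions: the `Boundary` enum, the `SCOPE` mapping, and the `classify` helper are hypothetical names for illustration, and only the three example actions come from the table above.

```python
from enum import Enum
from typing import Optional


class Boundary(Enum):
    """The three scope lines (enum name and values are hypothetical)."""
    CAN_DO = "can do"
    MUST_REFER = "must refer or escalate"
    CANNOT_DO = "cannot do"


# Example actions taken from the table; keys are stored lowercase.
SCOPE = {
    "adjust a return-to-work date": Boundary.CAN_DO,
    "a request for a lump-sum settlement": Boundary.MUST_REFER,
    "give a medical diagnosis": Boundary.CANNOT_DO,
}


def classify(action: str) -> Optional[Boundary]:
    """Return the boundary for a known action, or None if it is not listed."""
    return SCOPE.get(action.strip().lower())
```

An unlisted action returns `None` here rather than defaulting to any boundary, since the source does not say how unlisted actions are treated.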
Terminology
The domain language the operator should use correctly. Terminology entries define the words and phrases that carry precise meaning in your domain, so the session can be checked for whether the operator used them properly. If “incapacity” and “impairment” mean different things in your domain, an operator who swaps them is making a real error, and terminology entries make that error visible.
Best practice sets
Curated patterns of competent behavior. A best practice set collects the moves your strongest people make: confirm understanding before moving on, check medication history before discussing treatment, summarize agreed actions at the close. Each practice can carry optional exemplars, real excerpts showing the pattern done well, so the standard is concrete rather than aspirational.
How criteria become results
Every session is scored against the criteria the scenario defines, and the report shows which criteria were met and why.
Next steps
Outputs
The artifacts and decisions these criteria are applied to.
Results
How scored sessions are reported and compared.