No test data → No confidence claim → Mark as [UNTESTED]
Default Assumptions
Old behavior: Use defaults when actual value unknown
New behavior:
1. Can we derive actual value? → Derive it
2. Can we test which value is appropriate? → Test it
3. If neither → Mark as [UNKNOWN VALUE, using default X for Y reason]
Default is documented as DEFAULT, not as verified value
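As a concrete illustration, the derive → test → default chain can be sketched as a small helper. This is a minimal sketch under assumptions: the function name resolve_value, its parameters, and the exact marker wording are illustrative, not part of the procedure; only the [UNKNOWN VALUE, using default X for Y reason] form comes from the rule above.

```python
from typing import Callable, Optional, Tuple

def resolve_value(name: str,
                  derive: Optional[Callable[[], object]] = None,
                  test: Optional[Callable[[], object]] = None,
                  default: object = None,
                  default_reason: str = "unspecified") -> Tuple[object, str]:
    """Return (value, marker) following the derive -> test -> default chain."""
    if derive is not None:
        # 1. The actual value can be derived -> derive it.
        return derive(), f"[D: derived {name}]"
    if test is not None:
        # 2. The appropriate value can be tested -> test it.
        return test(), f"[T: tested {name}]"
    # 3. Neither -> keep the default, but label it as a default, never as verified.
    return default, f"[UNKNOWN VALUE, using default {default!r} for {default_reason}]"

# No derivation or test is available here, so the default stays labeled as a default.
value, marker = resolve_value("timeout_s", default=30,
                              default_reason="vendor-recommended starting point")
print(value, marker)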
Expert Fill In
Old behavior: Allow “expert can figure it out”
New behavior:
1. Have expert demonstrate gap-filling on test case
2. Document what expert did to fill gap
3. Only then mark as adequate
“Expert can probably figure it out” → [UNVERIFIED]
“Expert demonstrated filling gap by X” → [T: expert test]
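A minimal sketch of that distinction, assuming each gap is recorded as a dict with a "demonstration" field; the field name and marker wording are illustrative:

```python
def mark_gap(gap: dict) -> str:
    """Return a verification marker for an expert-filled gap."""
    demo = gap.get("demonstration")
    if demo:
        # The expert actually filled the gap on a test case and the steps were recorded.
        return f"[T: expert test - {demo}]"
    # "Expert can probably figure it out" is not evidence.
    return "[UNVERIFIED]"

print(mark_gap({"demonstration": "reconstructed the missing schema from sample payloads"}))
print(mark_gap({"note": "expert can probably figure it out"}))  # -> [UNVERIFIED]
```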
Verification
Every claim in output has verification marker
Every marker has documentation
No “flagged for review” without resolution
No “low confidence” without actual test data
Unknown items explicitly marked as unknown
Defaults explicitly marked as defaults
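The first four checks above can be read as machine-checkable invariants. A sketch, assuming each claim is a dict with marker, documentation, status, resolution, confidence, and test_data fields (all field names are assumptions):

```python
def check_output(claims: list) -> list:
    """Return a list of violations; an empty list means the checklist passes."""
    violations = []
    for claim in claims:
        label = claim.get("text", "<unnamed claim>")
        if not claim.get("marker"):
            violations.append(f"{label}: no verification marker")
        elif not claim.get("documentation"):
            violations.append(f"{label}: marker without documentation")
        if claim.get("status") == "flagged for review" and not claim.get("resolution"):
            violations.append(f"{label}: flagged for review without resolution")
        if claim.get("confidence") == "low" and not claim.get("test_data"):
            violations.append(f"{label}: low confidence without test data")
    return violations

print(check_output([{"text": "retries are capped at 3", "marker": "[T]",
                     "documentation": "output of the retry load test"}]))  # -> []
```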
Output Format
Every output using this procedure has:
1. VERIFIED CLAIMS section
   - Each claim with [O], [T], or [D] marker
   - Each marker with documentation
2. UNKNOWN section (if any)
   - Honest acknowledgment of what we don't know
   - No pretense of knowledge
3. DEFAULTS section (if any)
   - Defaults used with explicit justification
   - Marked clearly as defaults, not verified values
4. EXCLUDED section (optional, for transparency)
   - What was excluded due to unverifiability
   - Why it couldn't be verified
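The four sections map naturally onto a simple skeleton. The keys and field names below are assumptions chosen to mirror the list above, not a prescribed schema:

```python
output_skeleton = {
    "VERIFIED CLAIMS": [
        # each entry carries its marker ([O], [T], or [D]) and the documentation behind it
        {"claim": "example claim", "marker": "[T]", "documentation": "how it was tested"},
    ],
    "UNKNOWN": [
        # honest acknowledgment of what we don't know; no pretense of knowledge
        {"item": "example unknown", "why_unknown": "no way to derive or test it yet"},
    ],
    "DEFAULTS": [
        # defaults are marked clearly as defaults, never as verified values
        {"item": "example setting", "default": "example value", "justification": "why this default"},
    ],
    "EXCLUDED": [
        # optional, for transparency: what was dropped and why it couldn't be verified
        {"item": "example exclusion", "reason_unverifiable": "why verification was impossible"},
    ],
}
```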
Step 7: Specificity Check for Capability Claims
For any claim about what a system DOES or SHOULD DO:
Apply the specificity gate:
Does it specify TRIGGER? (what causes this to happen)
Does it specify PROCEDURE? (what exact steps occur)
Does it specify OUTPUT? (what concrete result is produced)
Does it specify VALIDATION? (how do we know it worked)
If any element is missing:
BLOCK the claim
Generate questions for missing elements
Only include the claim when all 4 elements are specified
→ INVOKE: /specificity_gate [capability claim]
Output: Capability claims with all 4 elements specified
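A sketch of the gate itself, assuming a claim is passed as a dict with one key per element; the function name, key names, and question wording are illustrative and stand in for whatever /specificity_gate actually does:

```python
REQUIRED_ELEMENTS = ("trigger", "procedure", "output", "validation")

def specificity_gate(claim: dict):
    """Return (passed, questions); block the claim unless all 4 elements are specified."""
    missing = [element for element in REQUIRED_ELEMENTS if not claim.get(element)]
    if missing:
        # Claim is blocked; generate one question per missing element.
        questions = [f"What is the {element.upper()} for this claim?" for element in missing]
        return False, questions
    return True, []

passed, questions = specificity_gate({
    "claim": "the importer deduplicates records",
    "trigger": "a batch upload completes",
    "procedure": "hash each record and drop matches against the existing index",
    "output": "a deduplicated batch plus a rejection report",
    # "validation" is missing, so the gate blocks the claim
})
print(passed, questions)  # -> False ['What is the VALIDATION for this claim?']
```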