Improve Skill

Input: $ARGUMENTS

Core Principles

Read before diagnosing. Never improve a skill you haven’t read in full. The most common failure is proposing generic improvements to a skill whose actual content you don’t understand. Read every line before generating a single critique.
The quality standard is objective, not aesthetic. A skill is good or bad relative to specific structural requirements: core principles, failure modes table, depth scaling table, pre-completion checklist, integration section. Missing elements are defects, not style choices.
Content quality is domain-specific. Generic principles (“be thorough”, “consider alternatives”) indicate a skill that doesn’t understand its own domain. Good principles name specific failure modes and specific interventions that only apply to THIS skill’s problem space.
Improvement means closing gaps, not rewriting from scratch. If a skill has strong content but missing structure, add the structure. If it has good structure but generic content, sharpen the content. Identify what’s weak and fix THAT — don’t destroy what works.
The exemplar skills define the floor. Skills like /foht (300 lines), /sbfow (279 lines), /iterate (373 lines), /araw (410 lines) represent the quality standard. Any improved skill must be comparable in depth, specificity, and structural completeness.

Phase 1: Read and Baseline

Read the target skill completely.

[I1] SKILL: /[name]
[I2] CURRENT_LINES: [line count]
[I3] CURRENT_STRUCTURE: [list which standard elements exist and which are missing]

Structural Audit

Check for each required element:

Element	Present?	Quality
Frontmatter (name, description)
Core Principles (3-6, domain-specific, non-generic)
Multi-phase structure with lettered findings
Failure Modes table (Failure / Signal / Fix)
Depth Scaling table (1x/2x/4x/8x with floors)
Pre-Completion Checklist (6+ binary items)
Integration section (use from, routes to, differs from)
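
The presence half of this audit can be sketched in code. The following is a minimal sketch, assuming the target skill is plain or markdown text whose section headings match the element names above; the REQUIRED_ELEMENTS patterns and the structural_audit helper are hypothetical, not part of any existing tooling. It only fills the "Present?" column. The "Quality" column still requires reading the skill.

```python
import re

# Hypothetical patterns: each one is an assumption about how a heading
# for that element is written in the target skill file.
REQUIRED_ELEMENTS = {
    "Frontmatter": r"^name:|^description:",
    "Core Principles": r"^#*\s*Core Principles",
    "Multi-phase structure": r"^#*\s*Phase \d+",
    "Failure Modes table": r"^#*\s*Failure Modes",
    "Depth Scaling table": r"^#*\s*Depth Scaling",
    "Pre-Completion Checklist": r"^#*\s*Pre-Completion Checklist",
    "Integration section": r"^#*\s*Integration",
}


def structural_audit(skill_text: str) -> dict:
    """Return {element name: True/False} for each required element."""
    flags = re.MULTILINE | re.IGNORECASE
    return {
        name: re.search(pattern, skill_text, flags) is not None
        for name, pattern in REQUIRED_ELEMENTS.items()
    }


# Illustrative input only: a stub skill with three of the seven elements.
sample = """name: demo
description: a demo skill

# Core Principles
...
# Failure Modes
...
"""

audit = structural_audit(sample)
missing = [name for name, present in audit.items() if not present]
print(missing)
```

Each name in missing would become a critical structural fix in Phase 3.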

Phase 2: Content Diagnosis

Evaluate content quality beyond structure.

[I4] PRINCIPLES_QUALITY: [are principles domain-specific or generic platitudes?]
     GENERIC_PRINCIPLES: [list any that could apply to ANY skill — these need replacement]
     MISSING_INSIGHTS: [what domain-specific knowledge is absent?]

[I5] PHASE_QUALITY: [do phases have concrete steps or vague instructions?]
     VAGUE_PHASES: [list any that say "analyze" or "consider" without specifying HOW]
     MISSING_PHASES: [what steps are implied but not explicit?]

[I6] FAILURE_MODES_QUALITY: [are failure modes specific to this skill's domain?]
     GENERIC_FAILURES: [list any that apply to everything — "didn't think hard enough"]
     MISSING_FAILURES: [what domain-specific failures are unaddressed?]

[I7] OUTPUT_QUALITY: [is the output format specific and useful?]
     MISSING_OUTPUT: [what should the output include that it doesn't?]

Content Quality Tests

Test	Pass Criteria
Disagreement test	Could a thoughtful person disagree with the principle? Apply this to each one; a principle nobody could dispute is too generic
Domain test	With the skill name removed, could a reader still tell which skill the principles belong to? If not, they are too generic
Specificity test	Do phases say WHAT to do, not just “analyze this”?
Failure test	Are failure modes things that actually go wrong with THIS type of task?
Actionability test	Could someone follow this skill and produce output without improvising?

Phase 3: Improvement Plan

Generate specific fixes, not vague suggestions.

[I-N] FIX: [specific change]
     TYPE: [structural | content | integration | clarity]
     PRIORITY: [critical | important | nice-to-have]
     CURRENT: [what exists now — quote or describe]
     PROPOSED: [what should replace it — be specific]
     RATIONALE: [why this improves the skill]

Fix Priority Rules

Critical: Missing required structural element (no failure modes table, no depth scaling)
Critical: Generic principles that could apply to any skill
Important: Vague phases that don’t specify concrete steps
Important: Missing domain-specific failure modes
Nice-to-have: Formatting improvements, better examples, additional integration links
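
The fix template and priority rules above can be modeled as a small record type plus a lookup. This is only a sketch: the Fix dataclass mirrors the template fields, while the issue-kind labels passed to classify are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Lower number = applied first.
PRIORITY_ORDER = {"critical": 0, "important": 1, "nice-to-have": 2}

# Hypothetical issue-kind labels mapped to the priorities listed above.
PRIORITY_RULES = {
    "missing_structural_element": "critical",
    "generic_principle": "critical",
    "vague_phase": "important",
    "missing_domain_failure_mode": "important",
}


@dataclass
class Fix:
    fix: str        # the specific change
    type: str       # structural | content | integration | clarity
    priority: str   # critical | important | nice-to-have
    current: str    # what exists now
    proposed: str   # what should replace it
    rationale: str  # why this improves the skill


def classify(issue_kind: str) -> str:
    """Anything not named in the rules falls back to nice-to-have."""
    return PRIORITY_RULES.get(issue_kind, "nice-to-have")


def ordered(fixes: list) -> list:
    """Sort fixes so critical ones come first (the sort is stable)."""
    return sorted(fixes, key=lambda f: PRIORITY_ORDER[f.priority])


# Illustrative fixes only.
fixes = [
    Fix("Tidy table spacing", "clarity", classify("formatting"),
        "ragged columns", "aligned columns", "easier to scan"),
    Fix("Add Failure Modes table", "structural",
        classify("missing_structural_element"),
        "no table", "Failure / Signal / Fix table", "required element"),
]
for f in ordered(fixes):
    print(f.priority, "-", f.fix)
```

Requiring current, proposed, and rationale as fields means a fix cannot be recorded without them, which is the point of the template.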

Phase 4: Apply Fixes

Write the improved skill. Apply all critical and important fixes. Apply nice-to-haves only if they don’t bloat the skill.

Application Rules

Preserve existing content that’s already good — don’t rewrite what works
Add missing structural elements in their standard positions
Replace generic principles with domain-specific ones
Ensure the improved skill is 150-300 lines (the quality range)
Verify every phase has concrete, followable steps

Phase 5: Verify

Re-run the structural audit and content diagnosis on the improved version.

[I-N] VERIFICATION:
     STRUCTURAL_GAPS_REMAINING: [should be zero]
     GENERIC_CONTENT_REMAINING: [should be zero]
     LINE_COUNT: [should be 150-300]
     ALL_TESTS_PASS: [disagreement, domain, specificity, failure, actionability]
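
The verification record above reduces to a handful of pass/fail conditions. The sketch below is one possible encoding, assuming the counts and test results have already been gathered by hand; the Verification class and its problems method are hypothetical names.

```python
from dataclasses import dataclass, field

REQUIRED_TESTS = ("disagreement", "domain", "specificity", "failure", "actionability")
LINE_RANGE = (150, 300)  # the quality range given in the Application Rules


@dataclass
class Verification:
    structural_gaps_remaining: int
    generic_content_remaining: int
    line_count: int
    tests_passed: dict = field(default_factory=dict)

    def problems(self) -> list:
        """Return a list of unmet conditions; an empty list means verified."""
        found = []
        if self.structural_gaps_remaining != 0:
            found.append(f"{self.structural_gaps_remaining} structural gaps remain")
        if self.generic_content_remaining != 0:
            found.append(f"{self.generic_content_remaining} generic items remain")
        low, high = LINE_RANGE
        if not low <= self.line_count <= high:
            found.append(f"line count {self.line_count} outside {low}-{high}")
        for name in REQUIRED_TESTS:
            # A test that was never run counts as a failure.
            if not self.tests_passed.get(name, False):
                found.append(f"test failed or not run: {name}")
        return found


# Illustrative values only: everything passes except the length.
report = Verification(0, 0, 320, {name: True for name in REQUIRED_TESTS})
print(report.problems())  # → ['line count 320 outside 150-300']
```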

Failure Modes

Failure	Signal	Fix
Full rewrite of good content	Existing strong sections get replaced with generic versions	Read carefully. Preserve what works. Only fix what’s broken
Generic improvements	“Made the principles more thorough” without specifics	Every fix must be a specific change with current/proposed
Structure without content	Added all tables and sections but they’re empty or generic	Structure is necessary but not sufficient. Content must be domain-specific
Bloat	Skill went from 20 lines to 400 lines of filler	Quality range is 150-300 lines. Density matters more than length
Lost voice	Original skill had a distinctive approach that was smoothed away	Identify what’s distinctive about the skill and preserve it
Improvement without diagnosis	Jumped to rewriting without identifying what’s actually wrong	Always diagnose first. The fix must match the actual problem
Ceremonial checklist	Pre-completion checklist items that are always true or unverifiable	Each item must be binary and falsifiable

Depth Scaling

Depth	Diagnosis	Fixes	Verification
1x	Structural audit only	Critical fixes only	Quick check
2x	Structural + content diagnosis	Critical + important	Full re-audit
4x	Full diagnosis + comparison to exemplars	All fixes + domain research	Full + exemplar comparison
8x	Full + user testing simulation	All + iterative refinement	Full + adversarial review

Default: 2x. Each row is a floor: at that depth, do at least what the row lists.

Pre-Completion Checklist

Target skill read in full before any diagnosis
Structural audit completed with all elements checked
Content diagnosis identifies specific (not generic) issues
Every fix has current/proposed/rationale
Critical fixes all applied
Improved skill has all required structural elements
Improved skill principles pass the domain test
Line count is in the 150-300 range
Verification confirms zero structural gaps

Integration

Use from: skill maintenance, quality audits, skill expansion tasks
Routes to: /w (if skill needs actual prose improvement), /cs (if skill needs structural rework)
Complementary: /impss (improves multiple skills), /imprt (auto-identifies what to improve)
Differs from /cs: cs creates new skills; imps improves existing ones
Differs from /impss: impss handles batches; imps handles one skill deeply
Differs from /fmtsb: fmtsb formalizes drafts into production; imps improves already-production skills

imps - Improve Skill