Mirror of https://github.com/woink/harness-skills-plugin.git, synced 2026-04-30 09:10:43 -07:00

Table of Contents

Trigger Phrases
Phase Flow
config.yaml Keys
.harness/ Files
run-harness.sh Flags
Cost Reference

id	title	sidebar_label	last_generated
cheatsheet	Cheatsheet	⚡ Cheatsheet	2026-04-14

Trigger Phrases

Skill	Trigger Phrases
`harness-orchestrator`	"start a new project", "spin up the harness", "bootstrap a project", "run the harness", "enhance this project", "refactor this codebase", "fix up this app", user provides a full-project prompt
`harness-orchestrator` (resume)	"resume the harness", "continue the build"
`harness-auditor`	called by orchestrator (existing projects); or "re-run the auditor"
`harness-planner`	called by orchestrator; or "re-run the planner", "regenerate a spec"
`harness-builder`	called by orchestrator; or "re-run QA on the current sprint"
`harness-evaluator`	called by orchestrator; or "re-run QA on an existing build"
`harness-brief`	standalone brief generation

Phase Flow

GREENFIELD                              EXISTING PROJECT
─────────────────────────────────────   ──────────────────────────────
Phase 0  Elicit brief                   Phase 0  Elicit brief
           → PROJECT_BRIEF.md                      → PROJECT_BRIEF.md
Phase 1  Init .harness/                 Phase 1A Auditor
           → config.yaml                           → CODEBASE_AUDIT.md
                                                   [GATE: review audit]
Phase 2  Planner                        Phase 2  Planner (delta mode)
           → PRODUCT_SPEC.md                       → PRODUCT_SPEC.md
           [GATE: approve spec]                    [GATE: approve spec]
                ↓ spec approved: touch .graduate, autopilot.enabled: true
Phase 3  Builder (per sprint)           Phase 3  Builder (Sprint 0: housekeeping)
           SPRINT_CONTRACT.md                      then feature sprints
           → code + evaluator                      → code + evaluator
           → QA_REPORT.md                          → QA_REPORT.md
           (fail? retry, max 2)                    (fail? retry, max 2)
Phase 4  Polish (up to 15 iterations)
           Builder refinements → Evaluator → POLISH_REPORT.md
           → append POLISH_HISTORY.md
           (plateau? pivot conditions)
           (all ≥ thresholds → Phase 5)
Phase 5  Final QA Gate → QA_REPORT.md
Phase 6  Completion → HANDOFF.md
           [GATE: user accepts or requests changes]
           (runner mode: write COMPLETE, exit)
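
The Phase 3 pass/retry rule in the flow above can be sketched in Python. This is a minimal illustration, not the plugin's implementation: `build` and `evaluate` are hypothetical callables standing in for the builder and evaluator, and reading "retry, max 2" as one initial attempt plus up to two retries is an assumption. The pass floors are the `qa_pass_threshold.*` defaults.

```python
from typing import Callable, Dict, Optional

# Default pass floors (qa_pass_threshold.* in config.yaml).
DEFAULT_THRESHOLDS = {
    "functionality": 7,
    "design_quality": 6,
    "code_quality": 6,
    "product_depth": 6,
}


def sprint_passes(scores: Dict[str, float], thresholds: Dict[str, int]) -> bool:
    """Return True when every criterion meets or exceeds its floor."""
    return all(scores.get(name, 0) >= floor for name, floor in thresholds.items())


def run_sprint(
    build: Callable[[int], None],              # hypothetical: runs the builder
    evaluate: Callable[[], Dict[str, float]],  # hypothetical: returns QA scores
    thresholds: Optional[Dict[str, int]] = None,
    max_sprint_retries: int = 2,
) -> bool:
    """Assumed reading: one initial attempt plus up to max_sprint_retries retries."""
    floors = thresholds if thresholds is not None else DEFAULT_THRESHOLDS
    for attempt in range(1 + max_sprint_retries):
        build(attempt)
        if sprint_passes(evaluate(), floors):
            return True
    return False
```

A sprint that fails once and then passes would call `build` twice; a sprint that never passes would stop after the third attempt.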

config.yaml Keys

Key	Type	Default	Description
`harness_version`	int	`1`	Schema version
`project_mode`	enum	—	`greenfield` | `enhance` | `refactor` | `rescue`
`eval_mode`	enum	—	`playwright` | `automated-tests` | `manual-gates` | `hybrid`
`stack`	string	`pending`	Filled by planner or auditor
`max_sprint_retries`	int	`2`	Builder/evaluator loops per failed sprint
`qa_pass_threshold.functionality`	int	`7`	Sprint pass floor
`qa_pass_threshold.design_quality`	int	`6`	Sprint pass floor
`qa_pass_threshold.code_quality`	int	`6`	Sprint pass floor
`qa_pass_threshold.product_depth`	int	`6`	Sprint pass floor
`unattended`	bool	`false`	Set by `run-harness.sh`; skips all manual gates
`manual_gates.after_audit`	bool	`true`	Pause for user after audit (existing projects)
`manual_gates.after_spec`	bool	`true`	Pause for user after spec
`manual_gates.after_each_sprint`	bool	`false`	Pause for user after each sprint
`manual_gates.after_final_qa`	bool	`true`	Pause for user after final QA
`polish.enabled`	bool	`true`	Enable iterative polish phase
`polish.max_iterations`	int	`15`	Maximum polish iterations
`polish.plateau_window`	int	`3`	Iterations to average for plateau detection
`polish.plateau_threshold`	float	`0.5`	Avg improvement/iter below this = plateau
`polish.pivot_after_stagnant`	int	`3`	Stagnant iterations before pivot eligible
`polish.context`	enum	`pending`	`frontend` | `backend` | `fullstack`; set by orchestrator
`polish.base_thresholds.*`	int	same as `qa_pass_threshold.*`	Starting thresholds for polish
`polish.threshold_step`	int	`1`	Threshold increase per interval
`polish.threshold_interval`	int	`4`	Iterations between threshold raises
`polish.threshold_overrides`	map	`{}`	Per-criterion step override, e.g. `{design_quality: 2}`
`autopilot.enabled`	bool	`false`	Set when `.graduate` is written; skips gates
`autopilot.design_question_timeout`	int	`1800`	Seconds before auto-default is applied (30 min)
`autopilot.interactive_step_limit`	int	`4`	Major steps before checkpointing
`autopilot.graduation_trigger`	string	`spec_approved`	Event that writes `.graduate`
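
The `polish.*` keys above interact, so a small Python sketch may help. The exact schedule is an assumption: here a threshold rises by `threshold_step` after every `threshold_interval` completed iterations, scores are assumed to be on a 10-point scale (hence the cap), and a plateau means the average gain per iteration over the last `plateau_window` iterations falls below `plateau_threshold`. The function names are illustrative, not taken from the plugin.

```python
from typing import Dict, List, Optional


def polish_thresholds(
    base_thresholds: Dict[str, int],
    iteration: int,                        # 1-based polish iteration
    step: int = 1,                         # polish.threshold_step
    interval: int = 4,                     # polish.threshold_interval
    overrides: Optional[Dict[str, int]] = None,  # polish.threshold_overrides
    cap: int = 10,                         # assumed 10-point scoring scale
) -> Dict[str, int]:
    """Thresholds in effect at a given polish iteration (assumed schedule)."""
    overrides = overrides or {}
    raises = (iteration - 1) // interval   # raises applied so far
    return {
        name: min(cap, base + overrides.get(name, step) * raises)
        for name, base in base_thresholds.items()
    }


def is_plateau(
    avg_scores: List[float],               # one average score per iteration
    window: int = 3,                       # polish.plateau_window
    threshold: float = 0.5,                # polish.plateau_threshold
) -> bool:
    """True when the mean gain per iteration over the window is below threshold."""
    if len(avg_scores) < window + 1:
        return False                       # not enough history yet
    recent = avg_scores[-(window + 1):]
    avg_gain = (recent[-1] - recent[0]) / window
    return avg_gain < threshold
```

Under these assumptions, with the default step and interval, a base threshold of 7 stays at 7 for iterations 1 to 4, becomes 8 for iterations 5 to 8, and so on until the cap.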

.harness/ Files

File	Writer	Reader	Purpose
`config.yaml`	Orchestrator + Planner + Auditor	All	Settings, thresholds, mode
`PROJECT_BRIEF.md`	Orchestrator	Auditor, Planner	User intent, constraints
`HANDOFF.md`	Orchestrator	Orchestrator (resume)	State: last done, next steps, polish state
`PROGRESS.md`	Orchestrator (append)	Orchestrator (reconcile)	`start`/`done` trail for interrupted-step detection
`CODEBASE_AUDIT.md`	Auditor	Planner, Builder, Evaluator	Existing codebase analysis
`PRODUCT_SPEC.md`	Planner	Builder, Evaluator	What to build
`CALIBRATION.md`	Planner / user	Evaluator	Project-specific score anchors
`SPRINT_CONTRACT.md`	Builder	Evaluator	This sprint's criteria and builder notes
`QA_REPORT.md`	Evaluator	Builder (retry), Orchestrator	Sprint scores and feedback
`POLISH_REPORT.md`	Evaluator (polish)	Builder (polish), Orchestrator	Sub-criteria scores, pivot suggestion
`POLISH_HISTORY.md`	Orchestrator (append)	Evaluator (drift)	Score history across polish iterations
`DESIGN_QUESTION.md`	Orchestrator (autopilot)	User, runner	Product/taste decision with options + default
`DESIGN_ANSWER.md`	User / runner (auto)	Orchestrator	Answer to design question
`COMPLETE`	Orchestrator	`run-harness.sh`	Signals successful completion
`STOP`	User	`run-harness.sh`	Stop runner after current session
`.runner`	`run-harness.sh`	Orchestrator	Runner mode marker
`runner.log`	Runner + orchestrator	Developer	Session activity log
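
The `COMPLETE` and `STOP` marker files in the table act as signals to the runner. A minimal Python sketch of how a wrapper might read them follows; the function name, the returned labels, and the choice to check `COMPLETE` before `STOP` are assumptions, not behavior confirmed by `run-harness.sh`.

```python
from pathlib import Path


def runner_state(harness_dir: str = ".harness") -> str:
    """Classify the harness directory by its marker files (illustrative only)."""
    root = Path(harness_dir)
    if (root / "COMPLETE").exists():
        return "complete"   # orchestrator signalled successful completion
    if (root / "STOP").exists():
        return "stop"       # user asked the runner to stop after this session
    return "continue"       # no marker present; another session may run
```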

run-harness.sh Flags

Flag	Default	Description
`--max-iterations N`	`30`	Max sessions before stopping
`--verbose`, `-v`	off	Stream Claude output to terminal
`--watch`, `-w`	—	Colorized live tail of runner.log
`--help`, `-h`	—	Show help

Exit codes: 0 = complete, 1 = max iterations hit, 2 = stopped by user, 3 = claude not found

Stop: `touch .harness/STOP` (the runner stops after the current session) or press Ctrl+C

Resume: re-run the same command; it picks up from `HANDOFF.md`
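
For scripting around the runner, the exit codes above can be mapped to messages. The Python sketch below is a hedged example: the script path `./run-harness.sh` and the helper names are assumptions, and only the `--max-iterations` and `--verbose` flags from the table are used.

```python
import subprocess
from typing import List

# Exit codes as listed in this cheatsheet.
EXIT_MEANINGS = {
    0: "complete",
    1: "max iterations hit",
    2: "stopped by user",
    3: "claude not found",
}


def describe_exit(code: int) -> str:
    """Translate a runner exit code into its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def build_command(
    max_iterations: int = 30,
    verbose: bool = False,
    script: str = "./run-harness.sh",  # assumed location of the runner script
) -> List[str]:
    """Assemble the runner command line from documented flags."""
    cmd = [script, "--max-iterations", str(max_iterations)]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_harness(**kwargs) -> str:
    """Run the script and report how it ended."""
    result = subprocess.run(build_command(**kwargs))
    return describe_exit(result.returncode)
```

Calling `run_harness(max_iterations=10)` would start the runner and return one of the descriptions above once it exits.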

Cost Reference

Phase	Cost	Duration
Auditor	$1–5	5–15 min
Planner	$0.50–2	3–10 min
Builder (per sprint)	$20–70	30–120 min
Evaluator (per sprint)	$3–5	5–10 min
Polish (per iteration)	$10–25	20–45 min
5-sprint app, 1 retry	$100–200	8–16 h