ECL Harness Engineer
Design and create Harness Engineering infrastructure so AI agents can work reliably in a codebase.
Core Philosophy: "Intelligence without infrastructure is just a demo." The Agent Harness is the Operating System — the LLM is just the CPU. The repository becomes the single source of truth — if an agent can't see it in context, it doesn't exist.
When to Use This Skill
- Use when a repository needs AI-agent collaboration infrastructure such as AGENTS.md, docs/ECL.md, docs/STATUS.md, harness change tracking, or mechanical validation gates.
- Use when auditing an existing Agent Harness for missing ECL lifecycle docs, change templates, lint checks, environment contracts, or CI integration.
- Use when converting repeated agent workflow failures into repository-local documentation, tests, lint rules, or lightweight auto-evolution checks.
- Do not use for ordinary business feature implementation unless the requested work is specifically about creating or improving the repository harness.
Limitations
- This skill creates or audits harness infrastructure; it does not replace product requirements, implementation planning, code review, or release approval for the target project.
- The generated ECL docs, linters, scripts, and CI examples must be adapted to the repository's actual stack, security model, and existing contributor workflow before enforcement.
- Auto-evolve recommendations are guidance only. Apply harness changes through normal review, validation, and rollback discipline instead of accepting them as autonomous policy changes.
Unified Workflow
This skill follows a single unified workflow regardless of project state (empty, existing code, or existing harness). The core idea: detect the gap between current state and target state, then fill it.
Default to a core ECL harness. Core includes lightweight auto-evolve threshold checking: closed changes are counted, a pending evolution note is generated when the threshold is reached, and Codex applies harness improvements only through evidence, validation, scoring, and rollback. Advanced agent-platform capabilities such as eval datasets, execution traces, durable state, checkpoints, long-term memory, and metrics remain optional profiles only when the user explicitly asks for agent evaluation, observability, resumable execution, or long-term memory.
This skill improves the target repository's agent harness. It does not implement ordinary business features, replace the coding agent's plan mode, or create a separate requirements product. Plan mode is useful for live discussion; ECL artifacts are the repository record that later agents, linters, CI, and archive history can inspect.
- Quick Detection + Intent Confirmation — what exists, what already passes, and what the user wants.
- Analysis — architecture, harness state, environment, and project identity.
- Intake Review + Delta Synthesis — classify small vs structured work, support requirement-first and plan-first inputs, and compute exactly what to create or update.
- Creation/Update — docs, status handoff, linters, ECL/change scripts, environment config, and CI.
- Verification + Handoff — run checks, attribute failures, update STATUS.md, trigger auto-evolve checks, and summarize results.
Phase 1: Quick Detection + Intent Confirmation
Goal: In under 5 minutes, understand project state and user intent.
1.1 Project State Detection
Run this quick scan:
# Count files
file_count=$(find . -type f ! -path './.git/*' ! -path './node_modules/*' ! -path './vendor/*' 2>/dev/null | wc -l)
code_files=$(find . -type f \( -name "*.go" -o -name "*.ts" -o -name "*.js" -o -name "*.py" -o -name "*.rs" \) ! -path './.git/*' ! -path './node_modules/*' ! -path './vendor/*' 2>/dev/null | wc -l)
# Check harness components
has_agents_md=$(test -f AGENTS.md && echo "yes" || echo "no")
has_architecture=$(test -f docs/ARCHITECTURE.md && echo "yes" || echo "no")
has_linters=$(ls scripts/lint-* 2>/dev/null | wc -l)
has_harness_dir=$(test -d harness && echo "yes" || echo "no")
has_ecl_doc=$(test -f docs/ECL.md && echo "yes" || echo "no")
has_changes_dir=$(test -d harness/changes && echo "yes" || echo "no")
has_change_templates=$(test -d harness/templates/change && echo "yes" || echo "no")
has_change_script=$(ls scripts/harness-change.* 2>/dev/null | wc -l)
has_evolve_script=$(ls scripts/harness-evolve.* 2>/dev/null | wc -l)
has_ecl_lint=$(ls scripts/lint-ecl.* 2>/dev/null | wc -l)
has_encoding_lint=$(ls scripts/lint-encoding.* 2>/dev/null | wc -l)
has_makefile=$(test -f Makefile && echo "yes" || echo "no")
has_package_json=$(test -f package.json && echo "yes" || echo "no")
# Detect tech stack
if test -f go.mod; then TECH="Go"
elif test -f package.json; then TECH="TypeScript/Node.js"
elif test -f requirements.txt || test -f pyproject.toml; then TECH="Python"
else TECH="Unknown"
fi
1.2 Classify Project State
Based on detection:
| State | Criteria | Action |
|---|---|---|
| Empty | file_count < 5 AND code_files = 0 | Guide user through project choices first |
| Code Only | code_files > 0 AND has_agents_md = "no" | Full analysis + core harness creation |
| Partial Harness | has_agents_md = "yes" AND (has_linters = 0 OR has_harness_dir = "no") | Gap analysis + fill gaps |
| Harness Present | Core harness components exist | Audit + improvement suggestions |
Also classify ECL readiness:
| ECL State | Criteria | Action |
|---|---|---|
| ECL Missing | has_ecl_doc = "no" OR has_changes_dir = "no" | Create ECL docs, change templates, and scripts |
| ECL Partial | ECL doc exists but scripts/templates missing | Fill ECL automation gaps |
| ECL Ready | docs/ECL.md, harness/changes, templates, harness-change, harness-evolve, lint-ecl, lint-encoding exist | Audit index freshness and workflow quality |
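The two tables above can be sketched as small shell functions that take the Phase 1.1 detection values as arguments. This is a minimal sketch, not the skill's actual implementation: the function names, the "Unclassified" fallback, and the idea of passing a count of found automation pieces (templates, harness-change, harness-evolve, lint-ecl, lint-encoding) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that map Phase 1.1 detection values onto the tables above.

classify_project_state() {
  # Args: file_count code_files has_agents_md has_linters has_harness_dir
  local file_count="$1" code_files="$2" has_agents_md="$3" has_linters="$4" has_harness_dir="$5"
  if [ "$file_count" -lt 5 ] && [ "$code_files" -eq 0 ]; then
    echo "Empty"
  elif [ "$code_files" -gt 0 ] && [ "$has_agents_md" = "no" ]; then
    echo "Code Only"
  elif [ "$has_agents_md" = "yes" ] && { [ "$has_linters" -eq 0 ] || [ "$has_harness_dir" = "no" ]; }; then
    echo "Partial Harness"
  elif [ "$has_agents_md" = "yes" ]; then
    echo "Harness Present"
  else
    # Assumption: anything the table does not cover is flagged for manual review.
    echo "Unclassified"
  fi
}

classify_ecl_state() {
  # Args: has_ecl_doc has_changes_dir automation_found
  # automation_found = how many of the five automation pieces exist (assumed input).
  local has_ecl_doc="$1" has_changes_dir="$2" automation_found="$3"
  if [ "$has_ecl_doc" = "no" ] || [ "$has_changes_dir" = "no" ]; then
    echo "ECL Missing"
  elif [ "$automation_found" -lt 5 ]; then
    echo "ECL Partial"
  else
    echo "ECL Ready"
  fi
}

classify_project_state 120 40 yes 0 yes
classify_ecl_state yes yes 3
```

With the sample arguments, the first call reports a partial harness (AGENTS.md exists but no linters were found) and the second reports partial ECL automation.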
1.3 Baseline Verification Snapshot
For existing projects, capture a best-effort baseline before creating or updating harness files. The baseline is for attribution only: it distinguishes pre-existing project failures from failures introduced by harness work. It must not be used to weaken default CI.
Run only commands that already exist in the project:
| Ecosystem | Baseline commands |
|---|---|
| TypeScript/Node.js | package scripts such as lint, typecheck, test, build; include nested package build scripts when detected |
| Go | go test ./..., go build ./..., existing make lint or make test |
| Python | existing test/lint scripts, python -m compileall . |
Record each command as pass, fail, or missing, with the short failure reason. If a command
fails before harness creation, report it later as pre-existing project debt, not as harness
failure. Default CI remains strict and should still include normal business gates unless the user
explicitly asks for a temporary staged rollout.
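One way to capture the baseline is to wrap each existing command and append its status to a small log. The sketch below assumes a tab-separated file and a hypothetical record_baseline helper; neither is prescribed by this skill. It only detects a missing executable, so a missing package script (for example an absent npm "lint" entry) would still need a stack-specific check.

```shell
#!/usr/bin/env bash
# Hypothetical baseline recorder: label, status (pass/fail/missing), short reason.
# The TSV layout and file location are assumptions for illustration.

BASELINE_FILE="${BASELINE_FILE:-$(mktemp)}"

record_baseline() {
  # Usage: record_baseline <label> <command> [args...]
  local label="$1"; shift
  local status reason="" output
  if ! command -v "$1" >/dev/null 2>&1; then
    status="missing"
    reason="command not found: $1"
  elif output=$("$@" 2>&1); then
    status="pass"
  else
    status="fail"
    # Keep only the last output line as the short failure reason.
    reason=$(printf '%s\n' "$output" | tail -n 1)
  fi
  printf '%s\t%s\t%s\n' "$label" "$status" "$reason" >> "$BASELINE_FILE"
}

# Demo with stand-in commands; a real run would pass the project's own commands.
record_baseline "always-pass" true
record_baseline "always-fail" sh -c 'echo "boom" >&2; exit 1'
record_baseline "not-installed" some-tool-that-does-not-exist --version
cat "$BASELINE_FILE"
```

Keeping the log outside the repository (here a temp file) avoids committing baseline noise; the same rows can be reread in Phase 5 to attribute failures.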
1.4 Intent Confirmation
Before planning changes, classify requested scope:
| Scope | Default? | Includes |
|---|---|---|
| Core harness | Yes | AGENTS.md, docs/ECL.md, docs/STATUS.md, docs, ECL changes, lightweight auto-evolve, linters, environment contract, CI |
| Advanced harness | No | Core harness plus explicitly requested eval, trace, state, checkpoints, memory, or metrics |
| Documentation only | No | AGENTS.md and docs without linters, scripts, or CI |
When a user-confirmation tool is available, confirm scope. In Codex, use request_user_input.
On other platforms, use the equivalent user-choice tool. If no such tool is available, use the
detected context and record assumptions.
{
"question": "What's your priority for this harness setup?",
"header": "Scope",
"multiSelect": false,
"options": [
{
"label": "Core harness (Recommended)",
"description": "Project-first AGENTS.md, ECL changes, STATUS handoff, auto-evolve threshold checks, linters, environment contract, and strict CI"
},
{
"label": "Advanced harness",
"description": "Core harness plus explicitly requested eval, trace, memory, checkpoint, or metrics infrastructure"
},
{
"label": "Documentation only",
"description": "AGENTS.md and project docs only; skip linters, scripts, and CI for now"
}
]
}
For an Empty project, also ask for the basics:
{
"question": "What tech stack for this project?",
"header": "Tech Stack",
"multiSelect": false,
"options": [
{"label": "Go", "description": "CLI tools, high-performance services, system programming"},
{"label": "TypeScript/Node.js", "description": "Web APIs, full-stack apps, rapid prototyping"},
{"label": "Python", "description": "Data processing, ML/AI, scripting"}
]
}
If no user-confirmation tool is available, use detected values and document assumptions:
## Auto-Detected Context
| Field | Value | Confidence | Evidence |
|-------|-------|------------|----------|
| Tech Stack | {TECH} | High | Found {config file} |
| Project State | {state} | High | {criteria matched} |
| Scope | Core harness | Default | No user preference specified |
Proceeding with these assumptions. Tell me if any need adjustment.
1.5 ECL Work Intake Rules
When generating ECL guidance for a target project, keep the process small enough to use:
| Intake type | Criteria | Required ECL handling |
|---|---|---|
| Small Change | Local, low-risk edits such as copy, comments, style-only tweaks, or single-file bug fixes with no interface, data, permission, architecture, or release impact | Active change optional; still record the verification command in the final response or existing task notes |
| Structured Change | Cross-file/module behavior, APIs, data model, permissions, architecture, validation chain, unclear requirements, or work likely to exceed 20 minutes | Use active change files and require intake/spec/plan review before implementation |
Decision tree:
- If an active change already exists, keep using it; do not create a second active context.
- If the change is copy, comments, README text, formatting, or an obviously local single-file fix with no runtime, API, data, permission, architecture, or validation-chain impact, treat it as Small Change.
- If the change touches APIs, data, permissions, architecture, multiple modules, release/runtime behavior, or unclear requirements, treat it as Structured Change.
- If impact is unclear, do read-only investigation first. If uncertainty remains after inspection, ask one high-impact question or upgrade to Structured Change; do not assume Small Change.
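The decision tree above can be read as an ordered set of checks. The following sketch encodes that order; the function name and the three yes/no/unclear inputs are hypothetical simplifications, and a real agent would derive them from repository inspection rather than receive them as arguments.

```shell
#!/usr/bin/env bash
# Hypothetical encoding of the intake decision tree. Inputs are "yes", "no", or "unclear".

classify_intake() {
  # Args: active_change_exists local_low_risk touches_interfaces
  local active="$1" local_low_risk="$2" touches_interfaces="$3"
  if [ "$active" = "yes" ]; then
    # An active change wins: never open a second active context.
    echo "Use existing active change"
  elif [ "$touches_interfaces" = "yes" ]; then
    echo "Structured Change"
  elif [ "$local_low_risk" = "yes" ] && [ "$touches_interfaces" = "no" ]; then
    echo "Small Change"
  else
    # Unclear impact is never silently treated as Small Change.
    echo "Investigate read-only, then ask or upgrade to Structured Change"
  fi
}

classify_intake no yes no
classify_intake no unclear unclear
```

The ordering matters: the active-change check comes first, and the fallback branch never returns Small Change, which mirrors the last rule in the list.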
For structured changes, support both common entry points:
- Requirement-first input: extract target users/scenarios, evidence, success criteria, acceptance criteria, non-goals, constraints, assumptions, and risks into spec.md.
- Plan-first input: treat the user's plan as a draft, split WHAT/WHY into spec.md and HOW into plan.md, then ask only about high-impact gaps that affect implementation direction or acceptance. If the plan is complete and does not conflict with repository evidence, do not repeat a full interview. If it conflicts with code, docs, commands, or existing harness constraints, record the conflict and return to Intake Review.
Questions are allowed and expected, but must be bounded: ask at most three high-impact questions per
round. Low-risk unknowns become assumptions; high-impact unknowns become
[NEEDS CLARIFICATION: ...] and block implementation until resolved.
For complex structured changes, use a lightweight iteration loop rather than treating the first spec as final:
Draft Spec -> Draft Plan -> Review Gaps -> Revise Spec/Plan -> Gate -> Tasks
Default to at most two loops. If key gaps remain, continue up to five loops; after that, record a
blocker instead of implementing from guesses. plan.md must include any planning-discovered spec
gaps, because plans often expose missing acceptance, boundary, permission, data, or validation
requirements.
Phase 2: Analysis
Goal: Deeply understand codebase architecture, harness state, and environment requirements.
2.1 Execution Mode
Use subagents only when the user authorized delegation and the environment supports it. Otherwise, execute the same responsibilities inline.
If using subagents, assign:
- Code architecture analysis: follow agents/analyzer.md; output harness/.analysis/architecture.json.
- Harness state audit: follow agents/auditor.md; output harness/.analysis/audit.json.
- Environment analysis: follow references/environment-detection-guide.md; output harness/.analysis/environment.json.
If working inline, produce the same three analysis artifacts or equivalent in-memory summaries before Phase 3.
2.2 Project Identity Extraction
For existing projects, extract target-project meaning before writing docs:
- One-sentence project identity: what it does and for whom.
- Core workflow or domain model: user/system flow, key entities, API resources, jobs, or commands.
- Primary source entrypoints and where common changes belong.
Use README.md, manifests, entrypoints, routes/controllers, schemas/models, and key source
directories. Harness files are not sufficient evidence for project identity.
2.3 Adapter Selection
After detecting the tech stack, load the matching adapter before creating linters, scripts, CI, or environment config. Adapter guidance overrides generic templates for language-specific details.
| Detected stack | Required adapter |
|---|---|
| TypeScript/Node.js | references/adapters/typescript.md |
| Go | references/adapters/go.md |
| Python | references/adapters/python.md |
| Rust | references/adapters/rust.md |
| Java | references/adapters/java.md |
| Unknown/mixed | references/adapters/generic.md plus any detected language adapters |
For TypeScript/Node.js projects, prefer Node/TS-native outputs: scripts/lint-deps.mjs or
equivalent, scripts/lint-quality.mjs, npm/package-manager scripts, and Node/TS GitHub Actions.
Do not adapt Go linter or Makefile-only patterns to TypeScript unless the project is actually Go
or already uses Makefile as the primary command surface.
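As a rough illustration of the adapter table, the sketch below lists which adapter files to load for a repository root. The manifest names used for Rust and Java (Cargo.toml, pom.xml, build.gradle) and the detect_adapters function itself are assumptions; the skill only names the adapter paths.

```shell
#!/usr/bin/env bash
# Hypothetical adapter lookup based on manifest files found in a repository root.

detect_adapters() {
  local root="${1:-.}" list="" count=0 adapter
  if [ -f "$root/package.json" ]; then
    list="$list references/adapters/typescript.md"; count=$((count + 1))
  fi
  if [ -f "$root/go.mod" ]; then
    list="$list references/adapters/go.md"; count=$((count + 1))
  fi
  if [ -f "$root/pyproject.toml" ] || [ -f "$root/requirements.txt" ]; then
    list="$list references/adapters/python.md"; count=$((count + 1))
  fi
  if [ -f "$root/Cargo.toml" ]; then
    list="$list references/adapters/rust.md"; count=$((count + 1))
  fi
  if [ -f "$root/pom.xml" ] || [ -f "$root/build.gradle" ]; then
    list="$list references/adapters/java.md"; count=$((count + 1))
  fi
  # Unknown (0 matches) or mixed (2+ matches) stacks also load the generic adapter.
  if [ "$count" -ne 1 ]; then
    echo "references/adapters/generic.md"
  fi
  # Adapter paths contain no spaces, so word splitting is safe here.
  for adapter in $list; do
    echo "$adapter"
  done
}

detect_adapters .
```

For a mixed repository the generic adapter is printed first, followed by each detected language adapter, matching the "Unknown/mixed" row.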
2.4 Command Surface Selection
Before creating ECL scripts, select the target project's command surface. Do not assume PowerShell is the only Windows option. This selection is normally automatic; do not ask the user to choose a script format unless project evidence conflicts or the user has already expressed a hard constraint.
Priority:
- Existing project entrypoints: package-manager scripts, Makefile targets, README commands, or CI shell conventions.
- Explicit user/project constraints. If the project rejects .ps1, do not generate PowerShell as the only harness entrypoint.
- Bash profile when allowed. For Windows projects that accept Bash, generate .sh scripts and document the prerequisite: Git Bash, WSL, MSYS2, or a CI Linux runner.
- PowerShell profile when the project accepts Windows-native PowerShell. Keep it compatible with Windows PowerShell 5.1 and PowerShell 7.
- Node or Python profiles when those runtimes are already first-class project dependencies.
Default when evidence is sparse: for TypeScript/Node projects choose Node/package-manager scripts; for Windows projects that allow Bash choose Bash profile and document Git Bash/WSL/MSYS2; otherwise choose the adapter's native lightweight scripting profile.
All profiles must implement the same ECL invariants and command set. harness-change,
harness-evolve, lint-ecl, and lint-encoding may be implemented as .ps1, .sh, .mjs,
or .py, but docs, CI, Makefile/package scripts, and verification commands must use the chosen
entrypoint consistently.
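The "same command set" invariant lends itself to a mechanical check. A minimal sketch, assuming the four scripts live in one directory and share the chosen extension; check_profile_complete is a hypothetical name, not a generated command.

```shell
#!/usr/bin/env bash
# Hypothetical completeness check for one command-surface profile.

check_profile_complete() {
  # Args: scripts directory, chosen extension (ps1, sh, mjs, or py)
  local dir="$1" ext="$2" missing=0 name
  for name in harness-change harness-evolve lint-ecl lint-encoding; do
    if [ ! -f "$dir/$name.$ext" ]; then
      echo "missing: $dir/$name.$ext"
      missing=$((missing + 1))
    fi
  done
  # Succeeds only when all four scripts exist for the chosen profile.
  [ "$missing" -eq 0 ]
}

# Demo against a throwaway directory holding only two of the four scripts.
demo=$(mktemp -d)
touch "$demo/harness-change.sh" "$demo/lint-ecl.sh"
if check_profile_complete "$demo" sh; then
  echo "profile complete"
else
  echo "profile incomplete"
fi
```

A check like this only covers file presence; whether docs, CI, and package scripts reference the same entrypoint still needs a text search or review.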
2.5 Wait for Analysis Completion
When subagents are running, wait for their final reports. While waiting, you can:
- Review any existing documentation
- Prepare templates for Phase 4
2.5 For Empty Projects
Skip Phase 2 analysis agents. Instead:
- Use templates from references/greenfield-templates.md
- Base decisions on the user's tech stack choice
- Design a standard 3-layer architecture
Phase 3: Delta Synthesis
Goal: Merge analysis results and compute exactly what needs to be created/updated.
3.1 Read Analysis Results
cat harness/.analysis/architecture.json
cat harness/.analysis/audit.json
cat harness/.analysis/environment.json
3.2 Compute Delta
Create a delta list:
## Delta: What Needs to Be Done
### Core To Create (doesn't exist)
- [ ] AGENTS.md
- [ ] docs/ECL.md
- [ ] docs/STATUS.md
- [ ] docs/ARCHITECTURE.md
- [ ] scripts/lint-deps.go
- [ ] scripts/harness-change.{ps1|sh|mjs|py}
- [ ] scripts/harness-evolve.{ps1|sh|mjs|py}
- [ ] scripts/lint-ecl.{ps1|sh|mjs|py}
- [ ] scripts/lint-encoding.{ps1|sh|mjs|py}
- [ ] harness/changes/{active,parking,archive}
- [ ] harness/templates/change/
- [ ] harness/config/environment.json
- [ ] harness/evolution/{state.json,results.tsv,proposals/} (`pending.md` is generated later only when the archive threshold is reached)
### Optional Advanced (only if explicitly requested)
- [ ] harness/eval/ — agent evaluation datasets and runner inputs
- [ ] harness/trace/ — execution traces for agent runs
- [ ] harness/state/ — executor runtime state
- [ ] harness/checkpoints/ — resumable execution checkpoints
- [ ] harness/memory/ — long-term agent memory experiments
- [ ] harness/metrics/ — execution and quality metrics
### To Update (exists but has gaps)
- [ ] docs/DEVELOPMENT.md — missing build commands
- [ ] scripts/lint-quality.py — missing 3 packages in layer map
### Already Good (no changes needed)
- [x] Makefile — has all required targets
- [x] .github/workflows/ci.yml — properly configured
3.3 Confirm with User (if confirmation tool is available)
For significant changes:
{
"question": "I've analyzed the codebase. Ready to proceed with these changes?",
"header": "Confirm",
"multiSelect": false,
"options": [
{"label": "Yes, proceed with all", "description": "Create/update all identified items"},
{"label": "Show me the details first", "description": "I'll explain what each change involves"},
{"label": "Only critical items", "description": "Just P0/P1 items, skip P2/P3 for now"}
]
}
Phase 4: Creation/Update
Goal: Create or update all harness files from the delta.
4.1 Execution Mode
Use subagents only when authorized and available. Otherwise, perform the same work inline. Keep write scopes disjoint if using parallel workers.
Creation responsibilities:
- Documentation: follow agents/creator-docs.md; create/update AGENTS.md, docs/ECL.md, docs/STATUS.md, docs/ARCHITECTURE.md, docs/DEVELOPMENT.md, and design docs. AGENTS.md is the target project's entry map, not a harness creation record. Keep the first screen project-first, but preserve ECL/current-change priority in context loading: AGENTS.md -> docs/ECL.md -> active change if present -> auto-evolve pending if present -> otherwise docs/STATUS.md -> task-specific project docs.
- Linters: follow agents/creator-linters.md; create/update dependency, quality, ECL, and encoding checks.
- Config and scripts: follow agents/creator-config.md; create/update environment contract, harness scripts, changes directories/templates, lightweight evolution state, harness-change, harness-evolve, Makefile targets, and CI. Create advanced directories only when the confirmed scope requires them.
ECL change templates must include summary.md, spec.md, plan.md, tasks.md, and
reviews/review.md. spec.md captures WHAT/WHY, plan.md captures HOW and planning-discovered
spec gaps, and tasks.md is generated only after the spec/plan gate is ready enough for
implementation. Do not require old archived changes to contain plan.md; compatibility applies to
history.
Important: do not create static verification config such as harness/config/verify.json. Verification plans are generated at runtime by the executor from environment.json and the task context.
Strict CI rule: default CI must include normal business quality gates (lint, typecheck, test,
build, and backend/package-specific equivalents when available) plus harness checks. Do not remove
or skip business gates because the baseline is red. If the baseline was already red, explain that CI
will be red until the pre-existing project issues are fixed. Generate staged or relaxed CI only when
the user explicitly asks for it.
Command surface rule: create ECL scripts for the selected profile, not a hardcoded shell. If Bash is
selected on Windows, document Git Bash, WSL, MSYS2, or CI Linux shell requirements in the generated
environment/development docs. If PowerShell is selected, detect whether pwsh is available; if not,
use powershell -NoProfile -ExecutionPolicy Bypass. PowerShell templates must be compatible with
Windows PowerShell 5.1: avoid ambiguous overloads such as TrimStart(".\"), and avoid non-ASCII
mojibake marker string literals in .ps1; represent markers by Unicode codepoint or another
PS5-safe construction.
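The pwsh fallback described above could look like the following when the calling layer is a shell script (for example a Makefile recipe or CI step). The helper name is hypothetical; only the fallback command line comes from the rule itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose the PowerShell launcher for generated .ps1 scripts.

powershell_launcher() {
  if command -v pwsh >/dev/null 2>&1; then
    # PowerShell 7+ is on PATH.
    echo "pwsh"
  else
    # Fall back to Windows PowerShell with the flags named in the rule above.
    echo "powershell -NoProfile -ExecutionPolicy Bypass"
  fi
}

echo "launcher: $(powershell_launcher)"
```

A caller would typically append `-File` and the script path to the returned launcher string; whether that launcher actually exists on the machine is only known for the pwsh branch.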
4.2 For Empty Projects: Also Create Business Code Plan
For empty projects, add one more agent:
Agent("create-exec-plan", prompt="""
Create execution plan for business code (harness-executor will implement this):
Tech stack: {TECH}
Project type: {from user choice}
Architecture: 3-layer (Types → Core → Entry Points)
Create: docs/exec-plans/active/bootstrap-code.md
Contents:
- Full source code for initial project structure
- main.go/index.ts/main.py entry point
- Basic types and core logic
- Test files
This is for harness-executor to implement — not ecl-harness-engineer's responsibility.
""")
4.3 Wait for Creation Completion
Agents will notify when done. Collect any issues they encountered.
Phase 5: Verification + Handoff
Goal: Ensure everything works, then hand off or present results.
5.1 Run Verification
# 0. Compare against the baseline snapshot
# Re-run the same existing lint/typecheck/test/build commands captured in Phase 1.
# 1. Harness checks pass
make verify-harness || npm run lint:harness || {generated_harness_lint_command}
# 2. Architecture linters pass
make lint-arch || npm run lint:arch
# 3. Business build/test gates run
go build ./... || npm run build || python -m compileall .
# 4. AGENTS.md size check
wc -l AGENTS.md # Should be 80-120 lines
# 4b. AGENTS.md content gate
# Confirm it explains project identity, core workflow/domain model, source entrypoints,
# task-based verification, active-change-before-STATUS loading, and contains no
# ECL Harness Engineer internal boundary language.
# 5. All expected files exist
test -f AGENTS.md && echo "✓ AGENTS.md"
test -f docs/ARCHITECTURE.md && echo "✓ ARCHITECTURE.md"
test -f docs/ECL.md && echo "✓ ECL.md"
test -f docs/STATUS.md && echo "✓ STATUS.md"
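# test -f errors out when a glob matches several files, so use ls for patterns
ls scripts/lint-deps* >/dev/null 2>&1 && echo "✓ lint-deps"
ls scripts/harness-change.* >/dev/null 2>&1 && echo "✓ harness-change"
ls scripts/lint-ecl.* >/dev/null 2>&1 && echo "✓ lint-ecl"
ls scripts/lint-encoding.* >/dev/null 2>&1 && echo "✓ lint-encoding"
ls scripts/harness-evolve.* >/dev/null 2>&1 && echo "✓ harness-evolve"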
test -d harness/ && echo "✓ harness/"
test -d harness/changes && echo "✓ harness/changes"
test -f harness/evolution/state.json && echo "✓ evolution state"
# 6. Design docs exist (not just index)
find docs/design-docs -name "*.md" ! -name "index.md" | wc -l
Classify every verification result:
| Classification | Meaning |
|---|---|
| Harness pass | Harness-created checks/files/scripts work |
| Pre-existing project failure | The same command failed in the Phase 1 baseline |
| New regression | The command passed in Phase 1 and fails after harness creation |
| Not available | The command/script does not exist in this project |
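The classification can be derived by comparing each command's Phase 1 status with its post-harness status. In this sketch the function name and the extra labels "Pass" and "Harness check failure" are assumptions: the table does not name a class for business commands that pass both times, or for a harness-created check that fails.

```shell
#!/usr/bin/env bash
# Hypothetical classifier: compares baseline status with post-harness status.
# Each argument is "pass", "fail", or "missing".

classify_result() {
  local before="$1" after="$2"
  case "$before:$after" in
    *:missing)    echo "Not available" ;;
    missing:pass) echo "Harness pass" ;;                 # new check created by the harness
    *:pass)       echo "Pass" ;;                         # assumed label, not in the table
    fail:fail)    echo "Pre-existing project failure" ;;
    pass:fail)    echo "New regression" ;;
    missing:fail) echo "Harness check failure" ;;        # assumed label, not in the table
    *)            echo "Unknown" ;;
  esac
}

classify_result fail fail
classify_result pass fail
```

Treating "missing before, pass after" as a harness pass relies on the assumption that any command absent from the baseline was introduced by the harness work.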
AGENTS.md content gate:
- A new agent can tell what the project does within 30 seconds.
- The core product/system workflow or domain model is visible.
- Main source entrypoints and task-to-directory mapping are visible.
- Verification guidance maps to task type.
- Context loading reads docs/ECL.md first, then active change when present.
- If no active change exists and harness/evolution/pending.md exists, read it before docs/STATUS.md, mention it as pending maintenance, and ask whether to handle it now unless the user already prioritized the current task. Reading or asking does not start auto-evolve and must not block ordinary user work.
- If no active change exists and no pending evolution exists, context loading reads docs/STATUS.md before task-specific project docs.
- For structured work, docs/ECL.md explains Small Change vs Structured Change, bounded Intake Review, plan-first input handling, and the spec/plan review gate.
- Archive history is loaded selectively through docs/STATUS.md paths or harness/changes/INDEX.json, starting with historical summary.md only.
- No skill-internal boundary leaks, such as sections or sentences that describe this skill's own scope limits as target-project rules.
5.2 STATUS.md Handoff Update
When a target project uses ECL changes, maintain docs/STATUS.md as a lightweight handoff file.
It is not the authority while an active change exists, but it becomes the default recent-history
entry point after the active change is closed.
Close-change handoff protocol:
- Before running harness-change close, read the active change summary.md, spec.md, plan.md, tasks.md, and relevant reviews/; update docs/STATUS.md with completed work, verification results, residual risks, and the next recommended resume point.
- Run the close command so the active change moves to harness/changes/archive/... and harness/changes/INDEX.json is rebuilt.
- After close, update docs/STATUS.md again with the final archive path, normally pointing to the archived summary.md.
- Run the harness lint command (npm run lint:harness, make verify-harness, or the generated ECL lint command) to confirm STATUS, ECL structure, and INDEX state are consistent.
Hooks and CI may validate docs/STATUS.md, but must not auto-write it or move changes.
5.3 Auto-Evolve Check
Core harnesses include lightweight auto-evolve by default. The script layer only detects when
enough new archive evidence exists and writes harness/evolution/pending.md; Codex performs the
semantic improvement pass.
Trigger model: the close and reindex subcommands of harness-change run harness-evolve check; the new
subcommand only prints a reminder when a pending file exists. Hooks and CI may warn, but must not
modify docs, scripts, STATUS, or changes.
Generated scripts do not call subagents. They only count archive evidence and create pending
context. When no active change exists and Codex notices pending maintenance, it should ask the user
whether to handle it now unless the user already prioritized the current task. Asking does not start
pending evolution.
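The script-layer behavior described above (count archive evidence, write a pending note at the threshold, do nothing semantic) can be sketched as follows. Several details are assumptions: one directory per archived change directly under changes/archive, the already-evaluated count passed as an argument instead of read from state.json, and the wording of the pending note. The actual formats live in references/ecl-harness.md.

```shell
#!/usr/bin/env bash
# Hypothetical threshold check. It only counts archives and writes a reminder file.

evolve_check() {
  # Args: harness root, threshold, number of archives already evaluated
  local root="$1" threshold="$2" already="$3" total new
  total=$(find "$root/changes/archive" -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l)
  new=$((total - already))
  if [ -f "$root/evolution/pending.md" ]; then
    echo "pending already exists"
  elif [ "$new" -ge "$threshold" ]; then
    mkdir -p "$root/evolution"
    printf '# Pending harness evolution\n\nNew archived changes since last pass: %s\n' "$new" \
      > "$root/evolution/pending.md"
    echo "pending created ($new new archives)"
  else
    echo "no action ($new new archives, threshold $threshold)"
  fi
}

# Demo in a throwaway harness tree with three archived changes.
demo=$(mktemp -d)
mkdir -p "$demo/changes/archive/change-a" "$demo/changes/archive/change-b" "$demo/changes/archive/change-c"
evolve_check "$demo" 5 0
evolve_check "$demo" 3 0
evolve_check "$demo" 3 0
```

The sketch never edits docs, scripts, STATUS, or changes, which keeps it within what hooks and CI are allowed to do.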
harness/evolution/pending.md is a maintenance reminder, not a hard lock. Reading it for context
does not start pending evolution. Pending evolution starts only when Codex creates or uses an
auto-evolve-harness-* change, writes an evolution proposal/result, or edits Harness files based
on the pending evidence. Once started, finish with a proposal, one harness/evolution/results.tsv
row, and harness-evolve mark-complete; otherwise park the change or close it as blocked, not as completed.
Apply only the smallest evidence-backed delta that passes review. No independent scorer =
no auto-apply: user approval to handle pending implies permission to request an independent
auditor/subagent when the environment supports it. If the environment still requires explicit
authorization, ask once. If scoring is unavailable, declined, or still unauthorized after asking,
record noop with eval_mode=dry_run, keep the proposal, run mark-complete, and stop.
Machinery repair (harness-evolve, pending templates, lint) does not complete pending evolution by
itself; after repair, still evaluate candidate archives or leave the work parked/blocked.
Detailed proposal format, scoring weights, status values, and complexity budget live in
references/ecl-harness.md.
5.4 Present Summary
## Harness Infrastructure Complete
**Project**: {project-name}
**Tech Stack**: {TECH}
**Files Created/Updated**: {count}
### Created Files
- AGENTS.md ({N} lines)
- docs/ARCHITECTURE.md
- docs/ECL.md
- docs/STATUS.md
- docs/DEVELOPMENT.md
- docs/design-docs/{component}.md
- scripts/lint-deps.{ext}
- scripts/lint-quality.{ext}
- scripts/harness-change.{ps1|sh|mjs|py}
- scripts/lint-ecl.{ps1|sh|mjs|py}
- scripts/lint-encoding.{ps1|sh|mjs|py}
- scripts/harness-evolve.{ps1|sh|mjs|py}
- harness/config/environment.json
- harness/changes/
- harness/evolution/
- harness/templates/change/
- Makefile
### Verification Results
- Harness checks: ✓
- Architecture checks: ✓
- Business gates: ✓ or pre-existing failures listed below
- AGENTS.md size: ✓ ({N} lines)
### Pre-existing Project Failures
- {List baseline-red commands and short reasons, or "None observed."}
### New Regressions Introduced By Harness
- {List commands that passed before and failed after, or "None observed."}
### Next Steps
{For empty projects: "Run harness-executor to implement business code from docs/exec-plans/active/bootstrap-code.md"}
{For existing projects: "The harness is ready. AI agents can now use AGENTS.md as their entry point."}
5.5 Automatic Handoff (for Empty Projects)
If this was an empty project with a bootstrap exec-plan, invoke harness-executor:
Skill(skill="harness-executor")
With context: "Implement the bootstrap exec-plan at docs/exec-plans/active/bootstrap-code.md"
Core Principles
1. Repository as Single Source of Truth
Agents cannot access Slack, Google Docs, or tribal knowledge. If it's not in the repository, it doesn't exist for the agent.
2. AGENTS.md is a Map, Not a Manual
Keep it 80-120 lines. Link to detailed docs; don't embed them.
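The size rule is easy to enforce mechanically. A minimal sketch, assuming a hypothetical check_agents_size helper that a harness lint script could call; the remediation hint in the message is illustrative wording.

```shell
#!/usr/bin/env bash
# Hypothetical size gate for AGENTS.md (80-120 lines).

check_agents_size() {
  local file="${1:-AGENTS.md}" lines
  if [ ! -f "$file" ]; then
    echo "$file: not found"
    return 1
  fi
  # tr strips the padding some wc implementations add.
  lines=$(wc -l < "$file" | tr -d ' ')
  if [ "$lines" -lt 80 ] || [ "$lines" -gt 120 ]; then
    echo "$file: $lines lines (expected 80-120). Move detail into docs/ and link to it."
    return 1
  fi
  echo "$file: $lines lines (ok)"
}

# "|| true" keeps the demo from aborting when AGENTS.md is absent or out of range.
check_agents_size AGENTS.md || true
```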
3. Enforce Invariants Mechanically
Linter errors must be agent-actionable:
✗ BAD: "Forbidden import in core/types/user.go"
✓ GOOD: "core/types/user.go:15 imports core/config (layer 0 → layer 2).
Layer 0 packages must have NO internal dependencies.
Fix options:
1. Move config-dependent logic to a higher layer
2. Pass the config value as a parameter
3. Use dependency injection via an interface"
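To show how a check might produce a message in the GOOD style, here is a deliberately naive sketch that greps one file for a forbidden import and prints the location, the rule, and fix options. The function name, the plain substring match, and the file contents are assumptions; the linter templates referenced by this skill would use proper per-language import parsing.

```shell
#!/usr/bin/env bash
# Hypothetical layer-0 import check that emits agent-actionable messages.

lint_layer0_imports() {
  # Args: file to scan, forbidden import path (literal substring match)
  local file="$1" forbidden="$2" hits
  hits=$(grep -n -F "$forbidden" "$file" || true)   # lines look like "15:import ..."
  if [ -z "$hits" ]; then
    return 0
  fi
  printf '%s\n' "$hits" | while IFS=: read -r line _; do
    echo "$file:$line imports $forbidden (layer 0 -> higher layer)."
    echo "  Layer 0 packages must have NO internal dependencies."
    echo "  Fix options: move the logic to a higher layer, pass the value as a parameter, or inject an interface."
  done
  return 1
}

# Demo on a throwaway file; "|| true" keeps the demo from aborting on the violation.
sample=$(mktemp)
printf 'package types\n\nimport "core/config"\n' > "$sample"
lint_layer0_imports "$sample" "core/config" || true
```

Each violation carries its file and line, the invariant that was broken, and concrete ways out, so an agent can act on it without reading the linter source.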
4. Build to Delete
Every component should be replaceable. Capabilities that required complex pipelines yesterday may be single prompts tomorrow.
5. Start Simple
Atomic, well-documented tools > complex agent choreography. Don't over-engineer.
6. Change State Is Explicit
Use a single harness/changes/active/ task for personal development. Move paused work to parking/ and closed work to archive/ with the generated scripts/harness-change.* command. Maintain docs/STATUS.md as the soft handoff summary after active work is closed. Never hand-edit harness/changes/INDEX.json; it is a generated index rebuilt by park, close, resume, and reindex. Structured changes use spec.md for WHAT/WHY, plan.md for HOW, and tasks.md for executable work.
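The single-active-change rule can be checked with a directory count. This sketch assumes each change is one directory directly under harness/changes/active/; the helper name is hypothetical and the generated lint-ecl script may implement the rule differently.

```shell
#!/usr/bin/env bash
# Hypothetical check: at most one change directory may live under active/.

check_single_active() {
  local active_dir="${1:-harness/changes/active}" count
  count=$(find "$active_dir" -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l | tr -d ' ')
  if [ "$count" -gt 1 ]; then
    echo "$active_dir holds $count changes; park or close all but one with scripts/harness-change.*"
    return 1
  fi
  echo "$active_dir: $count active change(s)"
}

# "|| true" keeps the demo from aborting if the repository already violates the rule.
check_single_active harness/changes/active || true
```

A missing active/ directory counts as zero active changes here, which is a design choice of the sketch rather than a stated rule.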
7. Harness Evolves From Evidence
Once closed changes reach the configured threshold, the generated scripts/harness-evolve.* check command may create
harness/evolution/pending.md. Treat it as a maintenance reminder to improve harness rules from
real archived evidence, not as a hard blocker for unrelated user work. If you start acting on the
pending evidence, first refresh harness/changes/INDEX.json and use the current eligible archive
window; the Candidate Archives in an old pending file are a trigger snapshot, not the only evidence.
Then finish with proposal + results.tsv + mark-complete, or park/block the work.
Do not turn one-off business bugs into permanent process. Keep only changes that improve the audit
score and pass validation.
Reference Files
| File | When to Read | Contents |
|---|---|---|
| references/greenfield-templates.md | Empty projects (Phase 2.5) | Complete Go/TS/Python scaffolding |
| references/documentation-templates.md | Phase 4 doc creation | Doc templates with numbered sections |
| references/linter-templates.md | Phase 4 linter creation | Linter code templates per language |
| references/ecl-harness.md | ECL-aware harness creation | docs/ECL.md, docs/STATUS.md, change lifecycle, INDEX.json, PowerShell script templates |
| references/darwin-eval-prompts.md | Skill quality evaluation | Dry-run prompts for darwin-skill review |
| references/environment-detection-guide.md | Phase 2 env analysis | Environment ecosystem detection |
| references/environment-config-guide.md | Phase 4 config creation | Startup, services, env vars, user-confirmation templates |
| references/adapters/typescript.md | TypeScript/Node.js projects | npm scripts, Node linters, package-manager detection, CI defaults |
| references/adapters/{go,python,rust,java,generic}.md | Matching detected stacks | Language-specific commands and conventions |
Agent prompts for Phase 2 and Phase 4 subagents are in agents/.
For small projects (< 20 files) or when subagents aren't available, execute phases inline instead of spawning agents.