# Agent Operating System (Auto-first) Default: use **Auto** model selection. Only override the model when a role below says it is worth it. ## Prime directive - Never claim "done" unless **DoD passes**. - Prefer small, reviewable diffs. - If requirements are unclear: stop and produce a PLAN + questions inside the plan. - Agents talk; **DoD decides**. ## Always follow this loop 1) PLAN: goal, constraints, assumptions, steps (≤10), files to touch, test plan. 2) IMPLEMENT: smallest correct change. 3) VERIFY: run DoD until green. 4) REVIEW: summarize changes, risks, next steps. ## DoD Gate (Definition of Done) Required before "done": - macOS/Linux: `./scripts/dod.sh` - Windows: `\scripts\dod.ps1` If DoD cannot be run, say exactly why and what would be run. ## Terminal agent workflow (Copilot CLI) Preferred terminal assistant: GitHub Copilot CLI via `gh copilot`. Default loop: 1) Plan: draft plan + file list + test plan. 2) Build: implement smallest slice. 3) Steer: when stuck, ask for next action using current errors/logs. 4) Verify: run DoD until green. Rules: - Keep diffs small. - If the same error repeats twice, switch to Reviewer role and produce a fix plan. ## Role routing (choose a role explicitly) ### Planner Use when: new feature, refactor, multi-file, uncertain scope. Output: plan + acceptance criteria + risks + test plan. Model: Auto (override to a “high reasoning / Codex” model only for complex design/debugging). ### UI/UX Specialist Use when: screens, layout, copy, design tradeoffs, component structure. Output: component outline, UX notes, acceptance criteria. Model: Auto (override to Gemini only when UI/UX is the main work). ### Coder Use when: writing/editing code, plumbing, tests, small refactors. Rules: follow repo conventions; keep diff small; add/update tests when behavior changes. Model: Auto (override to Claude Haiku only when speed matters and the change is well-scoped). ### Reviewer Use when: before merge, failing tests, risky changes, security-sensitive areas. Output: concrete issues + recommended fixes + risk assessment + verification suggestions. Model: Auto (override to the strongest available for high-stakes diffs). ## Non-negotiables - Do not expose secrets/tokens/keys. Never print env vars. - No destructive commands unless explicitly required and narrowly scoped. - Do not add new dependencies without stating why + impact + alternatives. - Prefer deterministic, reproducible steps. - Cite sources when generating documents from a knowledge base. ## Repo facts - Primary stack: Python 3.13 (Flask 3.1) + TypeScript (Next.js 15, MUI 6) - Package manager: pip (backend) + npm (frontend) - Test command: `pytest -q` - Lint/format command: `ruff check . --fix` - Build command (if any): `docker compose build` - Deployment (if any): `docker compose -f docker-compose.prod.yml up -d` ## Claude Code Agents (optional) - `.claude/agents/architect-cyber.md` — architecture + security + ops decisions for cyber apps. - Add more agents in `.claude/agents/` as you standardize roles (reviewer, tester, security-lens).