dev backbone template

This commit is contained in:
2026-02-02 14:12:33 -05:00
commit 1fddc3574f
37 changed files with 1222 additions and 0 deletions


@@ -0,0 +1,150 @@
---
name: architect-cyber
description: >
  Proactively design and govern architecture for cyber apps (Threat Hunt, Cyber Goose, Cyber Intel, PAD generator).
  Use for new subsystems, major refactors, data model changes, auth/tenancy, and any new integration.
---
# Architect (Cyber) — playbook
## Mission
Design a maintainable, secure, observable system that agents and humans can ship safely.
You are not a code generator first — you are a *decision generator*:
- make boundaries crisp,
- make data flow explicit,
- make risks boring,
- make “done” measurable.
## Inputs you should ask for (but don't block if missing)
- Primary user + workflow (1–3 sentences)
- Constraints: offline/air-gapped? data sensitivity? deployment target?
- Existing stack choices (default if unknown): **Next.js + MUI portal**, **MCP tool spine**, **RAG for knowledge**, **DoD gates**.
- Integration list (APIs, DBs, agents/models, auth provider)
## Outputs (always produce these)
1) **Architecture sketch** (components + responsibilities)
2) **Data flow** (what moves where, and why)
3) **Security posture** (threat model + guardrails)
4) **Operational plan** (logging/tracing/metrics + runbook basics)
5) **ADR(s)** for any non-trivial choice
6) **Implementation plan** (phased, with checkpoints)
7) **DoD gates** (tests + checks that define “done”)
---
## Default architecture (use unless there's a reason not to)
### Layers
1) **Portal UI (MUI)**
- Deterministic workflows first; chat is secondary.
- Prefer “AI → JSON → UI” patterns for agent outputs when helpful.
2) **API service**
- Thin orchestration: auth, input validation, calling MCP tools, calling RAG.
- Keep business logic in well-tested modules, not in route handlers.
3) **MCP Tool Spine (Cyber MCP Gateway)**
- Outcome-oriented tools (hunt.*, intel.*, pad.*).
- Flat args, strict schemas, pagination, clear errors.
- Progressive disclosure: safe tools by default; “dangerous” tools require explicit unlock.
4) **RAG / Knowledge**
- Curated corpus + citations.
- Prefer incremental ingestion and quality metrics over “ingest everything”.
5) **Storage**
- Postgres for app state (cases, runs, users, configs)
- Object storage for artifacts (uploads, exports)
- Vector DB for embeddings (can be Postgres+pgvector or dedicated, but keep the interface stable)
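The "keep the interface stable" point can be made concrete with a thin abstraction over the vector store, so swapping pgvector for a dedicated engine never touches callers. A minimal Python sketch — the names (`VectorStore`, `InMemoryStore`, `Hit`) are illustrative, not part of the template:

```python
# Sketch: a stable vector-store interface; the backing engine
# (pgvector vs. a dedicated vector DB) can change behind it.
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Hit:
    doc_id: str
    score: float


class VectorStore(Protocol):
    def upsert(self, doc_id: str, embedding: Sequence[float]) -> None: ...
    def search(self, embedding: Sequence[float], limit: int = 10) -> list[Hit]: ...


class InMemoryStore:
    """Toy reference implementation; swap for pgvector/dedicated later."""

    def __init__(self) -> None:
        self._docs: dict[str, list[float]] = {}

    def upsert(self, doc_id, embedding):
        self._docs[doc_id] = list(embedding)

    def search(self, embedding, limit=10):
        def score(v):
            # toy similarity: negative squared L2 distance
            return -sum((a - b) ** 2 for a, b in zip(embedding, v))

        ranked = sorted(self._docs.items(), key=lambda kv: score(kv[1]), reverse=True)
        return [Hit(doc_id=k, score=score(v)) for k, v in ranked[:limit]]
```

Application code depends only on the `VectorStore` protocol, so the storage ADR stays reversible.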
---
## Process (follow this order)
### 1) Clarify the “job”
- What is the user trying to accomplish in 2 minutes?
- What are the top 3 screens/actions?
### 2) Identify trust boundaries
- What data is sensitive?
- Where is execution allowed?
- Who can trigger “run commands” or “touch infra”?
### 3) Define domain objects (nouns)
For cyber tools, typical objects:
- Case, Evidence, Artifact, Indicator (IOC), Finding, Detection, Query, Run, Source, Confidence, Citation.
### 4) Define pipelines (verbs)
Threat hunt pipeline default:
- ingest → normalize → enrich → index → query → analyze → report/export
PAD pipeline default:
- collect inputs → outline → section drafts → compliance checks → citations → export
### 5) Choose interfaces first
- MCP tool contracts (schemas + examples)
- API endpoints (if needed)
- UI component contracts (JSON render schema if used)
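An interface-first contract can live as checkable data next to the code. A hedged sketch — the tool name `hunt_search_iocs` and its fields are hypothetical examples, not a prescribed schema:

```python
# Sketch: one MCP tool contract as data, plus a cheap lint that enforces
# the "flat args, strict schemas" rule. Names/fields are illustrative.
HUNT_SEARCH_IOCS = {
    "name": "hunt_search_iocs",
    "description": "Search indexed evidence for an indicator. Use limit/offset to page.",
    "input_schema": {
        "type": "object",
        "properties": {
            "indicator": {"type": "string"},
            "indicator_type": {"type": "string", "enum": ["ip", "domain", "hash"]},
            "limit": {"type": "integer", "default": 25, "maximum": 100},
            "offset": {"type": "integer", "default": 0},
        },
        "required": ["indicator", "indicator_type"],
        "additionalProperties": False,
    },
    "example_call": {"indicator": "198.51.100.7", "indicator_type": "ip", "limit": 25},
}


def validate_contract(contract: dict) -> list[str]:
    """Return problems: missing required args, or nested (non-flat) args."""
    problems = []
    schema = contract["input_schema"]
    props = schema["properties"]
    for req in schema.get("required", []):
        if req not in props:
            problems.append(f"required arg {req!r} missing from properties")
    for name, spec in props.items():
        if spec.get("type") == "object":
            problems.append(f"arg {name!r} is nested; flatten it")
    return problems
```

A validator like this can run inside the DoD gate so contract drift fails the build.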
### 6) Produce ADRs
Use small ADRs (one per decision). Include:
- context, decision, alternatives, consequences, reversibility.
### 7) Define DoD gates (non-negotiable)
Minimum:
- format + lint
- typecheck (TS) / static check (Python)
- unit tests for core logic
- integration test for at least one end-to-end happy path
- secret scanning / dependency scanning
- logging + trace correlation IDs on API requests
---
## Cyber-specific checklists
### Security checklist (minimum bar)
- AuthN + AuthZ: who can do what?
- Audit logging for privileged actions (exports, deletes, tool unlocks)
- Secrets: never in repo; use env/secret manager; rotate strategy
- Input validation everywhere (uploads, URLs, query params)
- Safe tool mode by default (read-only, limited scope)
- Clear “permission boundary” text in UI for destructive actions
### Tool safety checklist (MCP)
- Tool scopes/roles (viewer, analyst, admin)
- Rate limits for expensive tools
- Deterministic error handling (no stack traces to client)
- Replayability: tool calls logged with inputs + outputs (redact secrets)
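The replayability item can be sketched with a few lines of stdlib Python; the redaction key list is illustrative and should be extended per deployment:

```python
# Sketch: log tool calls with inputs + outputs, redacting secret-looking keys
# so the replay log never stores credentials. Key list is an assumption.
import json

REDACT_KEYS = {"password", "token", "secret", "api_key", "authorization"}


def redact(obj):
    """Recursively replace values under secret-looking keys."""
    if isinstance(obj, dict):
        return {
            k: ("[REDACTED]" if k.lower() in REDACT_KEYS else redact(v))
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [redact(v) for v in obj]
    return obj


def log_tool_call(tool: str, args: dict, result: dict) -> str:
    entry = {"tool": tool, "args": redact(args), "result": redact(result)}
    return json.dumps(entry, sort_keys=True)
```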
### Observability checklist
- Structured logs (JSON)
- OpenTelemetry traces across: UI action → API → MCP tool → RAG
- Metrics: latency, error rate, token usage, cost/throughput, queue depth
- “Why did the agent do that?” debug trail (plan + tool calls + citations)
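The correlation-ID requirement can be shown with only the stdlib; real tracing would use the OpenTelemetry SDK, so treat this as a minimal sketch of the pattern, not the recommended library:

```python
# Sketch: structured JSON logs carrying a request-scoped correlation ID.
# contextvars makes the ID follow async request handling automatically.
import json
import logging
import uuid
from contextvars import ContextVar

correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "msg": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })


def start_request() -> str:
    """Assign a fresh ID at the API boundary; all logs in this context share it."""
    cid = str(uuid.uuid4())
    correlation_id.set(cid)
    return cid
```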
### UX checklist (portal)
- Default to workflow pages (Cases, Evidence, Runs, Reports)
- Every AI output must be: editable, citeable, exportable
- Show confidence + sources when claiming facts
- One-click “Generate report artifact” and “Copy as markdown”
---
## Red flags (stop and redesign)
- “One mega tool” that does everything
- Agents writing directly to prod databases
- No tests, no gates, but “agent says done”
- Unbounded ingestion (“let's embed 40TB tonight”)
- No citations for knowledge-based answers
---
## Final format (what you deliver to the team)
Provide, in order:
1) 1-page overview
2) Component diagram (text-based is fine)
3) ADR list
4) Phase plan with milestones
5) DoD gates checklist

.github/copilot-instructions.md vendored Normal file

@@ -0,0 +1,9 @@
Follow `AGENTS.md` and `SKILLS.md`.
Rules:
- Use the PLAN → IMPLEMENT → VERIFY → REVIEW loop.
- Keep model selection on Auto unless AGENTS.md role routing says to override for that role.
- Never claim "done" unless DoD passes (`./scripts/dod.sh` or `.\scripts\dod.ps1`).
- Keep diffs small and add/update tests when behavior changes.
- Prefer reproducible commands and cite sources for generated documents.

.github/workflows/dod.yml vendored Normal file

@@ -0,0 +1,17 @@
name: DoD Gate
on:
  pull_request:
  push:
    branches: [ main, master ]
jobs:
  dod:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run DoD
        run: |
          chmod +x ./scripts/dod.sh || true
          ./scripts/dod.sh

.gitignore vendored Normal file

@@ -0,0 +1,3 @@
# profiling artifacts
profiles/

.gitlab-ci.yml Normal file

@@ -0,0 +1,15 @@
stages: [dod]
dod:
  stage: dod
  image: ubuntu:24.04
  before_script:
    - apt-get update -y
    - apt-get install -y bash git ca-certificates curl python3 python3-pip nodejs npm jq
  script:
    - chmod +x ./scripts/dod.sh || true
    - ./scripts/dod.sh
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH

AGENTS.md Normal file

@@ -0,0 +1,78 @@
# Agent Operating System (Auto-first)
Default: use **Auto** model selection. Only override the model when a role below says it is worth it.
## Prime directive
- Never claim "done" unless **DoD passes**.
- Prefer small, reviewable diffs.
- If requirements are unclear: stop and produce a PLAN + questions inside the plan.
- Agents talk; **DoD decides**.
## Always follow this loop
1) PLAN: goal, constraints, assumptions, steps (≤10), files to touch, test plan.
2) IMPLEMENT: smallest correct change.
3) VERIFY: run DoD until green.
4) REVIEW: summarize changes, risks, next steps.
## DoD Gate (Definition of Done)
Required before "done":
- macOS/Linux: `./scripts/dod.sh`
- Windows: `.\scripts\dod.ps1`
If DoD cannot be run, say exactly why and what would be run.
## Terminal agent workflow (Copilot CLI)
Preferred terminal assistant: GitHub Copilot CLI via `gh copilot`.
Default loop:
1) Plan: draft plan + file list + test plan.
2) Build: implement smallest slice.
3) Steer: when stuck, ask for next action using current errors/logs.
4) Verify: run DoD until green.
Rules:
- Keep diffs small.
- If the same error repeats twice, switch to Reviewer role and produce a fix plan.
## Role routing (choose a role explicitly)
### Planner
Use when: new feature, refactor, multi-file, uncertain scope.
Output: plan + acceptance criteria + risks + test plan.
Model: Auto (override to a “high reasoning / Codex” model only for complex design/debugging).
### UI/UX Specialist
Use when: screens, layout, copy, design tradeoffs, component structure.
Output: component outline, UX notes, acceptance criteria.
Model: Auto (override to Gemini only when UI/UX is the main work).
### Coder
Use when: writing/editing code, plumbing, tests, small refactors.
Rules: follow repo conventions; keep diff small; add/update tests when behavior changes.
Model: Auto (override to Claude Haiku only when speed matters and the change is well-scoped).
### Reviewer
Use when: before merge, failing tests, risky changes, security-sensitive areas.
Output: concrete issues + recommended fixes + risk assessment + verification suggestions.
Model: Auto (override to the strongest available for high-stakes diffs).
## Non-negotiables
- Do not expose secrets/tokens/keys. Never print env vars.
- No destructive commands unless explicitly required and narrowly scoped.
- Do not add new dependencies without stating why + impact + alternatives.
- Prefer deterministic, reproducible steps.
- Cite sources when generating documents from a knowledge base.
## Repo facts (fill these in)
- Primary stack:
- Package manager:
- Test command:
- Lint/format command:
- Build command (if any):
- Deployment (if any):
## Claude Code Agents (optional)
- `.claude/agents/architect-cyber.md` — architecture + security + ops decisions for cyber apps.
- Add more agents in `.claude/agents/` as you standardize roles (reviewer, tester, security-lens).

AGENT_LOG.md Normal file

@@ -0,0 +1,13 @@
# Agent Handoff Log
Use this file when multiple agents (or humans) are working in parallel.
## Entry template
- Date/time:
- Branch/worktree:
- Role: (Planner / UI / Coder / Reviewer)
- What changed:
- Commands run + results:
- Current blockers:
- Next step:

README.md Normal file

@@ -0,0 +1,32 @@
# Dev Backbone Template
Drop-in baseline for:
- `AGENTS.md` + `SKILLS/` (consistent agent behavior)
- DoD gates (`scripts/dod.sh`, `scripts/dod.ps1`)
- Copilot instructions
- CI gates (GitHub Actions + GitLab CI)
## Use for NEW projects
1) Make this repo a GitHub **Template repository**.
2) Create new repos from the template.
## Apply to EXISTING repos
Copy these into the repo root:
- AGENTS.md
- SKILLS.md
- SKILLS/
- scripts/
- .github/copilot-instructions.md
- (optional) .github/workflows/dod.yml
- (optional) .gitlab-ci.yml
Commit and push.
## Run DoD
- macOS/Linux: `./scripts/dod.sh`
- Windows: `.\scripts\dod.ps1`
## Notes
- DoD scripts auto-detect Node/Python and run what exists.
- Customize per repo for extra checks (docker build, e2e, mypy, etc.).

SKILLS.md Normal file

@@ -0,0 +1,23 @@
# Skills Index
These skill files define repeatable behaviors for agents and humans.
Agents must follow them in this order:
1) SKILLS/00-operating-model.md
2) SKILLS/05-agent-taxonomy.md
3) SKILLS/10-definition-of-done.md
4) SKILLS/20-repo-map.md (use whenever unfamiliar with the repo)
5) SKILLS/25-algorithms-performance.md
6) SKILLS/26-vibe-coding-fundamentals.md
7) SKILLS/27-performance-profiling.md
8) SKILLS/30-implementation-rules.md
9) SKILLS/40-testing-quality.md
10) SKILLS/50-pr-review.md
11) SKILLS/56-ui-material-ui.md (for React/Next portal-style apps)
12) SKILLS/60-security-safety.md
13) SKILLS/70-docs-artifacts.md
14) SKILLS/82-mcp-server-design.md (when building MCP servers/tools)
15) SKILLS/83-fastmcp-3-patterns.md (if using FastMCP 3)
16) SKILLS/80-mcp-tools.md (if this repo has MCP tools)
Rule: If anything conflicts, **AGENTS.md wins**.


@@ -0,0 +1,21 @@
# Operating Model
## Default cadence
- Prefer iterative progress over big bangs.
- Keep diffs small: target ≤ 300 changed lines per PR unless justified.
- Update tests/docs as part of the same change when possible.
## Working agreement
- Start with a PLAN for non-trivial tasks.
- Implement the smallest slice that satisfies acceptance criteria.
- Verify via DoD.
- Write a crisp PR summary: what changed, why, and how verified.
## Stop conditions (plan first)
Stop and produce a PLAN (do not code yet) if:
- scope is unclear
- more than 3 files will change
- data model changes
- auth/security boundaries
- performance-critical paths


@@ -0,0 +1,36 @@
# Agent Types & Roles (Practical Taxonomy)
Use this skill to choose the *right* kind of agent workflow for the job.
## Common agent “types” (in practice)
### 1) Chat assistant (no tools)
Best for: explanations, brainstorming, small edits.
Risk: can hallucinate; no grounding in repo state.
### 2) Tool-using single agent
Best for: well-scoped tasks where the agent can read/write files and run commands.
Key control: strict DoD gates + minimal permissions.
### 3) Planner + Executor (2-role pattern)
Best for: medium complexity work (multi-file changes, feature work).
Flow: Planner writes plan + acceptance criteria → Executor implements → Reviewer checks.
### 4) Multi-agent (specialists)
Best for: bigger features with separable workstreams (UI, backend, docs, tests).
Rule: isolate context per role; use separate branches/worktrees.
### 5) Supervisor / orchestrator
Best for: long-running workflows with checkpoints (pipelines, report generation, PAD docs).
Rule: supervisor delegates, enforces gates, and composes final output.
## Decision rules (fast)
- If you can describe it in ≤ 5 steps → single tool-using agent.
- If you need tradeoffs/design → Planner + Executor.
- If UI + backend + docs/tests all move → multi-agent specialists.
- If it's a pipeline that runs repeatedly → orchestrator.
## Guardrails (always)
- DoD is the truth gate.
- Separate branches/worktrees for parallel work.
- Log decisions + commands in AGENT_LOG.md.


@@ -0,0 +1,24 @@
# Definition of Done (DoD)
A change is "done" only when:
## Code correctness
- Builds successfully (if applicable)
- Tests pass
- Linting/formatting passes
- Types/checks pass (if applicable)
## Quality
- No new warnings introduced
- Edge cases handled (inputs validated, errors meaningful)
- Hot paths not regressed (if applicable)
## Hygiene
- No secrets committed
- Docs updated if behavior or usage changed
- PR summary includes verification steps
## Commands
- macOS/Linux: `./scripts/dod.sh`
- Windows: `.\scripts\dod.ps1`

SKILLS/20-repo-map.md Normal file

@@ -0,0 +1,16 @@
# Repo Mapping Skill
When entering a repo:
1) Read README.md
2) Identify entrypoints (app main / server startup / CLI)
3) Identify config (env vars, .env.example, config files)
4) Identify test/lint scripts (package.json, pyproject.toml, Makefile, etc.)
5) Write a 10-line "repo map" in the PLAN before changing code
Output format:
- Purpose:
- Key modules:
- Data flow:
- Commands:
- Risks:


@@ -0,0 +1,20 @@
# Algorithms & Performance
Use this skill when performance matters (large inputs, hot paths, or repeated calls).
## Checklist
- Identify the **state** youre recomputing.
- Add **memoization / caching** when the same subproblem repeats.
- Prefer **linear scans** + caches over nested loops when possible.
- If you can write it as a **recurrence**, you can test it.
## Practical heuristics
- Measure first when possible (timing + input sizes).
- Optimize the biggest wins: avoid repeated I/O, repeated parsing, repeated network calls.
- Keep caches bounded (size/TTL) and invalidate safely.
- Choose data structures intentionally: dict/set for membership, heap for top-k, deque for queues.
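The memoization and bounded-cache heuristics above fit in a few lines with `functools.lru_cache`; the recursive Fibonacci here is only a stand-in for any repeated subproblem:

```python
# Sketch: bounded memoization removes repeated work while keeping memory
# capped (maxsize). Without the cache this recursion is exponential.
from functools import lru_cache


@lru_cache(maxsize=1024)
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

`fib.cache_info()` reports hits/misses, which is useful evidence when reviewing whether a cache actually earns its keep.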
## Review notes (for PRs)
- Call out accidental O(n²) patterns.
- Suggest table/DP or memoization when repeated work is obvious.
- Add tests that cover base cases + typical cases + worst-case size.


@@ -0,0 +1,31 @@
# Vibe Coding With Fundamentals (Safety Rails)
Use this skill when you're using “vibe coding” (fast, conversational building) but want production-grade outcomes.
## The good
- Rapid scaffolding and iteration
- Fast UI prototypes
- Quick exploration of architectures and options
## The failure mode
- “It works on my machine” code with weak tests
- Security foot-guns (auth, input validation, secrets)
- Performance cliffs (accidental O(n²), repeated I/O)
- Unmaintainable abstractions
## Safety rails (apply every time)
- Always start with acceptance criteria (what “done” means).
- Prefer small PRs; never dump a huge AI diff.
- Require DoD gates (lint/test/build) before merge.
- Write tests for behavior changes.
- For anything security/data related: do a Reviewer pass.
## When to slow down
- Auth/session/token work
- Anything touching payments, PII, secrets
- Data migrations/schema changes
- Performance-critical paths
- “It's flaky” or “it only fails in CI”
## Practical prompt pattern (use in PLAN)
- “State assumptions, list files to touch, propose tests, and include rollback steps.”


@@ -0,0 +1,31 @@
# Performance Profiling (Bun/Node)
Use this skill when:
- a hot path feels slow
- CPU usage is high
- you suspect accidental O(n²) or repeated work
- you need evidence before optimizing
## Bun CPU profiling
Bun supports CPU profiling via `--cpu-prof` (generates a `.cpuprofile` you can open in Chrome DevTools).
Upcoming: `bun --cpu-prof-md <script>` outputs a CPU profile as **Markdown** so LLMs can read/grep it easily.
### Workflow (Bun)
1) Run the workload with profiling enabled
   - Today: `bun --cpu-prof ./path/to/script.ts`
   - Upcoming: `bun --cpu-prof-md ./path/to/script.ts`
2) Save the output (or `.cpuprofile`) into `./profiles/` with a timestamp.
3) Ask the Reviewer agent to:
   - identify the top 5 hottest functions
   - propose the smallest fix
   - add a regression test or benchmark
## Node CPU profiling (fallback)
- `node --cpu-prof ./script.js` writes a `.cpuprofile` file.
- Open in Chrome DevTools → Performance → Load profile.
## Rules
- Optimize based on measured hotspots, not vibes.
- Prefer algorithmic wins (remove repeated work) over micro-optimizations.
- Keep profiling artifacts out of git unless explicitly needed (use `.gitignore`).


@@ -0,0 +1,16 @@
# Implementation Rules
## Change policy
- Prefer edits over rewrites.
- Keep changes localized.
- One change = one purpose.
- Avoid unnecessary abstraction.
## Dependency policy
- Default: do not add dependencies.
- If adding: explain why, alternatives considered, and impact.
## Error handling
- Validate inputs at boundaries.
- Error messages must be actionable: what failed + what to do next.


@@ -0,0 +1,14 @@
# Testing & Quality
## Strategy
- If behavior changes: add/update tests.
- Unit tests for logic; integration tests for boundaries; E2E only where needed.
## Minimum for every PR
- A test plan in the PR summary (even if “existing tests cover this”).
- Run DoD.
## Flaky tests
- Capture repro steps.
- Quarantine only with justification + follow-up issue.

SKILLS/50-pr-review.md Normal file

@@ -0,0 +1,16 @@
# PR Review Skill
Reviewer must check:
- Correctness: does it do what it claims?
- Safety: secrets, injection, auth boundaries
- Maintainability: readability, naming, duplication
- Tests: added/updated appropriately
- DoD: did it pass?
Reviewer output format:
1) Summary
2) Must-fix
3) Nice-to-have
4) Risks
5) Verification suggestions


@@ -0,0 +1,41 @@
# Material UI (MUI) Design System
Use this skill for any React/Next “portal/admin/dashboard” UI so you stay consistent and avoid random component soup.
## Standard choice
- Preferred UI library: **MUI (Material UI)**.
- Prefer MUI components over ad-hoc HTML/CSS unless there's a good reason.
- One design system per repo (do not mix Chakra/Ant/Bootstrap/etc.).
## Setup (Next.js/React)
- Install: `@mui/material @emotion/react @emotion/styled`
- If using icons: `@mui/icons-material`
- If using data grid: `@mui/x-data-grid` (or pro if licensed)
## Theming rules
- Define a single theme (typography, spacing, palette) and reuse everywhere.
- Use semantic colors (primary/secondary/error/warning/success/info), not hard-coded hex everywhere.
- Prefer MUI's `sx` for small styling; use `styled()` for reusable components.
## “Portal” patterns (modals, popovers, menus)
- Use MUI Dialog/Modal/Popover/Menu components instead of DIY portals.
- Accessibility requirements:
  - Focus is trapped in Dialog/Modal.
  - Escape closes modal unless explicitly prevented.
  - All inputs have labels; buttons have clear text/aria-labels.
  - Keyboard navigation works end-to-end.
## Layout conventions (for portals)
- Use: AppBar + Drawer (or NavigationRail equivalent) + main content.
- Keep pages as composition of small components: Page → Sections → Widgets.
- Keep forms consistent: FormControl + helper text + validation messages.
## Performance hygiene
- Avoid re-render storms: memoize heavy lists; use virtualization for large tables (DataGrid).
- Prefer server pagination for huge datasets.
## PR review checklist
- Theme is used (no random styling).
- Components are MUI where reasonable.
- Modal/popover accessibility is correct.
- No mixed UI libraries.


@@ -0,0 +1,15 @@
# Security & Safety
## Secrets
- Never output secrets or tokens.
- Never log sensitive inputs.
- Never commit credentials.
## Inputs
- Validate external inputs at boundaries.
- Fail closed for auth/security decisions.
## Tooling
- No destructive commands unless requested and scoped.
- Prefer read-only operations first.


@@ -0,0 +1,13 @@
# Docs & Artifacts
Update documentation when:
- setup steps change
- env vars change
- endpoints/CLI behavior changes
- data formats change
Docs standards:
- Provide copy/paste commands
- Provide expected outputs where helpful
- Keep it short and accurate

SKILLS/80-mcp-tools.md Normal file

@@ -0,0 +1,11 @@
# MCP Tools Skill (Optional)
If this repo defines MCP servers/tools:
Rules:
- Tool calls must be explicit and logged.
- Maintain an allowlist of tools; deny by default.
- Every tool must have: purpose, inputs/outputs schema, examples, and tests.
- Prefer idempotent tool operations.
- Never add tools that can exfiltrate secrets without strict guards.


@@ -0,0 +1,51 @@
# MCP Server Design (Agent-First)
Build MCP servers like you're designing a UI for a non-human user.
This skill distills Phil Schmid's MCP server best practices into concrete repo rules.
Source: “MCP is Not the Problem, It's your Server” (Jan 21, 2026).
## 1) Outcomes, not operations
- Do **not** wrap REST endpoints 1:1 as tools.
- Expose high-level, outcome-oriented tools.
  - Bad: `get_user`, `list_orders`, `get_order_status`
  - Good: `track_latest_order(email)` (server orchestrates internally)
## 2) Flatten arguments
- Prefer top-level primitives + constrained enums.
- Avoid nested `dict`/config objects (agents hallucinate keys).
- Defaults reduce decision load.
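The flattening rule can be sketched in Python; the function and enum names are hypothetical examples, not part of any real server:

```python
# Sketch of "flatten arguments": top-level primitives + a constrained enum
# with a default, instead of a nested config dict whose keys agents invent.
from enum import Enum


class OrderStatus(str, Enum):
    ANY = "any"
    SHIPPED = "shipped"
    DELIVERED = "delivered"


# Bad:  track_order(config: dict) -- agents hallucinate keys inside `config`.
# Good: every argument is a top-level primitive or constrained enum.
def track_latest_order(email: str, status: OrderStatus = OrderStatus.ANY) -> dict:
    # server-side orchestration (lookup user -> orders -> latest) goes here
    return {"email": email, "status": status.value, "tracking": "pending"}
```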
## 3) Instructions are context
- Tool docstrings are *instructions*:
  - when to use the tool
  - argument formatting rules
  - what the return means
- Error strings are also context:
  - return actionable, self-correcting messages (not raw stack traces)
## 4) Curate ruthlessly
- Aim for **5–15 tools** per server.
- One server, one job. Split by persona if needed.
- Delete unused tools. Don't dump raw data into context.
## 5) Name tools for discovery
- Avoid generic names (`create_issue`).
- Prefer `{service}_{action}_{resource}`:
  - `velociraptor_run_hunt`
  - `github_list_prs`
  - `slack_send_message`
## 6) Paginate large results
- Always support `limit` (default ~20–50).
- Return metadata: `has_more`, `next_offset`, `total_count`.
- Never return hundreds of rows unbounded.
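The pagination rule can be sketched as one helper that every list tool shares; the clamp bound of 100 is an illustrative choice:

```python
# Sketch: bounded list responses with the metadata rule above
# (has_more, next_offset, total_count). Bounds are assumptions.
def paginate(rows: list, limit: int = 25, offset: int = 0) -> dict:
    limit = max(1, min(limit, 100))  # clamp to a sane upper bound
    page = rows[offset: offset + limit]
    return {
        "items": page,
        "has_more": offset + limit < len(rows),
        "next_offset": offset + len(page),
        "total_count": len(rows),
    }
```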
## Repo conventions
- Put MCP tool specs in `mcp/` (schemas, examples, fixtures).
- Provide at least 1 “golden path” example call per tool.
- Add an eval that checks:
  - tool names follow discovery convention
  - args are flat + typed
  - responses are concise + stable
  - pagination works


@@ -0,0 +1,40 @@
# FastMCP 3 Patterns (Providers + Transforms)
Use this skill when you are building MCP servers in Python and want:
- composable tool sets
- per-user/per-session behavior
- auth, versioning, observability, and long-running tasks
## Mental model (FastMCP 3)
FastMCP 3 treats everything as three composable primitives:
- **Components**: what you expose (tools, resources, prompts)
- **Providers**: where components come from (decorators, files, OpenAPI, remote MCP, etc.)
- **Transforms**: how you reshape what clients see (namespace, filters, auth, versioning, visibility)
## Recommended architecture for Marc's platform
Build a **single “Cyber MCP Gateway”** that composes providers:
- LocalProvider: core cyber tools (run hunt, parse triage, generate report)
- OpenAPIProvider: wrap stable internal APIs (ticketing, asset DB) without 1:1 endpoint exposure
- ProxyProvider/FastMCPProvider: mount sub-servers (e.g., Velociraptor tools, Intel feeds)
Then apply transforms:
- Namespace per domain: `hunt.*`, `intel.*`, `pad.*`
- Visibility per session: hide dangerous tools unless user/role allows
- VersionFilter: keep old clients working while you evolve tools
## Production must-haves
- **Tool timeouts**: never let a tool hang forever
- **Pagination**: all list tools must be bounded
- **Background tasks**: use for long hunts / ingest jobs
- **Tracing**: emit OpenTelemetry traces so you can debug agent/tool behavior
## Auth rules
- Prefer component-level auth for “dangerous” tools.
- Default stance: read-only tools visible; write/execute tools gated.
## Versioning rules
- Version your components when you change schemas or semantics.
- Keep 1 previous version callable during migrations.
## Upgrade guidance
FastMCP 3 is in beta; pin to v2 for stability in production until you've tested.


@@ -0,0 +1,27 @@
Param(
  [Parameter(Mandatory=$true)][string]$RepoPath
)
$ErrorActionPreference = "Stop"
$RootDir = (Resolve-Path (Join-Path $PSScriptRoot "..")).Path
$Target = (Resolve-Path $RepoPath).Path
New-Item -ItemType Directory -Force -Path (Join-Path $Target ".claude\agents") | Out-Null
New-Item -ItemType Directory -Force -Path (Join-Path $Target "SKILLS") | Out-Null
Copy-Item -Force (Join-Path $RootDir "AGENTS.md") (Join-Path $Target "AGENTS.md")
if (Test-Path (Join-Path $RootDir "SKILLS.md")) {
  Copy-Item -Force (Join-Path $RootDir "SKILLS.md") (Join-Path $Target "SKILLS.md")
}
Copy-Item -Recurse -Force (Join-Path $RootDir "SKILLS\*") (Join-Path $Target "SKILLS")
Copy-Item -Recurse -Force (Join-Path $RootDir ".claude\agents\*") (Join-Path $Target ".claude\agents")
if (-not (Test-Path (Join-Path $Target ".gitlab-ci.yml")) -and (Test-Path (Join-Path $RootDir ".gitlab-ci.yml"))) {
  Copy-Item -Force (Join-Path $RootDir ".gitlab-ci.yml") (Join-Path $Target ".gitlab-ci.yml")
}
if (-not (Test-Path (Join-Path $Target ".github")) -and (Test-Path (Join-Path $RootDir ".github"))) {
  Copy-Item -Recurse -Force (Join-Path $RootDir ".github") (Join-Path $Target ".github")
}
Write-Host "Bootstrapped repo: $Target"
Write-Host "Next: wire DoD gates to your stack and run scripts\dod.ps1"

scripts/bootstrap_repo.sh Normal file

@@ -0,0 +1,33 @@
#!/usr/bin/env bash
set -euo pipefail
# Copy backbone files into an existing repo directory.
# Usage: ./scripts/bootstrap_repo.sh /path/to/repo
TARGET="${1:-}"
if [[ -z "$TARGET" ]]; then
  echo "Usage: $0 /path/to/repo"
  exit 2
fi
SRC_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
mkdir -p "$TARGET/.claude/agents"
mkdir -p "$TARGET/SKILLS"
# Copy minimal backbone (adjust to taste)
cp -f "$SRC_DIR/AGENTS.md" "$TARGET/AGENTS.md"
cp -f "$SRC_DIR/SKILLS.md" "$TARGET/SKILLS.md" || true
# trailing "/." copies directory *contents* portably on both GNU and BSD cp
cp -rf "$SRC_DIR/SKILLS/." "$TARGET/SKILLS/" || true
cp -rf "$SRC_DIR/.claude/agents/." "$TARGET/.claude/agents/" || true
# Optional: CI templates
if [[ ! -f "$TARGET/.gitlab-ci.yml" && -f "$SRC_DIR/.gitlab-ci.yml" ]]; then
  cp -f "$SRC_DIR/.gitlab-ci.yml" "$TARGET/.gitlab-ci.yml"
fi
if [[ ! -d "$TARGET/.github" && -d "$SRC_DIR/.github" ]]; then
  cp -rf "$SRC_DIR/.github" "$TARGET/.github"
fi
echo "Bootstrapped repo: $TARGET"
echo "Next: wire DoD gates to your stack (npm/pip) and run scripts/dod.sh"

scripts/dod.ps1 Normal file

@@ -0,0 +1,42 @@
$ErrorActionPreference = "Stop"
Write-Host "== DoD Gate =="
$root = Split-Path -Parent $PSScriptRoot
Set-Location $root
function Has-Command($name) {
  return $null -ne (Get-Command $name -ErrorAction SilentlyContinue)
}
$hasNode = Test-Path ".\package.json"
$hasPy = (Test-Path ".\pyproject.toml") -or (Test-Path ".\requirements.txt") -or (Test-Path ".\requirements-dev.txt")
if ($hasNode) {
  if (-not (Has-Command "npm")) { throw "npm not found" }
  Write-Host "+ npm ci"; npm ci
  $pkg = Get-Content ".\package.json" | ConvertFrom-Json
  if ($pkg.scripts.lint) { Write-Host "+ npm run lint"; npm run lint }
  if ($pkg.scripts.typecheck) { Write-Host "+ npm run typecheck"; npm run typecheck }
  if ($pkg.scripts.test) { Write-Host "+ npm test"; npm test }
  if ($pkg.scripts.build) { Write-Host "+ npm run build"; npm run build }
}
if ($hasPy) {
  if (-not (Has-Command "python")) { throw "python not found" }
  Write-Host "+ python -m pip install -U pip"; python -m pip install -U pip
  if (Test-Path ".\requirements.txt") { Write-Host "+ pip install -r requirements.txt"; pip install -r requirements.txt }
  if (Test-Path ".\requirements-dev.txt") { Write-Host "+ pip install -r requirements-dev.txt"; pip install -r requirements-dev.txt }
  if (Has-Command "ruff") {
    Write-Host "+ ruff check ."; ruff check .
    Write-Host "+ ruff format --check ."; ruff format --check .
  }
  if (Has-Command "pytest") { Write-Host "+ pytest -q"; pytest -q }
}
if (-not $hasNode -and -not $hasPy) {
  Write-Host "No package.json or Python dependency files detected."
  Write-Host "Customize scripts\dod.ps1 for this repo stack."
}
Write-Host "DoD PASS"

scripts/dod.sh Normal file

@@ -0,0 +1,43 @@
#!/usr/bin/env bash
set -euo pipefail
echo "== DoD Gate =="
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
cd "$ROOT"
fail() { echo "DoD FAIL: $1" >&2; exit 1; }
run() { echo "+ $*"; "$@"; }
has() { command -v "$1" >/dev/null 2>&1; }
HAS_NODE=0
HAS_PY=0
[[ -f package.json ]] && HAS_NODE=1
[[ -f pyproject.toml || -f requirements.txt || -f requirements-dev.txt ]] && HAS_PY=1
if [[ $HAS_NODE -eq 1 ]]; then
  has npm || fail "npm not found"
  run npm ci
  if has jq && jq -e '.scripts.lint' package.json >/dev/null 2>&1; then run npm run lint; fi
  if has jq && jq -e '.scripts.typecheck' package.json >/dev/null 2>&1; then run npm run typecheck; fi
  if has jq && jq -e '.scripts.test' package.json >/dev/null 2>&1; then run npm test; fi
  if has jq && jq -e '.scripts.build' package.json >/dev/null 2>&1; then run npm run build; fi
fi
if [[ $HAS_PY -eq 1 ]]; then
  has python3 || fail "python3 not found"
  run python3 -m pip install -U pip
  if [[ -f requirements.txt ]]; then run python3 -m pip install -r requirements.txt; fi
  if [[ -f requirements-dev.txt ]]; then run python3 -m pip install -r requirements-dev.txt; fi
  if has ruff; then
    run ruff check .
    run ruff format --check .
  fi
  # pytest exit code 5 means "no tests collected"; tolerate that, fail on real failures
  if has pytest; then
    run pytest -q || { rc=$?; [[ $rc -eq 5 ]] || fail "pytest failed (exit $rc)"; }
  fi
fi
if [[ $HAS_NODE -eq 0 && $HAS_PY -eq 0 ]]; then
  echo "No package.json or Python dependency files detected."
  echo "Customize scripts/dod.sh for this repo stack."
fi
echo "DoD PASS"

scripts/monday.ps1 Normal file

@@ -0,0 +1,56 @@
Param(
  [Parameter(Mandatory=$false)][string]$Command = "status",
  [Parameter(Mandatory=$false)][string]$RepoPath = ""
)
$ErrorActionPreference = "Stop"
$RootDir = (Resolve-Path (Join-Path $PSScriptRoot "..")).Path
Write-Host "== Dev Backbone Monday Runner =="
function Need-Cmd($name) {
  if (-not (Get-Command $name -ErrorAction SilentlyContinue)) {
    throw "Missing command: $name"
  }
}
switch ($Command) {
  "status" {
    $code = (Get-Command code -ErrorAction SilentlyContinue)
    $git = (Get-Command git -ErrorAction SilentlyContinue)
    $docker = (Get-Command docker -ErrorAction SilentlyContinue)
    Write-Host "[1] VS Code CLI:" ($code.Source ?? "NOT FOUND")
    Write-Host "[2] Git: " ($git.Source ?? "NOT FOUND")
    Write-Host "[3] Docker: " ($docker.Source ?? "NOT FOUND")
    Write-Host ""
    Write-Host "Profiles expected: Dev, Cyber, Infra"
    Write-Host "Try: code --list-extensions --profile Dev"
  }
  "vscode-purge" {
    Need-Cmd code
    if ($env:CONFIRM -ne "YES") {
      Write-Host "Refusing to uninstall extensions without CONFIRM=YES"
      Write-Host "Run: `$env:CONFIRM='YES'; .\scripts\monday.ps1 -Command vscode-purge"
      exit 2
    }
    & (Join-Path $RootDir "scripts\vscode_profiles.ps1") -Action purge
  }
  "vscode-install" {
    Need-Cmd code
    & (Join-Path $RootDir "scripts\vscode_profiles.ps1") -Action install
  }
  "repo-bootstrap" {
    if ([string]::IsNullOrWhiteSpace($RepoPath)) {
      throw "Usage: .\scripts\monday.ps1 -Command repo-bootstrap -RepoPath C:\path\to\repo"
    }
    & (Join-Path $RootDir "scripts\bootstrap_repo.ps1") -RepoPath $RepoPath
  }
  default {
    throw "Unknown command: $Command"
  }
}

scripts/monday.sh Normal file
@@ -0,0 +1,66 @@
#!/usr/bin/env bash
set -euo pipefail
# Monday Overhaul Runner (safe by default)
# Usage:
# ./scripts/monday.sh status
# ./scripts/monday.sh vscode-purge (requires CONFIRM=YES)
# ./scripts/monday.sh vscode-install
# ./scripts/monday.sh repo-bootstrap /path/to/repo
#
# Notes:
# - VS Code profile creation is easiest done once via the UI (Profiles: Create Profile).
# This script assumes those profiles already exist: Dev, Cyber, Infra.
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
echo "== Dev Backbone Monday Runner =="
echo "Repo: $ROOT_DIR"
echo
cmd="${1:-status}"
shift || true
need_cmd() {
command -v "$1" >/dev/null 2>&1 || { echo "Missing command: $1"; exit 1; }
}
case "$cmd" in
status)
echo "[1] VS Code CLI: $(command -v code || echo 'NOT FOUND')"
echo "[2] Git: $(command -v git || echo 'NOT FOUND')"
echo "[3] Docker: $(command -v docker || echo 'NOT FOUND')"
echo
echo "Profiles expected: Dev, Cyber, Infra"
echo "Try: code --list-extensions --profile Dev"
;;
vscode-purge)
need_cmd code
if [[ "${CONFIRM:-NO}" != "YES" ]]; then
echo "Refusing to uninstall extensions without CONFIRM=YES"
echo "Run: CONFIRM=YES ./scripts/monday.sh vscode-purge"
exit 2
fi
bash "$ROOT_DIR/scripts/vscode_profiles.sh" purge
;;
vscode-install)
need_cmd code
bash "$ROOT_DIR/scripts/vscode_profiles.sh" install
;;
repo-bootstrap)
repo_path="${1:-}"
if [[ -z "$repo_path" ]]; then
echo "Usage: ./scripts/monday.sh repo-bootstrap /path/to/repo"
exit 2
fi
bash "$ROOT_DIR/scripts/bootstrap_repo.sh" "$repo_path"
;;
*)
echo "Unknown command: $cmd"
exit 2
;;
esac

scripts/vscode_profiles.ps1 Normal file
@@ -0,0 +1,75 @@
Param(
[Parameter(Mandatory=$true)][ValidateSet("purge","install")][string]$Action
)
$ErrorActionPreference = "Stop"
function Profile-Exists([string]$ProfileName) {
# Native commands do not throw on a nonzero exit, so try/catch cannot detect a
# missing profile; check $LASTEXITCODE instead (mirrors the bash version).
& code --list-extensions --profile $ProfileName *> $null
return ($LASTEXITCODE -eq 0)
}
# Curated extension sets (edit to taste)
$DevExt = @(
"GitHub.copilot",
"GitHub.copilot-chat",
"GitHub.vscode-pull-request-github",
"dbaeumer.vscode-eslint",
"esbenp.prettier-vscode",
"ms-python.python",
"ms-python.vscode-pylance",
"ms-azuretools.vscode-docker",
"ms-vscode-remote.remote-ssh",
"ms-vscode-remote.remote-containers",
"redhat.vscode-yaml",
"yzhang.markdown-all-in-one"
)
$CyberExt = @($DevExt) # add more only if needed
$InfraExt = @(
"ms-azuretools.vscode-docker",
"ms-vscode-remote.remote-ssh",
"redhat.vscode-yaml",
"yzhang.markdown-all-in-one"
)
function Purge-Profile([string]$ProfileName) {
Write-Host "Purging extensions from profile: $ProfileName"
if (-not (Profile-Exists $ProfileName)) {
Write-Host "Profile not found: $ProfileName (create once via UI: Profiles: Create Profile)"
return
}
$exts = & code --list-extensions --profile $ProfileName
foreach ($ext in $exts) {
if ([string]::IsNullOrWhiteSpace($ext)) { continue }
& code --profile $ProfileName --uninstall-extension $ext | Out-Null
}
}
function Install-Profile([string]$ProfileName, [string[]]$Extensions) {
Write-Host "Installing extensions into profile: $ProfileName"
if (-not (Profile-Exists $ProfileName)) {
Write-Host "Profile not found: $ProfileName (create once via UI: Profiles: Create Profile)"
return
}
foreach ($ext in $Extensions) {
& code --profile $ProfileName --install-extension $ext | Out-Null
}
}
switch ($Action) {
"purge" {
Purge-Profile "Dev"
Purge-Profile "Cyber"
Purge-Profile "Infra"
}
"install" {
Install-Profile "Dev" $DevExt
Install-Profile "Cyber" $CyberExt
Install-Profile "Infra" $InfraExt
}
}

scripts/vscode_profiles.sh Normal file
@@ -0,0 +1,93 @@
#!/usr/bin/env bash
set -euo pipefail
# Manage extensions per VS Code profile.
# Requires profiles to exist: Dev, Cyber, Infra
# Actions:
# ./scripts/vscode_profiles.sh purge (uninstall ALL extensions from those profiles)
# ./scripts/vscode_profiles.sh install (install curated sets)
ACTION="${1:-}"
if [[ -z "$ACTION" ]]; then
echo "Usage: $0 {purge|install}"
exit 2
fi
need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing: $1"; exit 1; }; }
need code
# Curated extension sets (edit to taste)
DEV_EXT=(
"GitHub.copilot"
"GitHub.copilot-chat"
"GitHub.vscode-pull-request-github"
"dbaeumer.vscode-eslint"
"esbenp.prettier-vscode"
"ms-python.python"
"ms-python.vscode-pylance"
"ms-azuretools.vscode-docker"
"ms-vscode-remote.remote-ssh"
"ms-vscode-remote.remote-containers"
"redhat.vscode-yaml"
"yzhang.markdown-all-in-one"
)
CYBER_EXT=(
"${DEV_EXT[@]}"
# Add only if you truly use them:
# "ms-kubernetes-tools.vscode-kubernetes-tools"
)
INFRA_EXT=(
"ms-azuretools.vscode-docker"
"ms-vscode-remote.remote-ssh"
"redhat.vscode-yaml"
"yzhang.markdown-all-in-one"
# Optional:
# "hashicorp.terraform"
)
purge_profile() {
local profile="$1"
echo "Purging extensions from profile: $profile"
# list may fail if profile doesn't exist
if ! code --list-extensions --profile "$profile" >/dev/null 2>&1; then
echo "Profile not found: $profile (create once via UI: Profiles: Create Profile)"
return 0
fi
code --list-extensions --profile "$profile" | while read -r ext; do
[[ -z "$ext" ]] && continue
code --profile "$profile" --uninstall-extension "$ext" || true
done
}
install_profile() {
local profile="$1"; shift
local exts=("$@")
echo "Installing extensions into profile: $profile"
if ! code --list-extensions --profile "$profile" >/dev/null 2>&1; then
echo "Profile not found: $profile (create once via UI: Profiles: Create Profile)"
return 0
fi
for ext in "${exts[@]}"; do
[[ "$ext" =~ ^# ]] && continue
code --profile "$profile" --install-extension "$ext"
done
}
case "$ACTION" in
purge)
purge_profile "Dev"
purge_profile "Cyber"
purge_profile "Infra"
;;
install)
install_profile "Dev" "${DEV_EXT[@]}"
install_profile "Cyber" "${CYBER_EXT[@]}"
install_profile "Infra" "${INFRA_EXT[@]}"
;;
*)
echo "Unknown action: $ACTION"
exit 2
;;
esac


@@ -0,0 +1,7 @@
# Deep Agents Skills (Optional)
These follow the LangChain Deep Agents "skills" folder conventions.
If you use Deep Agents, point your agent runtime at this folder.
Each skill folder contains a SKILL.md and optional resources/scripts.
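The layout described above can be scaffolded in one pass; a minimal sketch (the `planner` folder name and the SKILL.md contents are illustrative, not mandated by Deep Agents):

```shell
# Scaffold one illustrative skill folder (names are examples, not requirements)
mkdir -p skills/planner/resources skills/planner/scripts
cat > skills/planner/SKILL.md <<'EOF'
---
title: Planner
when_to_use: Use for new features, refactors, or unclear requirements.
---
Output a plan with goal, steps, files to touch, and a test plan.
EOF
ls skills/planner   # lists SKILL.md plus resources/ and scripts/
```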


@@ -0,0 +1,14 @@
---
title: Planner
when_to_use: Use for new features, refactors, multi-file changes, or unclear requirements.
---
Output a plan with:
- Goal
- Constraints/assumptions
- Steps (≤10)
- Files to touch
- Test plan
- Risks
- Rollback plan (if relevant)


@@ -0,0 +1,18 @@
---
title: PR Reviewer
when_to_use: Use before merge, for risky diffs, repeated test failures, or security-sensitive changes.
---
You are a strict code reviewer. Output MUST follow this structure:
1) Summary (2-5 bullets)
2) Must-fix (blocking issues with file+line references if possible)
3) Nice-to-have (non-blocking improvements)
4) Risks (what could go wrong in prod)
5) Verification (exact commands to run + expected outcomes)
Rules:
- If anything is ambiguous, request the failing output/logs.
- Prefer minimal fixes over rewrites.
- Ensure DoD will pass.


@@ -0,0 +1,12 @@
---
title: UI/UX Specialist
when_to_use: Use for UI layout, component structure, accessibility, UX copy and flows.
---
Output:
- Component outline (hierarchy and props)
- State/events model
- UX copy suggestions (succinct)
- Accessibility checklist (keyboard, labels, focus, contrast)
- Acceptance criteria