Initial commit with dev backbone template

2026-03-01 06:00:21 -05:00 · 2026-02-10 16:36:30 -05:00
commit 4318c8f642
53 changed files with 3500 additions and 0 deletions
--- a/SKILLS/00-operating-model.md
+++ b/SKILLS/00-operating-model.md
@@ -0,0 +1,21 @@
+
+# Operating Model
+
+## Default cadence
+- Prefer iterative progress over big bangs.
+- Keep diffs small: target ≤ 300 changed lines per PR unless justified.
+- Update tests/docs as part of the same change when possible.
+
+## Working agreement
+- Start with a PLAN for non-trivial tasks.
+- Implement the smallest slice that satisfies acceptance criteria.
+- Verify via DoD.
+- Write a crisp PR summary: what changed, why, and how verified.
+
+## Stop conditions (plan first)
+Stop and produce a PLAN (do not code yet) if:
+- scope is unclear
+- more than 3 files will change
+- data model changes
+- auth/security boundaries are touched
+- performance-critical paths are affected
--- a/SKILLS/05-agent-taxonomy.md
+++ b/SKILLS/05-agent-taxonomy.md
@@ -0,0 +1,36 @@
+# Agent Types & Roles (Practical Taxonomy)
+
+Use this skill to choose the *right* kind of agent workflow for the job.
+
+## Common agent “types” (in practice)
+
+### 1) Chat assistant (no tools)
+Best for: explanations, brainstorming, small edits.
+Risk: can hallucinate; no grounding in repo state.
+
+### 2) Tool-using single agent
+Best for: well-scoped tasks where the agent can read/write files and run commands.
+Key control: strict DoD gates + minimal permissions.
+
+### 3) Planner + Executor (2-role pattern)
+Best for: medium complexity work (multi-file changes, feature work).
+Flow: Planner writes plan + acceptance criteria → Executor implements → a Reviewer pass checks the result against the acceptance criteria.
+
+### 4) Multi-agent (specialists)
+Best for: bigger features with separable workstreams (UI, backend, docs, tests).
+Rule: isolate context per role; use separate branches/worktrees.
+
+### 5) Supervisor / orchestrator
+Best for: long-running workflows with checkpoints (pipelines, report generation, PAD docs).
+Rule: supervisor delegates, enforces gates, and composes final output.
+
+## Decision rules (fast)
+- If you can describe it in ≤ 5 steps → single tool-using agent.
+- If you need tradeoffs/design → Planner + Executor.
+- If UI + backend + docs/tests all move → multi-agent specialists.
+- If it’s a pipeline that runs repeatedly → orchestrator.
+
+## Guardrails (always)
+- DoD is the truth gate.
+- Separate branches/worktrees for parallel work.
+- Log decisions + commands in AGENT_LOG.md.
--- a/SKILLS/10-definition-of-done.md
+++ b/SKILLS/10-definition-of-done.md
@@ -0,0 +1,24 @@
+
+# Definition of Done (DoD)
+
+A change is "done" only when:
+
+## Code correctness
+- Builds successfully (if applicable)
+- Tests pass
+- Linting/formatting passes
+- Types/checks pass (if applicable)
+
+## Quality
+- No new warnings introduced
+- Edge cases handled (inputs validated, errors meaningful)
+- Hot paths not regressed (if applicable)
+
+## Hygiene
+- No secrets committed
+- Docs updated if behavior or usage changed
+- PR summary includes verification steps
+
+## Commands
+- macOS/Linux: `./scripts/dod.sh`
+- Windows: `.\scripts\dod.ps1`
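The contents of `scripts/dod.sh` and `scripts/dod.ps1` are not shown in this section. As a rough sketch of how such a gate could be structured, the Python below runs a list of checks in order and stops at the first failure. The `run_gates` helper and the placeholder commands are hypothetical and are not the repo's actual script.

```python
import subprocess
import sys


def run_gates(gates):
    """Run each (name, command) gate in order.

    Returns the name of the first failing gate, or None if every gate passes.
    """
    for name, command in gates:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            # Say which gate failed and show its output so the fix is obvious.
            print(
                f"DoD gate failed: {name}\n{result.stdout}{result.stderr}",
                file=sys.stderr,
            )
            return name
        print(f"DoD gate passed: {name}")
    return None


# Placeholder gates (assumption): swap in the project's real build/test/lint commands.
EXAMPLE_GATES = [
    ("tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
    ("lint", [sys.executable, "-c", "import sys; sys.exit(0)"]),
]
```

Stopping at the first failure keeps the output short, which matters when an agent has to read it and decide what to fix next.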
--- a/SKILLS/20-repo-map.md
+++ b/SKILLS/20-repo-map.md
@@ -0,0 +1,16 @@
+
+# Repo Mapping Skill
+
+When entering a repo:
+1) Read README.md
+2) Identify entrypoints (app main / server startup / CLI)
+3) Identify config (env vars, .env.example, config files)
+4) Identify test/lint scripts (package.json, pyproject.toml, Makefile, etc.)
+5) Write a 10-line "repo map" in the PLAN before changing code
+
+Output format:
+- Purpose:
+- Key modules:
+- Data flow:
+- Commands:
+- Risks:
--- a/SKILLS/25-algorithms-performance.md
+++ b/SKILLS/25-algorithms-performance.md
@@ -0,0 +1,20 @@
+# Algorithms & Performance
+
+Use this skill when performance matters (large inputs, hot paths, or repeated calls).
+
+## Checklist
+- Identify the **state** you’re recomputing.
+- Add **memoization / caching** when the same subproblem repeats.
+- Prefer **linear scans** + caches over nested loops when possible.
+- If you can write it as a **recurrence**, you can test it: check the base cases, then small inputs with known answers.
+
+## Practical heuristics
+- Measure first when possible (timing + input sizes).
+- Optimize the biggest wins: avoid repeated I/O, repeated parsing, repeated network calls.
+- Keep caches bounded (size/TTL) and invalidate safely.
+- Choose data structures intentionally: dict/set for membership, heap for top-k, deque for queues.
+
+## Review notes (for PRs)
+- Call out accidental O(n²) patterns.
+- Suggest table/DP or memoization when repeated work is obvious.
+- Add tests that cover base cases + typical cases + worst-case size.
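The checklist above recommends memoizing repeated subproblems and keeping caches bounded. A minimal sketch using the standard library's `functools.lru_cache` follows. The grid-path problem is only an illustrative recurrence and does not come from the skill file.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)  # bounded cache: least-recently-used entries are evicted
def grid_paths(rows: int, cols: int) -> int:
    """Count right/down paths across a grid that is `rows` by `cols` cells.

    Recurrence: paths(r, c) = paths(r - 1, c) + paths(r, c - 1),
    with paths(0, c) = paths(r, 0) = 1 as the base cases.
    """
    if rows == 0 or cols == 0:
        return 1
    return grid_paths(rows - 1, cols) + grid_paths(rows, cols - 1)
```

Without the cache the same `(rows, cols)` pairs are recomputed exponentially often. With it, each pair is computed once. Tests can then target the base cases, a small typical case, and a larger size.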
--- a/SKILLS/26-vibe-coding-fundamentals.md
+++ b/SKILLS/26-vibe-coding-fundamentals.md
@@ -0,0 +1,31 @@
+# Vibe Coding With Fundamentals (Safety Rails)
+
+Use this skill when you’re using “vibe coding” (fast, conversational building) but want production-grade outcomes.
+
+## The good
+- Rapid scaffolding and iteration
+- Fast UI prototypes
+- Quick exploration of architectures and options
+
+## The failure mode
+- “It works on my machine” code with weak tests
+- Security foot-guns (auth, input validation, secrets)
+- Performance cliffs (accidental O(n²), repeated I/O)
+- Unmaintainable abstractions
+
+## Safety rails (apply every time)
+- Always start with acceptance criteria (what “done” means).
+- Prefer small PRs; never dump a huge AI diff.
+- Require DoD gates (lint/test/build) before merge.
+- Write tests for behavior changes.
+- For anything security/data related: do a Reviewer pass.
+
+## When to slow down
+- Auth/session/token work
+- Anything touching payments, PII, secrets
+- Data migrations/schema changes
+- Performance-critical paths
+- “It’s flaky” or “it only fails in CI”
+
+## Practical prompt pattern (use in PLAN)
+- “State assumptions, list files to touch, propose tests, and include rollback steps.”
--- a/SKILLS/27-performance-profiling.md
+++ b/SKILLS/27-performance-profiling.md
@@ -0,0 +1,31 @@
+# Performance Profiling (Bun/Node)
+
+Use this skill when:
+- a hot path feels slow
+- CPU usage is high
+- you suspect accidental O(n²) or repeated work
+- you need evidence before optimizing
+
+## Bun CPU profiling
+Bun supports CPU profiling via `--cpu-prof` (generates a `.cpuprofile` you can open in Chrome DevTools).
+
+Upcoming: `bun --cpu-prof-md <script>` outputs a CPU profile as **Markdown** so LLMs can read/grep it easily.
+
+### Workflow (Bun)
+1) Run the workload with profiling enabled
+   - Today: `bun --cpu-prof ./path/to/script.ts`
+   - Upcoming: `bun --cpu-prof-md ./path/to/script.ts`
+2) Save the output (or `.cpuprofile`) into `./profiles/` with a timestamp.
+3) Ask the Reviewer agent to:
+   - identify the top 5 hottest functions
+   - propose the smallest fix
+   - add a regression test or benchmark
+
+## Node CPU profiling (fallback)
+- `node --cpu-prof ./script.js` writes a `.cpuprofile` file.
+- Open in Chrome DevTools → Performance → Load profile.
+
+## Rules
+- Optimize based on measured hotspots, not vibes.
+- Prefer algorithmic wins (remove repeated work) over micro-optimizations.
+- Keep profiling artifacts out of git unless explicitly needed (use `.gitignore`).
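For step 3 of the workflow above (identifying the hottest functions), a small script can summarize a `.cpuprofile` before handing it to a reviewer. This sketch assumes the file follows the Chrome DevTools profile JSON shape, with a `nodes` list (each node carrying `id` and `callFrame`) and a `samples` list of node ids. That assumption is worth checking against a real file produced by Bun or Node. It counts samples per function as an approximation of self time and ignores `timeDeltas`. The helper names are hypothetical.

```python
import json
from collections import Counter


def hottest_functions(profile: dict, top_n: int = 5):
    """Return [(label, sample_count)] for the functions with the most self samples."""
    labels = {}
    for node in profile["nodes"]:
        frame = node["callFrame"]
        name = frame.get("functionName") or "(anonymous)"
        labels[node["id"]] = (
            f'{name} ({frame.get("url", "")}:{frame.get("lineNumber", -1)})'
        )
    counts = Counter(
        labels.get(node_id, f"<unknown node {node_id}>")
        for node_id in profile.get("samples", [])
    )
    return counts.most_common(top_n)


def load_profile(path: str) -> dict:
    """Read a .cpuprofile file (plain JSON) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Usage would look like `hottest_functions(load_profile("./profiles/<timestamp>.cpuprofile"))`, where the path is whatever was saved in step 2.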
--- a/SKILLS/30-implementation-rules.md
+++ b/SKILLS/30-implementation-rules.md
@@ -0,0 +1,16 @@
+
+# Implementation Rules
+
+## Change policy
+- Prefer edits over rewrites.
+- Keep changes localized.
+- One change = one purpose.
+- Avoid unnecessary abstraction.
+
+## Dependency policy
+- Default: do not add dependencies.
+- If adding: explain why, alternatives considered, and impact.
+
+## Error handling
+- Validate inputs at boundaries.
+- Error messages must be actionable: what failed + what to do next.
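The error-handling rules above (validate at boundaries, say what failed and what to do next) can be illustrated with a small sketch. The `parse_port` function and the `PORT` setting are hypothetical examples, not names from this repo.

```python
def parse_port(raw: str) -> int:
    """Validate a port number at the boundary (for example an env var or CLI flag)."""
    try:
        port = int(raw)
    except ValueError:
        # What failed + what to do next, without a raw stack trace.
        raise ValueError(
            f"PORT must be an integer, got {raw!r}. Set PORT to a number such as 8080."
        ) from None
    if not 1 <= port <= 65535:
        raise ValueError(
            f"PORT {port} is out of range. Choose a value between 1 and 65535."
        )
    return port
```

Code past this boundary can then treat the port as a valid `int` and skip re-checking it.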
--- a/SKILLS/40-testing-quality.md
+++ b/SKILLS/40-testing-quality.md
@@ -0,0 +1,14 @@
+
+# Testing & Quality
+
+## Strategy
+- If behavior changes: add/update tests.
+- Unit tests for logic; integration tests for boundaries; E2E only where needed.
+
+## Minimum for every PR
+- A test plan in the PR summary (even if “existing tests cover this”).
+- Run DoD.
+
+## Flaky tests
+- Capture repro steps.
+- Quarantine only with justification + follow-up issue.
--- a/SKILLS/50-pr-review.md
+++ b/SKILLS/50-pr-review.md
@@ -0,0 +1,16 @@
+
+# PR Review Skill
+
+Reviewer must check:
+- Correctness: does it do what it claims?
+- Safety: secrets, injection, auth boundaries
+- Maintainability: readability, naming, duplication
+- Tests: added/updated appropriately
+- DoD: did it pass?
+
+Reviewer output format:
+1) Summary
+2) Must-fix
+3) Nice-to-have
+4) Risks
+5) Verification suggestions
--- a/SKILLS/56-ui-material-ui.md
+++ b/SKILLS/56-ui-material-ui.md
@@ -0,0 +1,41 @@
+# Material UI (MUI) Design System
+
+Use this skill for any React/Next “portal/admin/dashboard” UI so you stay consistent and avoid random component soup.
+
+## Standard choice
+- Preferred UI library: **MUI (Material UI)**.
+- Prefer MUI components over ad-hoc HTML/CSS unless there’s a good reason.
+- One design system per repo (do not mix Chakra/Ant/Bootstrap/etc.).
+
+## Setup (Next.js/React)
+- Install: `@mui/material @emotion/react @emotion/styled`
+- If using icons: `@mui/icons-material`
+- If using data grid: `@mui/x-data-grid` (or pro if licensed)
+
+## Theming rules
+- Define a single theme (typography, spacing, palette) and reuse everywhere.
+- Use semantic colors (primary/secondary/error/warning/success/info), not hard-coded hex values scattered through components.
+- Prefer MUI’s `sx` for small styling; use `styled()` for reusable components.
+
+## “Portal” patterns (modals, popovers, menus)
+- Use MUI Dialog/Modal/Popover/Menu components instead of DIY portals.
+- Accessibility requirements:
+  - Focus is trapped in Dialog/Modal.
+  - Escape closes modal unless explicitly prevented.
+  - All inputs have labels; buttons have clear text/aria-labels.
+  - Keyboard navigation works end-to-end.
+
+## Layout conventions (for portals)
+- Use: AppBar + Drawer (or NavigationRail equivalent) + main content.
+- Keep pages as composition of small components: Page → Sections → Widgets.
+- Keep forms consistent: FormControl + helper text + validation messages.
+
+## Performance hygiene
+- Avoid re-render storms: memoize heavy lists; use virtualization for large tables (DataGrid).
+- Prefer server pagination for huge datasets.
+
+## PR review checklist
+- Theme is used (no random styling).
+- Components are MUI where reasonable.
+- Modal/popover accessibility is correct.
+- No mixed UI libraries.
--- a/SKILLS/60-security-safety.md
+++ b/SKILLS/60-security-safety.md
@@ -0,0 +1,15 @@
+
+# Security & Safety
+
+## Secrets
+- Never output secrets or tokens.
+- Never log sensitive inputs.
+- Never commit credentials.
+
+## Inputs
+- Validate external inputs at boundaries.
+- Fail closed for auth/security decisions.
+
+## Tooling
+- No destructive commands unless requested and scoped.
+- Prefer read-only operations first.
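"Fail closed" in the Inputs rules above means that any uncertainty in an auth decision results in a denial. A minimal sketch under that reading follows. The `is_authorized` function, its parameters, and the role names in the usage note are hypothetical.

```python
def is_authorized(user_id, action, lookup_roles, required_roles) -> bool:
    """Return True only when the check positively succeeds; every other path denies.

    lookup_roles: callable mapping a user id to an iterable of role names.
    required_roles: dict mapping an action name to the set of roles allowed to do it.
    """
    needed = required_roles.get(action)
    if not needed:
        return False  # unknown action: deny rather than assume it is harmless
    try:
        roles = lookup_roles(user_id)
    except Exception:
        return False  # role backend failed: deny rather than allow by accident
    return bool(set(needed) & set(roles))
```

For example, with `required_roles = {"run_hunt": {"analyst"}}`, a user whose role lookup raises an error is denied, and so is any request for an action missing from the mapping.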
--- a/SKILLS/70-docs-artifacts.md
+++ b/SKILLS/70-docs-artifacts.md
@@ -0,0 +1,13 @@
+
+# Docs & Artifacts
+
+Update documentation when:
+- setup steps change
+- env vars change
+- endpoints/CLI behavior changes
+- data formats change
+
+Docs standards:
+- Provide copy/paste commands
+- Provide expected outputs where helpful
+- Keep it short and accurate
--- a/SKILLS/80-mcp-tools.md
+++ b/SKILLS/80-mcp-tools.md
@@ -0,0 +1,11 @@
+
+# MCP Tools Skill (Optional)
+
+If this repo defines MCP servers/tools:
+
+Rules:
+- Tool calls must be explicit and logged.
+- Maintain an allowlist of tools; deny by default.
+- Every tool must have: purpose, inputs/outputs schema, examples, and tests.
+- Prefer idempotent tool operations.
+- Never add tools that can exfiltrate secrets without strict guards.
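The first two rules above (explicit, logged calls and a deny-by-default allowlist) might be enforced by a thin dispatch wrapper. This is a sketch in plain Python, not an MCP SDK API. `call_tool`, `ToolNotAllowed`, and the tool names in the usage note are hypothetical.

```python
import logging

logger = logging.getLogger("tool_calls")


class ToolNotAllowed(Exception):
    """Raised when a tool is missing from the allowlist or the registry."""


def call_tool(name, args, registry, allowlist):
    """Dispatch a tool call only if the tool is explicitly allowed."""
    tool = registry.get(name)
    if name not in allowlist or tool is None:
        logger.warning("denied tool call: %s", name)
        raise ToolNotAllowed(
            f"Tool {name!r} is not allowed. Allowed tools: {sorted(allowlist)}."
        )
    # Log argument names only, so sensitive values never reach the log.
    logger.info("tool call: %s args=%s", name, sorted(args))
    return tool(**args)
```

A call such as `call_tool("repo_read_file", {"path": "README.md"}, registry, {"repo_read_file"})` goes through. Any tool not named in the allowlist raises `ToolNotAllowed` with a message listing what is permitted.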
--- a/SKILLS/82-mcp-server-design.md
+++ b/SKILLS/82-mcp-server-design.md
@@ -0,0 +1,51 @@
+# MCP Server Design (Agent-First)
+
+Build MCP servers like you’re designing a UI for a non-human user.
+
+This skill distills Phil Schmid’s MCP server best practices into concrete repo rules.
+Source: “MCP is Not the Problem, It’s your Server” (Jan 21, 2026).
+
+## 1) Outcomes, not operations
+- Do **not** wrap REST endpoints 1:1 as tools.
+- Expose high-level, outcome-oriented tools.
+  - Bad: `get_user`, `list_orders`, `get_order_status`
+  - Good: `track_latest_order(email)` (server orchestrates internally)
+
+## 2) Flatten arguments
+- Prefer top-level primitives + constrained enums.
+- Avoid nested `dict`/config objects (agents hallucinate keys).
+- Defaults reduce decision load.
+
+## 3) Instructions are context
+- Tool docstrings are *instructions*:
+  - when to use the tool
+  - argument formatting rules
+  - what the return means
+- Error strings are also context:
+  - return actionable, self-correcting messages (not raw stack traces)
+
+## 4) Curate ruthlessly
+- Aim for **5–15 tools** per server.
+- One server, one job. Split by persona if needed.
+- Delete unused tools. Don’t dump raw data into context.
+
+## 5) Name tools for discovery
+- Avoid generic names (`create_issue`).
+- Prefer `{service}_{action}_{resource}`:
+  - `velociraptor_run_hunt`
+  - `github_list_prs`
+  - `slack_send_message`
+
+## 6) Paginate large results
+- Always support `limit` (default ~20–50).
+- Return metadata: `has_more`, `next_offset`, `total_count`.
+- Never return hundreds of rows unbounded.
+
+## Repo conventions
+- Put MCP tool specs in `mcp/` (schemas, examples, fixtures).
+- Provide at least 1 “golden path” example call per tool.
+- Add an eval that checks:
+  - tool names follow discovery convention
+  - args are flat + typed
+  - responses are concise + stable
+  - pagination works
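Two of the eval checks listed above lend themselves to a short sketch: the pagination envelope from section 6 and a structural check for the `{service}_{action}_{resource}` naming pattern from section 5. The function names, the `max_limit` cap, and the `items` key are assumptions. The name check verifies only that a name has at least three lowercase segments, so it cannot tell whether the first segment is really a service.

```python
import re

# At least three lowercase, underscore-separated segments.
TOOL_NAME = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+){2,}$")


def follows_naming_convention(name: str) -> bool:
    """Structural check for the {service}_{action}_{resource} pattern."""
    return TOOL_NAME.fullmatch(name) is not None


def paginate(rows, limit=20, offset=0, max_limit=50):
    """Return one bounded page plus the metadata an agent needs to continue."""
    limit = max(1, min(limit, max_limit))  # never return an unbounded result
    offset = max(0, offset)
    page = rows[offset:offset + limit]
    next_offset = offset + len(page)
    has_more = next_offset < len(rows)
    return {
        "items": page,
        "has_more": has_more,
        "next_offset": next_offset if has_more else None,
        "total_count": len(rows),
    }
```

Clamping `limit` on the server side means a caller asking for 500 rows still receives at most `max_limit`, with `has_more` and `next_offset` telling it how to fetch the rest.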
--- a/SKILLS/83-fastmcp-3-patterns.md
+++ b/SKILLS/83-fastmcp-3-patterns.md
@@ -0,0 +1,40 @@
+# FastMCP 3 Patterns (Providers + Transforms)
+
+Use this skill when you are building MCP servers in Python and want:
+- composable tool sets
+- per-user/per-session behavior
+- auth, versioning, observability, and long-running tasks
+
+## Mental model (FastMCP 3)
+FastMCP 3 treats everything as three composable primitives:
+- **Components**: what you expose (tools, resources, prompts)
+- **Providers**: where components come from (decorators, files, OpenAPI, remote MCP, etc.)
+- **Transforms**: how you reshape what clients see (namespace, filters, auth, versioning, visibility)
+
+## Recommended architecture for Marc’s platform
+Build a **single “Cyber MCP Gateway”** that composes providers:
+- LocalProvider: core cyber tools (run hunt, parse triage, generate report)
+- OpenAPIProvider: wrap stable internal APIs (ticketing, asset DB) without 1:1 endpoint exposure
+- ProxyProvider/FastMCPProvider: mount sub-servers (e.g., Velociraptor tools, Intel feeds)
+
+Then apply transforms:
+- Namespace per domain: `hunt.*`, `intel.*`, `pad.*`
+- Visibility per session: hide dangerous tools unless user/role allows
+- VersionFilter: keep old clients working while you evolve tools
+
+## Production must-haves
+- **Tool timeouts**: never let a tool hang forever
+- **Pagination**: all list tools must be bounded
+- **Background tasks**: use for long hunts / ingest jobs
+- **Tracing**: emit OpenTelemetry traces so you can debug agent/tool behavior
+
+## Auth rules
+- Prefer component-level auth for “dangerous” tools.
+- Default stance: read-only tools visible; write/execute tools gated.
+
+## Versioning rules
+- Version your components when you change schemas or semantics.
+- Keep 1 previous version callable during migrations.
+
+## Upgrade guidance
+FastMCP 3 is in beta; pin production deployments to v2 until you’ve tested v3 against your own servers.
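The "tool timeouts" must-have above can be sketched with the standard library alone. This is plain `asyncio`, not a FastMCP API, and FastMCP may offer its own timeout setting. `run_with_timeout` and the shape of the returned dict are hypothetical.

```python
import asyncio


async def run_with_timeout(tool_coro_factory, timeout_s: float):
    """Await a tool coroutine, converting a hang into an actionable error result.

    tool_coro_factory: zero-argument callable returning the coroutine to await.
    """
    try:
        result = await asyncio.wait_for(tool_coro_factory(), timeout=timeout_s)
        return {"ok": True, "result": result}
    except asyncio.TimeoutError:
        return {
            "ok": False,
            "error": (
                f"Tool timed out after {timeout_s}s. "
                "Retry with a narrower query or run it as a background task."
            ),
        }
```

The timeout comes back as a message rather than an exception, so the agent gets context it can act on, in line with the "error strings are also context" rule in the MCP server design skill.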