Initial commit with dev backbone template

2026-02-10 16:36:30 -05:00
commit 4318c8f642
53 changed files with 3500 additions and 0 deletions

@@ -0,0 +1,21 @@
# Operating Model
## Default cadence
- Prefer iterative progress over big bangs.
- Keep diffs small: target ≤ 300 changed lines per PR unless justified.
- Update tests/docs as part of the same change when possible.
## Working agreement
- Start with a PLAN for non-trivial tasks.
- Implement the smallest slice that satisfies acceptance criteria.
- Verify via DoD.
- Write a crisp PR summary: what changed, why, and how verified.
## Stop conditions (plan first)
Stop and produce a PLAN (do not code yet) if:
- scope is unclear
- more than 3 files will change
- data model changes
- auth/security boundaries
- performance-critical paths

@@ -0,0 +1,36 @@
# Agent Types & Roles (Practical Taxonomy)
Use this skill to choose the *right* kind of agent workflow for the job.
## Common agent “types” (in practice)
### 1) Chat assistant (no tools)
Best for: explanations, brainstorming, small edits.
Risk: can hallucinate; no grounding in repo state.
### 2) Tool-using single agent
Best for: well-scoped tasks where the agent can read/write files and run commands.
Key control: strict DoD gates + minimal permissions.
### 3) Planner + Executor (2-role pattern)
Best for: medium complexity work (multi-file changes, feature work).
Flow: Planner writes plan + acceptance criteria → Executor implements → Reviewer checks.
### 4) Multi-agent (specialists)
Best for: bigger features with separable workstreams (UI, backend, docs, tests).
Rule: isolate context per role; use separate branches/worktrees.
### 5) Supervisor / orchestrator
Best for: long-running workflows with checkpoints (pipelines, report generation, PAD docs).
Rule: supervisor delegates, enforces gates, and composes final output.
## Decision rules (fast)
- If you can describe it in ≤ 5 steps → single tool-using agent.
- If you need tradeoffs/design → Planner + Executor.
- If UI + backend + docs/tests all move → multi-agent specialists.
- If it's a pipeline that runs repeatedly → orchestrator.
## Guardrails (always)
- DoD is the truth gate.
- Separate branches/worktrees for parallel work.
- Log decisions + commands in AGENT_LOG.md.

@@ -0,0 +1,24 @@
# Definition of Done (DoD)
A change is "done" only when:
## Code correctness
- Builds successfully (if applicable)
- Tests pass
- Linting/formatting passes
- Types/checks pass (if applicable)
## Quality
- No new warnings introduced
- Edge cases handled (inputs validated, errors meaningful)
- Hot paths not regressed (if applicable)
## Hygiene
- No secrets committed
- Docs updated if behavior or usage changed
- PR summary includes verification steps
## Commands
- macOS/Linux: `./scripts/dod.sh`
- Windows: `.\scripts\dod.ps1`

SKILLS/20-repo-map.md

@@ -0,0 +1,16 @@
# Repo Mapping Skill
When entering a repo:
1) Read README.md
2) Identify entrypoints (app main / server startup / CLI)
3) Identify config (env vars, .env.example, config files)
4) Identify test/lint scripts (package.json, pyproject.toml, Makefile, etc.)
5) Write a 10-line "repo map" in the PLAN before changing code
Output format:
- Purpose:
- Key modules:
- Data flow:
- Commands:
- Risks:

@@ -0,0 +1,20 @@
# Algorithms & Performance
Use this skill when performance matters (large inputs, hot paths, or repeated calls).
## Checklist
- Identify the **state** you're recomputing.
- Add **memoization / caching** when the same subproblem repeats.
- Prefer **linear scans** + caches over nested loops when possible.
- If you can write it as a **recurrence**, you can test it.
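A minimal sketch of the checklist above, using a grid-path count as a stand-in problem (the problem and function names are illustrative assumptions, not part of this template). The naive function recomputes the same `(rows, cols)` state many times; the memoized one computes each state once, and the recurrence itself serves as the test oracle on small inputs.

```python
from functools import lru_cache


def count_paths_naive(rows: int, cols: int) -> int:
    """Count right/down paths through a grid; recomputes shared subproblems."""
    if rows == 1 or cols == 1:
        return 1
    return count_paths_naive(rows - 1, cols) + count_paths_naive(rows, cols - 1)


@lru_cache(maxsize=1024)  # bounded cache: least recently used entries are evicted
def count_paths(rows: int, cols: int) -> int:
    """Same recurrence, but each (rows, cols) state is computed only once."""
    if rows == 1 or cols == 1:
        return 1
    return count_paths(rows - 1, cols) + count_paths(rows, cols - 1)


# Check the fast version against the slow one on small inputs.
for r in range(1, 7):
    for c in range(1, 7):
        assert count_paths(r, c) == count_paths_naive(r, c)
```

The `maxsize` value is arbitrary here; pick a bound that matches the number of distinct states you expect.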
## Practical heuristics
- Measure first when possible (timing + input sizes).
- Optimize the biggest wins: avoid repeated I/O, repeated parsing, repeated network calls.
- Keep caches bounded (size/TTL) and invalidate safely.
- Choose data structures intentionally: dict/set for membership, heap for top-k, deque for queues.
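To make the data-structure heuristic concrete, here is a small sketch using only the Python standard library. The helper names and sample data are hypothetical; the point is which structure backs each operation.

```python
import heapq
from collections import deque


def top_k_slowest(durations_ms: dict, k: int) -> list:
    """Heap-based top-k: avoids sorting the whole mapping when k is small."""
    return heapq.nlargest(k, durations_ms.items(), key=lambda item: item[1])


def dedupe_preserving_order(items: list) -> list:
    """Set membership is O(1) on average, so the scan stays linear."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


def drain_in_order(jobs: list) -> list:
    """deque.popleft() is O(1); list.pop(0) would shift every element."""
    queue = deque(jobs)
    processed = []
    while queue:
        processed.append(queue.popleft())
    return processed
```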
## Review notes (for PRs)
- Call out accidental O(n²) patterns.
- Suggest table/DP or memoization when repeated work is obvious.
- Add tests that cover base cases + typical cases + worst-case size.

@@ -0,0 +1,31 @@
# Vibe Coding With Fundamentals (Safety Rails)
Use this skill when you're using “vibe coding” (fast, conversational building) but want production-grade outcomes.
## The good
- Rapid scaffolding and iteration
- Fast UI prototypes
- Quick exploration of architectures and options
## The failure mode
- “It works on my machine” code with weak tests
- Security foot-guns (auth, input validation, secrets)
- Performance cliffs (accidental O(n²), repeated I/O)
- Unmaintainable abstractions
## Safety rails (apply every time)
- Always start with acceptance criteria (what “done” means).
- Prefer small PRs; never dump a huge AI diff.
- Require DoD gates (lint/test/build) before merge.
- Write tests for behavior changes.
- For anything security/data related: do a Reviewer pass.
## When to slow down
- Auth/session/token work
- Anything touching payments, PII, secrets
- Data migrations/schema changes
- Performance-critical paths
- “It's flaky” or “it only fails in CI”
## Practical prompt pattern (use in PLAN)
- “State assumptions, list files to touch, propose tests, and include rollback steps.”

@@ -0,0 +1,31 @@
# Performance Profiling (Bun/Node)
Use this skill when:
- a hot path feels slow
- CPU usage is high
- you suspect accidental O(n²) or repeated work
- you need evidence before optimizing
## Bun CPU profiling
Bun supports CPU profiling via `--cpu-prof` (generates a `.cpuprofile` you can open in Chrome DevTools).
Upcoming: `bun --cpu-prof-md <script>` outputs a CPU profile as **Markdown** so LLMs can read/grep it easily.
### Workflow (Bun)
1) Run the workload with profiling enabled
   - Today: `bun --cpu-prof ./path/to/script.ts`
   - Upcoming: `bun --cpu-prof-md ./path/to/script.ts`
2) Save the output (or `.cpuprofile`) into `./profiles/` with a timestamp.
3) Ask the Reviewer agent to:
   - identify the top 5 hottest functions
   - propose the smallest fix
   - add a regression test or benchmark
## Node CPU profiling (fallback)
- `node --cpu-prof ./script.js` writes a `.cpuprofile` file.
- Open in Chrome DevTools → Performance → Load profile.
## Rules
- Optimize based on measured hotspots, not vibes.
- Prefer algorithmic wins (remove repeated work) over micro-optimizations.
- Keep profiling artifacts out of git unless explicitly needed (use `.gitignore`).

@@ -0,0 +1,16 @@
# Implementation Rules
## Change policy
- Prefer edits over rewrites.
- Keep changes localized.
- One change = one purpose.
- Avoid unnecessary abstraction.
## Dependency policy
- Default: do not add dependencies.
- If adding: explain why, alternatives considered, and impact.
## Error handling
- Validate inputs at boundaries.
- Error messages must be actionable: what failed + what to do next.
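A short sketch of both rules, assuming a hypothetical `PORT` setting read from the environment or CLI. Validation happens once at the boundary, and each message says what failed and what to do next.

```python
def parse_port(raw: str) -> int:
    """Validate a port value at the boundary and fail with an actionable message."""
    try:
        port = int(raw)
    except ValueError:
        # What failed + what to do next.
        raise ValueError(
            f"PORT must be an integer, got {raw!r}. Set PORT to a number such as 8080."
        ) from None
    if not 1 <= port <= 65535:
        raise ValueError(
            f"PORT {port} is out of range. Choose a value between 1 and 65535."
        )
    return port
```

Code past this boundary can then treat the port as a valid `int` without re-checking it.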

@@ -0,0 +1,14 @@
# Testing & Quality
## Strategy
- If behavior changes: add/update tests.
- Unit tests for logic; integration tests for boundaries; E2E only where needed.
## Minimum for every PR
- A test plan in the PR summary (even if “existing tests cover this”).
- Run DoD.
## Flaky tests
- Capture repro steps.
- Quarantine only with justification + follow-up issue.

SKILLS/50-pr-review.md

@@ -0,0 +1,16 @@
# PR Review Skill
Reviewer must check:
- Correctness: does it do what it claims?
- Safety: secrets, injection, auth boundaries
- Maintainability: readability, naming, duplication
- Tests: added/updated appropriately
- DoD: did it pass?
Reviewer output format:
1) Summary
2) Must-fix
3) Nice-to-have
4) Risks
5) Verification suggestions

@@ -0,0 +1,41 @@
# Material UI (MUI) Design System
Use this skill for any React/Next “portal/admin/dashboard” UI so you stay consistent and avoid random component soup.
## Standard choice
- Preferred UI library: **MUI (Material UI)**.
- Prefer MUI components over ad-hoc HTML/CSS unless there's a good reason.
- One design system per repo (do not mix Chakra/Ant/Bootstrap/etc.).
## Setup (Next.js/React)
- Install: `@mui/material @emotion/react @emotion/styled`
- If using icons: `@mui/icons-material`
- If using data grid: `@mui/x-data-grid` (or pro if licensed)
## Theming rules
- Define a single theme (typography, spacing, palette) and reuse everywhere.
- Use semantic colors (primary/secondary/error/warning/success/info), not hard-coded hex everywhere.
- Prefer MUI's `sx` prop for small styling; use `styled()` for reusable components.
## “Portal” patterns (modals, popovers, menus)
- Use MUI Dialog/Modal/Popover/Menu components instead of DIY portals.
- Accessibility requirements:
  - Focus is trapped in Dialog/Modal.
  - Escape closes modal unless explicitly prevented.
  - All inputs have labels; buttons have clear text/aria-labels.
  - Keyboard navigation works end-to-end.
## Layout conventions (for portals)
- Use: AppBar + Drawer (or NavigationRail equivalent) + main content.
- Keep pages as composition of small components: Page → Sections → Widgets.
- Keep forms consistent: FormControl + helper text + validation messages.
## Performance hygiene
- Avoid re-render storms: memoize heavy lists; use virtualization for large tables (DataGrid).
- Prefer server pagination for huge datasets.
## PR review checklist
- Theme is used (no random styling).
- Components are MUI where reasonable.
- Modal/popover accessibility is correct.
- No mixed UI libraries.

@@ -0,0 +1,15 @@
# Security & Safety
## Secrets
- Never output secrets or tokens.
- Never log sensitive inputs.
- Never commit credentials.
## Inputs
- Validate external inputs at boundaries.
- Fail closed for auth/security decisions.
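One way to read "fail closed" is sketched below. The `POLICY` table, role names, and helper functions are hypothetical. Access is granted only on an explicit match, and any error inside a check counts as a denial.

```python
from typing import Callable, Optional

# Hypothetical role -> allowed actions table.
POLICY = {"admin": {"read", "write"}, "viewer": {"read"}}


def is_allowed(role: Optional[str], action: str) -> bool:
    """Allow only on an explicit match; unknown roles and actions are denied."""
    if not role:
        return False
    return action in POLICY.get(role, set())


def authorize(check: Callable[[], bool]) -> bool:
    """Run a check and treat any exception or non-True result as a denial."""
    try:
        return check() is True
    except Exception:
        return False
```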
## Tooling
- No destructive commands unless requested and scoped.
- Prefer read-only operations first.

@@ -0,0 +1,13 @@
# Docs & Artifacts
Update documentation when:
- setup steps change
- env vars change
- endpoints/CLI behavior changes
- data formats change
Docs standards:
- Provide copy/paste commands
- Provide expected outputs where helpful
- Keep it short and accurate

SKILLS/80-mcp-tools.md

@@ -0,0 +1,11 @@
# MCP Tools Skill (Optional)
If this repo defines MCP servers/tools:
Rules:
- Tool calls must be explicit and logged.
- Maintain an allowlist of tools; deny by default.
- Every tool must have: purpose, inputs/outputs schema, examples, and tests.
- Prefer idempotent tool operations.
- Never add tools that can exfiltrate secrets without strict guards.
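The allowlist and logging rules could be enforced with a thin dispatcher like the sketch below. It is plain Python, not an MCP SDK API. `ALLOWED_TOOLS`, `call_tool`, and the `registry` mapping are hypothetical names.

```python
import logging

logger = logging.getLogger("mcp.tools")

# Deny by default: only names listed here may be called.
ALLOWED_TOOLS = frozenset({"github_list_prs", "slack_send_message"})


def call_tool(name: str, registry: dict, **kwargs):
    """Dispatch a tool call only if the tool is allowlisted, and log the call."""
    if name not in ALLOWED_TOOLS:
        logger.warning("denied tool call: %s", name)
        raise PermissionError(
            f"Tool {name!r} is not on the allowlist. Add it to ALLOWED_TOOLS after review."
        )
    # Log argument names only, so sensitive values never reach the log.
    logger.info("tool call: %s args=%s", name, sorted(kwargs))
    return registry[name](**kwargs)
```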

@@ -0,0 +1,51 @@
# MCP Server Design (Agent-First)
Build MCP servers like you're designing a UI for a non-human user.
This skill distills Phil Schmid's MCP server best practices into concrete repo rules.
Source: “MCP is Not the Problem, It's your Server” (Jan 21, 2026).
## 1) Outcomes, not operations
- Do **not** wrap REST endpoints 1:1 as tools.
- Expose high-level, outcome-oriented tools.
  - Bad: `get_user`, `list_orders`, `get_order_status`
  - Good: `track_latest_order(email)` (server orchestrates internally)
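A sketch of what "server orchestrates internally" might look like for `track_latest_order(email)`. The in-memory `USERS` and `ORDERS` tables are made-up stand-ins for the backend calls that the three "bad" tools would have exposed one by one.

```python
# Hypothetical stand-ins for the user and order backends.
USERS = {"ada@example.com": {"id": 1}}
ORDERS = {
    1: [
        {"id": "A-100", "placed": "2026-01-03", "status": "delivered"},
        {"id": "A-101", "placed": "2026-01-19", "status": "shipped"},
    ]
}


def track_latest_order(email: str) -> str:
    """One outcome-oriented tool: user lookup, order listing, and status in one call."""
    user = USERS.get(email.strip().lower())
    if user is None:
        return f"No account found for {email!r}. Check the address and try again."
    orders = ORDERS.get(user["id"], [])
    if not orders:
        return "This account has no orders yet."
    # ISO dates sort correctly as strings.
    latest = max(orders, key=lambda order: order["placed"])
    return f"Order {latest['id']} placed {latest['placed']} is {latest['status']}."
```

The agent makes one call and receives a concise answer instead of chaining three calls and carrying IDs between them.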
## 2) Flatten arguments
- Prefer top-level primitives + constrained enums.
- Avoid nested `dict`/config objects (agents hallucinate keys).
- Defaults reduce decision load.
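As an illustration of flat arguments, the hypothetical tool below takes only top-level primitives, constrains one argument to an enum, and supplies defaults. `search_alerts` and its parameters are assumptions made for this sketch.

```python
from typing import Literal, get_args

Severity = Literal["low", "medium", "high"]


def search_alerts(query: str, severity: Severity = "high", limit: int = 20) -> dict:
    """Flat signature: no nested config dict whose keys an agent has to guess."""
    allowed = get_args(Severity)
    if severity not in allowed:
        raise ValueError(
            f"severity must be one of {list(allowed)}, got {severity!r}."
        )
    limit = max(1, min(limit, 50))  # keep the result size bounded
    return {"query": query, "severity": severity, "limit": limit}
```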
## 3) Instructions are context
- Tool docstrings are *instructions*:
  - when to use the tool
  - argument formatting rules
  - what the return means
- Error strings are also context:
  - return actionable, self-correcting messages (not raw stack traces)
## 4) Curate ruthlessly
- Aim for **5–15 tools** per server.
- One server, one job. Split by persona if needed.
- Delete unused tools. Don't dump raw data into context.
## 5) Name tools for discovery
- Avoid generic names (`create_issue`).
- Prefer `{service}_{action}_{resource}`:
  - `velociraptor_run_hunt`
  - `github_list_prs`
  - `slack_send_message`
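A naming check along these lines could back the convention. The regex and the `known_services` set are assumptions for this sketch; it only verifies the shape (at least three lowercase segments, starting with a known service), not whether the action and resource are meaningful.

```python
import re

# At least three snake_case segments: service, action, resource.
TOOL_NAME = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+){2,}$")


def follows_convention(name: str, known_services: set) -> bool:
    """Return True if the name looks like {service}_{action}_{resource}."""
    if not TOOL_NAME.match(name):
        return False
    service = name.split("_", 1)[0]
    return service in known_services
```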
## 6) Paginate large results
- Always support `limit` (default ~20–50).
- Return metadata: `has_more`, `next_offset`, `total_count`.
- Never return hundreds of rows unbounded.
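The pagination rules can be sketched as a small helper. The metadata keys `has_more`, `next_offset`, and `total_count` come from the list above; the function name, the `items` key, and the `max_limit` cap are assumptions.

```python
def paginate(rows: list, limit: int = 20, offset: int = 0, max_limit: int = 50) -> dict:
    """Return one bounded page plus the metadata an agent needs to continue."""
    limit = max(1, min(limit, max_limit))  # clamp oversized or invalid limits
    offset = max(0, offset)
    page = rows[offset:offset + limit]
    next_offset = offset + len(page)
    has_more = next_offset < len(rows)
    return {
        "items": page,
        "total_count": len(rows),
        "has_more": has_more,
        "next_offset": next_offset if has_more else None,
    }
```

A caller keeps requesting pages with `offset=next_offset` until `has_more` is false.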
## Repo conventions
- Put MCP tool specs in `mcp/` (schemas, examples, fixtures).
- Provide at least 1 “golden path” example call per tool.
- Add an eval that checks:
  - tool names follow discovery convention
  - args are flat + typed
  - responses are concise + stable
  - pagination works

@@ -0,0 +1,40 @@
# FastMCP 3 Patterns (Providers + Transforms)
Use this skill when you are building MCP servers in Python and want:
- composable tool sets
- per-user/per-session behavior
- auth, versioning, observability, and long-running tasks
## Mental model (FastMCP 3)
FastMCP 3 treats everything as three composable primitives:
- **Components**: what you expose (tools, resources, prompts)
- **Providers**: where components come from (decorators, files, OpenAPI, remote MCP, etc.)
- **Transforms**: how you reshape what clients see (namespace, filters, auth, versioning, visibility)
## Recommended architecture for Marc's platform
Build a **single “Cyber MCP Gateway”** that composes providers:
- LocalProvider: core cyber tools (run hunt, parse triage, generate report)
- OpenAPIProvider: wrap stable internal APIs (ticketing, asset DB) without 1:1 endpoint exposure
- ProxyProvider/FastMCPProvider: mount sub-servers (e.g., Velociraptor tools, Intel feeds)
Then apply transforms:
- Namespace per domain: `hunt.*`, `intel.*`, `pad.*`
- Visibility per session: hide dangerous tools unless user/role allows
- VersionFilter: keep old clients working while you evolve tools
## Production must-haves
- **Tool timeouts**: never let a tool hang forever
- **Pagination**: all list tools must be bounded
- **Background tasks**: use for long hunts / ingest jobs
- **Tracing**: emit OpenTelemetry traces so you can debug agent/tool behavior
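The tool-timeout rule can be illustrated with plain `asyncio`. This is a generic sketch, not FastMCP's own timeout mechanism; `run_with_timeout` and the two sample hunts are hypothetical.

```python
import asyncio


async def run_with_timeout(coro, seconds: float, tool_name: str) -> dict:
    """Await a tool coroutine, converting a timeout into an actionable error."""
    try:
        result = await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        return {
            "ok": False,
            "error": (
                f"{tool_name} timed out after {seconds}s. "
                "Narrow the query or run it as a background task."
            ),
        }
    return {"ok": True, "result": result}


async def fast_hunt() -> str:
    return "done"


async def slow_hunt() -> str:
    await asyncio.sleep(1)  # stands in for a long-running hunt
    return "done"


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(fast_hunt(), 0.5, "hunt_run")))
    print(asyncio.run(run_with_timeout(slow_hunt(), 0.01, "hunt_run")))
```

`asyncio.wait_for` cancels the wrapped coroutine on timeout, so the slow call does not keep running in the background.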
## Auth rules
- Prefer component-level auth for “dangerous” tools.
- Default stance: read-only tools visible; write/execute tools gated.
## Versioning rules
- Version your components when you change schemas or semantics.
- Keep 1 previous version callable during migrations.
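The "keep 1 previous version callable" rule is sketched below as a plain registry lookup. It does not use FastMCP's versioning features; `HANDLERS`, `KEEP_PREVIOUS`, and `resolve` are hypothetical names.

```python
from typing import Callable, Optional

# Hypothetical handlers keyed by (tool name, version).
HANDLERS = {
    ("report_generate", 1): lambda case_id: {"case": case_id},
    ("report_generate", 2): lambda case_id: {"case_id": case_id},
    ("report_generate", 3): lambda case_id: {"case_id": case_id, "format": "markdown"},
}
KEEP_PREVIOUS = 1  # how many older versions stay callable


def resolve(name: str, requested: Optional[int] = None) -> Callable:
    """Return the handler for the requested version, or the latest by default."""
    versions = sorted(v for (n, v) in HANDLERS if n == name)
    if not versions:
        raise LookupError(f"Unknown tool {name!r}.")
    supported = versions[-(KEEP_PREVIOUS + 1):]
    version = supported[-1] if requested is None else requested
    if version not in supported:
        raise LookupError(
            f"{name} v{version} is retired. Supported versions: {supported}."
        )
    return HANDLERS[(name, version)]
```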
## Upgrade guidance
FastMCP 3 is in beta; pin to v2 for stability in production until you've tested v3.