# ThreatHunt Update Log

## 2026-02-22: Full Auto-Processing Pipeline, Performance Fixes, DB Concurrency

### Auto-Processing Pipeline (Import-Time)
- **Problem**: Only HOST_INVENTORY ran on dataset upload. Triage, anomaly detection, keyword scanning, and IOC extraction were manual-only, effectively dead code.
- **Solution**: Wired ALL processing modules into the upload endpoint. On CSV import, 5 jobs are now auto-queued: TRIAGE, ANOMALY, KEYWORD_SCAN, IOC_EXTRACT, HOST_INVENTORY.
- **Startup reprocessing**: On backend boot, the app queries for datasets with no anomaly results and queues the full pipeline for them.
- **Completion tracking**: A pipeline completion callback updates `Dataset.processing_status` to `completed` or `completed_with_errors` once all 4 analysis jobs finish.
- **Triage chaining**: After triage completes, a HOST_PROFILE job is automatically queued for deep per-host LLM analysis.

### Artifact Classification (Was Dead Code)
- **Problem**: `classify_artifact()` in `artifact_classifier.py` existed but was never called.
- **Fix**: The upload endpoint now calls `classify_artifact(columns)` to identify Velociraptor artifact types (30+ fingerprints) and stores `artifact_type` on the dataset.

### Database Concurrency Fix
- **Problem**: SQLite with `StaticPool` means a single shared connection. Any long-running job (keyword scan, triage) blocked ALL other DB queries, freezing the entire app.
- **Fix**: Switched to `NullPool` so each async session gets its own connection, combined with WAL mode (`PRAGMA journal_mode=WAL`), a 30s busy timeout (`busy_timeout=30000`), and `synchronous=NORMAL` for concurrent reads during writes.

#### Modified: `backend/app/db/engine.py`
- `StaticPool` -> `NullPool` for SQLite
- Added `_set_sqlite_pragmas` event listener: WAL mode, 30s busy timeout, NORMAL sync
- Connection args: `timeout=60`, `check_same_thread=False`

### Triage Model Fix
- **Problem**: `triage.py` hardcoded `DEFAULT_FAST_MODEL = "qwen2.5-coder:7b-instruct-q4_K_M"`, which didn't exist on Roadrunner, causing 404 errors on every triage batch.
- **Fix**: Changed to `settings.DEFAULT_FAST_MODEL`, which resolves to `llama3.1:latest` (available on Roadrunner). Configurable via the `TH_DEFAULT_FAST_MODEL` env var.

### Host Profiler ClientID Fix
- **Problem**: Velociraptor ClientID-format hostnames (`C.82465a50d075ea20`) were sent to the LLM for profiling, producing empty or useless results.
- **Fix**: Added the regex filter `^C\.[0-9a-fA-F]{8,}$` to skip ClientID entries before profiling.

### Job Queue Expansion
- **Before**: 3 job types (TRIAGE, HOST_PROFILE, REPORT), 3 workers
- **After**: 8 job types, 5 workers, pipeline completion callbacks
- Added KEYWORD_SCAN and IOC_EXTRACT to the `JobType` enum
- Added `PIPELINE_JOB_TYPES` frozenset (TRIAGE, ANOMALY, KEYWORD_SCAN, IOC_EXTRACT)
- Added `_on_pipeline_job_complete` callback that updates `processing_status`
- Added `_handle_keyword_scan` using `KeywordScanner(db).scan()`
- Added `_handle_ioc_extract` using `extract_iocs_from_dataset()`
- Triage now chains HOST_PROFILE after completion

#### Modified: `backend/app/api/routes/datasets.py`
- Upload calls `classify_artifact(columns)` for artifact type detection
- Sets `artifact_type` and `processing_status="processing"` on create
- Queues 5 jobs: TRIAGE, ANOMALY, KEYWORD_SCAN, IOC_EXTRACT, HOST_INVENTORY
- `UploadResponse` includes `artifact_type`, `processing_status`, `jobs_queued`

#### Modified: `backend/app/main.py`
- Startup reprocessing: finds datasets with no `AnomalyResult` records and queues the full pipeline
- Marks reprocessed datasets as `processing_status="processing"`
- Logs a skip message when all datasets are already processed

### Network Map Performance Fix
- **Problem**: 163 hosts + 1121 connections created 528 total nodes (163 hosts plus 365 external IPs). The O(N^2) force simulation did 278,784 pairwise calculations (528^2) per animation frame, freezing the browser.
- **Fix**: 6 optimizations applied to `frontend/src/components/NetworkMap.tsx`:

| Fix | Detail |
|-----|--------|
| Cap external IPs | `MAX_EXTERNAL_NODES = 30` (was unlimited: 365) |
| Sampled simulation | For N > 150 nodes, sample 40 random partners per node instead of all N^2 pairs |
| Distance cutoff | Skip repulsion for pairs > 600px apart |
| Single redraw on hover | Was restarting the full animation loop on every mouse hover |
| Faster alpha decay | 0.97 -> 0.93 per frame (settles ~2x faster) |
| Lower initial energy | simAlpha 0.6 -> 0.3, sim steps 80 -> 60 |

### Test Results
- **79/79 backend tests passing** (0.72s)
- Both Docker containers healthy
- 21/21 frontend-facing endpoints return 200 OK through nginx

### Endpoint Verification (via nginx on port 3000)

| Endpoint | Status | Size |
|----------|--------|------|
| /api/agent/health | 200 | 522 B |
| /api/hunts | 200 | 259 B |
| /api/datasets?hunt_id=... | 200 | 23 KB |
| /api/datasets/{id}/rows | 200 | 144 KB |
| /api/analysis/anomalies/{id} | 200 | 104 KB |
| /api/analysis/iocs/{id} | 200 | 1.2 KB |
| /api/analysis/triage/{id} | 200 | 9.5 KB |
| /api/analysis/profiles/{hunt} | 200 | 177 KB |
| /api/network/host-inventory | 200 | 181 KB |
| /api/timeline/hunt/{hunt} | 200 | 351 KB |
| /api/keywords/themes | 200 | 23 KB |
| /api/playbooks/templates | 200 | 2.5 KB |
| /api/reports/hunt/{hunt} | 200 | 10.6 KB |
| /api/export/stix/{hunt} | 200 | 391 B |

---

## 2026-02-21: Feature Expansion, Dashboard Rewrite, Docker Deployment

### New Features Added
- **MITRE ATT&CK Matrix** (`/api/mitre/coverage`, `MitreMatrix.tsx`) - technique coverage visualization
- **Timeline View** (`/api/timeline/hunt/{hunt}`, `TimelineView.tsx`) - chronological event explorer
- **Playbook Manager** (`/api/playbooks`, `PlaybookManager.tsx`) - investigation playbook CRUD with templates
- **Saved Searches** (`/api/searches`, `SavedSearches.tsx`) - save and run named queries
- **STIX Export** (`/api/export/stix/{hunt}`) - STIX 2.1 bundle export for threat intel sharing

### DB Models Added
- `Playbook`, `PlaybookStep` - investigation playbook tracking
- `SavedSearch` - persisted named queries

### Dashboard & Correlation Rewrite
- `Dashboard.tsx` - rewritten with live stat cards, a dataset table, and processing status indicators
- `CorrelationView.tsx` - rewritten with a working correlation analysis UI
- `AgentPanel.tsx` - added SSE streaming for real-time agent responses

### Docker Deployment
- `Dockerfile.frontend` - added `TSC_COMPILE_ON_ERROR=true` for MUI X v8 compatibility
- `nginx.conf` - SSE proxy headers, 500 MB upload limit, 300s proxy timeout, SPA fallback
- Frontend healthcheck switched from `wget` to `curl` against 127.0.0.1

---

## 2026-02-20: Host-Centric Network Map & Analysis Platform

### Network Map Overhaul
- **Problem**: The Network Map showed 409 misclassified domain nodes (mostly process names like `svchost.exe`) and 0 hosts. No deduplication.
- **Root Cause**: IOC column detection misclassified `Fqdn` as a domain instead of a hostname; the `Name` column (process names) was wrongly tagged as a domain IOC.
- **Solution**: Created a host-centric inventory system that scans all datasets, groups rows by `Fqdn`/`ClientId`, and extracts IPs, users, OS, and network connections.

#### New Backend Files
- `host_inventory.py` - deduplicated host inventory builder with an in-memory cache, background job pattern (202 + polling), 5000-row batches
- `network.py` routes - `GET /api/network/host-inventory`, `/inventory-status`, `/rebuild-inventory`
- `ioc_extractor.py` - regex IOC extraction (IP, domain, hash, email, URL)
- `anomaly_detector.py` - embedding-based outlier detection via bge-m3
- `data_query.py` - natural language to structured query translation
- `load_balancer.py` - round-robin load balancer for Ollama LLM nodes
- `job_queue.py` - async job queue (initially 3 workers, 3 job types)
- `analysis.py` routes - 16 analysis endpoints

#### Frontend
- `NetworkMap.tsx` - Canvas 2D force-directed graph: HiDPI, node dragging, search, popover, module-level cache
- `AnalysisDashboard.tsx` - 6-tab analysis dashboard
- `client.ts` - `network.*` and `analysis.*` API namespaces

### Results (Radio Hunt - 20 Velociraptor datasets, 394K rows)

| Metric | Before | After |
|--------|--------|-------|
| Nodes shown | 409 misclassified domains | **163 unique hosts** |
| Hosts identified | 0 | **163** |
| With IP addresses | N/A | **48** (172.17.x.x LAN) |
| With logged-in users | N/A | **43** (real names only) |
| OS detected | None | **Windows 10** (inferred) |
| Deduplication | None | **Full** (by FQDN/ClientId) |

### LLM Infrastructure
- **Roadrunner** (100.110.190.11:11434): llama3.1:latest, qwen2.5-coder:7b, qwen2.5:14b, bge-m3 embeddings
- **Wile** (100.110.190.12:11434): llama3.1:70b-instruct-q4_K_M (heavy analysis)
- **Open WebUI** (ai.guapo613.beer): cluster management interface
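The completion tracking described in the 2026-02-22 "Job Queue Expansion" entry can be sketched in isolation. This is a minimal synchronous model, not the real async job queue: the `JobType` members and `PIPELINE_JOB_TYPES` come from the changelog, but the `PipelineTracker` class and its method names are illustrative.

```python
from enum import Enum

class JobType(Enum):
    """Job types named in the changelog (the real enum has 8 members)."""
    TRIAGE = "triage"
    ANOMALY = "anomaly"
    KEYWORD_SCAN = "keyword_scan"
    IOC_EXTRACT = "ioc_extract"
    HOST_PROFILE = "host_profile"
    HOST_INVENTORY = "host_inventory"
    REPORT = "report"

# The 4 analysis jobs that gate Dataset.processing_status.
PIPELINE_JOB_TYPES = frozenset(
    {JobType.TRIAGE, JobType.ANOMALY, JobType.KEYWORD_SCAN, JobType.IOC_EXTRACT}
)

class PipelineTracker:
    """Illustrative stand-in for the _on_pipeline_job_complete callback."""

    def __init__(self) -> None:
        self._results: dict[str, dict[JobType, bool]] = {}
        self.status: dict[str, str] = {}  # dataset_id -> processing_status

    def on_job_complete(self, dataset_id: str, job_type: JobType, ok: bool) -> None:
        if job_type not in PIPELINE_JOB_TYPES:
            return  # HOST_INVENTORY etc. do not gate pipeline completion
        seen = self._results.setdefault(dataset_id, {})
        seen[job_type] = ok
        if set(seen) == PIPELINE_JOB_TYPES:  # all 4 analysis jobs finished
            self.status[dataset_id] = (
                "completed" if all(seen.values()) else "completed_with_errors"
            )
```

A dataset whose four analysis jobs all succeed ends up `completed`; any failure among them yields `completed_with_errors`, and non-pipeline jobs are ignored.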
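A column-fingerprint classifier in the spirit of `classify_artifact(columns)` from the 2026-02-22 entry might look like the sketch below. The two fingerprints and the 0.6 overlap threshold are invented for illustration; the real module ships 30+ fingerprints.

```python
# Hypothetical fingerprints: sets of CSV column names typical of an artifact.
ARTIFACT_FINGERPRINTS: dict[str, set[str]] = {
    "Windows.System.Pslist": {"Pid", "Ppid", "Name", "Exe", "CommandLine"},
    "Windows.Network.Netstat": {"Pid", "Laddr", "Raddr", "Status"},
}

def classify_artifact(columns: list[str]):
    """Return the artifact type whose fingerprint best overlaps the columns."""
    cols = set(columns)
    best, best_score = None, 0.0
    for artifact, fingerprint in ARTIFACT_FINGERPRINTS.items():
        score = len(cols & fingerprint) / len(fingerprint)
        if score > best_score:
            best, best_score = artifact, score
    # Assumed threshold: require most of the fingerprint to be present.
    return best if best_score >= 0.6 else None
```

Scoring by fraction of the fingerprint matched (rather than of the upload's columns) keeps extra, unknown columns from penalizing a match.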
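The pragmas applied by `_set_sqlite_pragmas` in the "Database Concurrency Fix" can be demonstrated with the stdlib `sqlite3` module. In the real `engine.py` they are registered on a SQLAlchemy `connect` event and paired with `poolclass=NullPool`, so this is only a sketch of the per-connection settings.

```python
import sqlite3

def connect(path: str) -> sqlite3.Connection:
    """Open a SQLite connection with the pragmas from the changelog."""
    conn = sqlite3.connect(path, timeout=60, check_same_thread=False)
    conn.execute("PRAGMA journal_mode=WAL")    # readers no longer block the writer
    conn.execute("PRAGMA busy_timeout=30000")  # wait up to 30 s for a lock
    conn.execute("PRAGMA synchronous=NORMAL")  # safe under WAL, fewer fsyncs
    return conn
```

Note that WAL mode persists in the database file, while `busy_timeout` and `synchronous` are per-connection, which is why the listener must run on every new connection.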
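The "Host Profiler ClientID Fix" regex is straightforward to demonstrate; the filter function name below is illustrative, but the pattern is the one from the changelog.

```python
import re

# Velociraptor ClientID format: "C." followed by 8+ hex characters.
CLIENT_ID_RE = re.compile(r"^C\.[0-9a-fA-F]{8,}$")

def profilable_hosts(hostnames: list[str]) -> list[str]:
    """Drop ClientID-format names so only real hostnames reach the LLM profiler."""
    return [h for h in hostnames if not CLIENT_ID_RE.match(h)]
```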
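The sampled-simulation and distance-cutoff rows of the Network Map fix table can be sketched as below. The real code is TypeScript in `NetworkMap.tsx`; this Python rendering only shows the pair-selection logic, and the function names are illustrative. With 528 nodes, sampling caps the work at roughly 528 x 40 = 21,120 pair checks per frame instead of 278,784.

```python
import random

FULL_PAIRS_THRESHOLD = 150  # at or below this, do exact N^2 repulsion
SAMPLES_PER_NODE = 40       # random partners per node above the threshold
CUTOFF_PX = 600.0           # skip repulsion beyond this distance

def repulsion_partners(n: int, rng: random.Random) -> list[tuple[int, int]]:
    """Node pairs to evaluate: exact for small graphs, sampled for big ones."""
    if n <= FULL_PAIRS_THRESHOLD:
        return [(i, j) for i in range(n) for j in range(n) if i != j]
    pairs = []
    for i in range(n):
        for _ in range(SAMPLES_PER_NODE):
            j = rng.randrange(n)
            if j != i:
                pairs.append((i, j))
    return pairs

def within_cutoff(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """Distance cutoff: repulsion is skipped for pairs > 600 px apart."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    return dx * dx + dy * dy <= CUTOFF_PX * CUTOFF_PX
```

Random sampling trades exactness for bounded per-frame cost; at large N the approximation is visually indistinguishable once the layout settles.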
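The regex IOC extraction attributed to `ioc_extractor.py` in the 2026-02-20 entry can be approximated as follows. These are deliberately toy patterns (the IPv4 one, for example, accepts out-of-range octets); the real module's patterns for IP, domain, hash, email, and URL are stricter.

```python
import re

# Simplified stand-ins for the real extraction patterns.
IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "md5": re.compile(r"\b[0-9a-fA-F]{32}\b"),
    "url": re.compile(r"https?://[^\s\"']+"),
}

def extract_iocs(text: str) -> dict[str, list[str]]:
    """Run every pattern over the text and bucket matches by IOC kind."""
    return {kind: pat.findall(text) for kind, pat in IOC_PATTERNS.items()}
```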
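Finally, the round-robin behavior of `load_balancer.py` reduces to cycling over the node URLs listed under "LLM Infrastructure". The class below is a minimal sketch, ignoring health checks or per-model routing the real balancer may perform.

```python
from itertools import cycle

class OllamaBalancer:
    """Hand out Ollama node URLs in strict round-robin order."""

    def __init__(self, nodes: list[str]) -> None:
        self._cycle = cycle(nodes)

    def next_node(self) -> str:
        return next(self._cycle)

# Node URLs from the "LLM Infrastructure" section.
ROADRUNNER = "http://100.110.190.11:11434"
WILE = "http://100.110.190.12:11434"
```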