ThreatHunt — Session Update (February 19–20, 2026)
Version: 0.4.0
Summary
This document covers feature improvements and bug fixes across the ThreatHunt platform.
1. Network Map — Clickable Type Filters (IP / Host / Domain / URL)
Request: "IP, Host, Domain, URL on the network map… can I click on these and they would be filtered in or out"
Changes (NetworkMap.tsx):
- Legend chips (IP, Host, Domain, URL) are now clickable toggle filters
- Added `visibleTypes` state (`Set<NodeType>`) and `filteredGraph` `useMemo` that filters nodes + edges by visible types
- Active chips: filled background with type color, fully opaque
- Inactive chips: outlined with type color, dimmed at 50% opacity
- Each chip shows node count for that type, e.g. `IP (42)`
- At least one type must stay visible (can't disable all four)
- Stats chips (nodes/edges) update to reflect the filtered view
- All mouse handlers (pan, hover, click, wheel zoom) updated to use `filteredGraph` instead of raw `graph`
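The filter logic can be sketched independently of React. The component itself is TypeScript; this Python version only illustrates the idea, and the `Node`/`Edge` shapes, the `toggle_type` and `filter_graph` names, and the rule that an edge survives only when both endpoints are visible are all assumptions rather than the actual `NetworkMap.tsx` code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    type: str  # "ip", "host", "domain" or "url"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str


def toggle_type(visible: set, node_type: str) -> set:
    """Flip one type on or off, refusing to hide the last visible type."""
    if node_type in visible:
        if len(visible) == 1:
            return set(visible)  # at least one type must stay visible
        return visible - {node_type}
    return visible | {node_type}


def filter_graph(nodes, edges, visible):
    """Keep nodes of visible types and edges whose endpoints both survive (assumed rule)."""
    kept_nodes = [n for n in nodes if n.type in visible]
    kept_ids = {n.id for n in kept_nodes}
    kept_edges = [e for e in edges if e.source in kept_ids and e.target in kept_ids]
    return kept_nodes, kept_edges
```

The per-type counts shown on the chips would come from the unfiltered node list, while the stats chips would read the lengths of the two filtered lists.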
2. Network Map — Cleaner Nodes (Brighter Colors, 20% Smaller)
Request: "Clean up the icons, the colours are dull and the icons are too big in the map… shrink by 20%"
Changes (NetworkMap.tsx):
- Colors bumped to more saturated variants:
  - IP: `#60a5fa` → `#3b82f6`
  - Host: `#10b981` → `#22c55e`
  - Domain: `#f59e0b` → `#eab308`
  - URL: `#a78bfa` → `#8b5cf6`
- Node radius shrunk ~20%:
  - Max: 22 → 18
  - Base: 5 → 4
  - Multiplier: 2.0 → 1.6
3. Dataset Viewer — IOC Column Highlighting
Request: "I'm looking at one of the datasets and 4 IOCs are showing… but I can't see it… is there a way to highlight that"
Changes (DatasetViewer.tsx):
- IOC columns in the DataGrid are now visually highlighted
- Header: colored background tint + bold colored text + IOC type label (e.g. `src_ip ◆ IP`)
- Cells: subtle colored background + left border stripe
- Color-coded by IOC type matching the Network Map palette:
  - IP: blue (`#3b82f6`)
  - Hostname: green (`#22c55e`)
  - Domain: amber (`#eab308`)
  - URL: purple (`#8b5cf6`)
  - Hashes (MD5/SHA1/SHA256): rose (`#f43f5e`)
- Added `IOC_COLORS` mapping, `iocTypeFor()` helper, and dynamic `headerClassName`/`cellClassName` on DataGrid columns
- CSS-in-JS styles injected via DataGrid `sx` prop
4. AUP Scanner — "Social Media (Personal)" → "Social Media" Rename
Request: "On the AUP page there is Social Media (Personal) — can we remove personal, it's messing up with the formatting"
Changes (keyword_defaults.py):
- Default theme key renamed from `"Social Media (Personal)"` to `"Social Media"`
- Added rename migration in `seed_defaults()`: checks for old name in DB, if found renames via SQL UPDATE + commit before normal seeding
- Backend log confirmed: `Renamed AUP theme 'Social Media (Personal)' → 'Social Media'`
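A minimal sketch of the rename-before-seed step, using the standard-library `sqlite3` module so it runs without the project's stack. The table name `keyword_themes`, its columns, and the function name are hypothetical; the real `seed_defaults()` is not shown in these notes.

```python
import sqlite3

OLD_NAME = "Social Media (Personal)"
NEW_NAME = "Social Media"


def rename_theme_if_present(conn: sqlite3.Connection) -> bool:
    """Rename the old theme row if it exists; return True when a rename happened."""
    row = conn.execute(
        "SELECT id FROM keyword_themes WHERE name = ?", (OLD_NAME,)
    ).fetchone()
    if row is None:
        return False  # fresh or already-migrated DB: nothing to do
    conn.execute(
        "UPDATE keyword_themes SET name = ? WHERE id = ?", (NEW_NAME, row[0])
    )
    conn.commit()  # commit before normal seeding runs
    return True


# Hypothetical in-memory table standing in for the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keyword_themes (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("INSERT INTO keyword_themes (name) VALUES (?)", (OLD_NAME,))
first_run = rename_theme_if_present(conn)   # performs the rename
second_run = rename_theme_if_present(conn)  # no-op on later startups
```

Checking for the old name first keeps the step idempotent, so it is safe to run on every startup.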
5. AUP Scanner — Hunt Dropdown (previous session, deployed)
- Replaced individual dataset checkboxes with a hunt selector dropdown
- Selecting a hunt auto-loads and selects all its datasets
- Shows dataset/row counts below the dropdown
6. Network Map — Hunt-Scoped Interactive Map (previous session, deployed)
- Hunt selector dropdown loads only that hunt's datasets
- Enriched nodes with hostname, IP, OS metadata
- Click-to-inspect MUI Popover showing node details
- Zoom (wheel + buttons) and pan (drag) with full viewport transform
- Force-directed layout with co-occurrence edges
7. Agent Assist — "Failed to Fetch" Diagnosis
Request: "AI assist failed: Error: Failed to fetch"
Diagnosis:
- Backend agent endpoint works correctly (tested via PowerShell `Invoke-RestMethod` through nginx proxy)
- Health endpoint healthy: both LLM nodes (Wile + Roadrunner) available
- Extra fields sent by frontend (`mode`, `hunt_id`, `conversation_id`) are accepted by Pydantic v2 (ignored, not rejected)
- "Failed to fetch" was a transient browser-level network error, not a backend issue
- Response time ~5s from LLM, well within the nginx 120s proxy timeout
- Resolution: Hard refresh (Ctrl+Shift+R) resolves the issue
8. Performance Fix — /api/hunts Timeout (previous session)
- Root cause: `Dataset.rows` relationship had `lazy="selectin"`, causing SQLAlchemy to cascade-load every DatasetRow when listing hunts
- Fix: Changed `Dataset.rows` and `DatasetRow.annotations` to `lazy="noload"` in `backend/app/db/models.py`
- Result: Hunts endpoint returns instantly
Files Modified This Session
| File | Change |
|---|---|
| `frontend/src/components/NetworkMap.tsx` | Type filter toggles, brighter colors, smaller nodes |
| `frontend/src/components/DatasetViewer.tsx` | IOC column highlighting in DataGrid |
| `backend/app/services/keyword_defaults.py` | Theme rename + DB migration |
Deployment
All changes built and deployed via Docker Compose:
docker compose build --no-cache frontend
docker compose up -d frontend
docker compose build --no-cache backend
docker compose up -d backend
Git
Committed and pushed to GitHub (main branch):
92 files changed, 13050 insertions(+), 1097 deletions(-)
d0c9f88..9b98ab9 main -> main
9. Network Picture — Deduplicated Host Inventory (February 20, 2026)
Request: "While files are being dropped in, give me a clean clear network picture — like netstat. No 100 iterations of the same computer. Clean pic: hostname, IP, and any users logged into that machine."
What It Does
New Net Picture page that scans all datasets in a hunt and produces a one-row-per-host inventory table. If a host appears in 900 netstat rows, it shows once — with all unique IPs, usernames, OS versions, MACs, and ports aggregated via server-side set deduplication. Supports networks with up to 1000 hosts; no artificial caps on unique values per host.
Backend
New service — backend/app/services/network_inventory.py:
- `build_network_picture(db, hunt_id)` streams all dataset rows in batches of 1000
- Groups rows by `hostname` (case-insensitive), falls back to `src_ip`/`ip_address` if no hostname column
- Per host, aggregates into Python `set()`s: IPs, users, OS, MAC addresses, protocols, open ports, remote targets, dataset sources
- Tracks `connection_count` (total rows), `first_seen`/`last_seen` from timestamp columns
- Returns sorted by connection count descending
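The aggregation described above can be sketched as follows. This is not the code in `network_inventory.py`: the `build_inventory` name and the `username` column are assumptions, only IPs and users are tracked, and batching and timestamps are left out to keep it short.

```python
from collections import defaultdict


def build_inventory(rows):
    """Collapse raw rows into one entry per host, deduplicating values with sets."""
    hosts = defaultdict(lambda: {"ips": set(), "users": set(), "connection_count": 0})
    for row in rows:
        # Prefer hostname; fall back to an IP column when there is no hostname.
        key = row.get("hostname") or row.get("src_ip") or row.get("ip_address")
        if not key:
            continue
        host = hosts[key.lower()]  # case-insensitive grouping
        host["connection_count"] += 1
        if row.get("src_ip"):
            host["ips"].add(row["src_ip"])
        if row.get("username"):  # assumed column name
            host["users"].add(row["username"])
    # Busiest hosts first, matching the "sorted by connection count descending" note.
    return sorted(
        ({"hostname": name, **data} for name, data in hosts.items()),
        key=lambda h: h["connection_count"],
        reverse=True,
    )
```

Because every per-host value lives in a set, a host that appears in 900 rows still yields one entry with each IP and user listed once.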
New API route — backend/app/api/routes/network.py:
- `GET /api/network/picture?hunt_id={id}` → `NetworkPictureResponse`
- Response: `hosts[]` (each with hostname, ips, users, os, mac_addresses, protocols, open_ports, remote_targets, datasets, connection_count, first_seen, last_seen) + `summary` (total_hosts, total_connections, total_unique_ips, datasets_scanned)
Normalizer additions — backend/app/services/normalizer.py:
- Added `mac_address` canonical mapping: `mac`, `mac_address`, `physical_address`, `mac_addr`, `hw_addr`, `ethernet`
- Added `connection_state` canonical mapping: `state`, `status`, `tcp_state`, `conn_state`
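A rough sketch of how such alias lists can be turned into a lookup. The alias lists come from the notes above; the `CANONICAL_ALIASES` and `normalize_column` names and the header cleanup (lowercasing, spaces and hyphens to underscores) are assumptions, not the actual `normalizer.py` behaviour.

```python
CANONICAL_ALIASES = {
    "mac_address": ["mac", "mac_address", "physical_address", "mac_addr", "hw_addr", "ethernet"],
    "connection_state": ["state", "status", "tcp_state", "conn_state"],
}

# Invert the table once so each lookup is a single dict access.
ALIAS_TO_CANONICAL = {
    alias: canonical
    for canonical, aliases in CANONICAL_ALIASES.items()
    for alias in aliases
}


def normalize_column(name: str) -> str:
    """Map a raw CSV header to its canonical name; unknown headers pass through cleaned."""
    key = name.strip().lower().replace(" ", "_").replace("-", "_")
    return ALIAS_TO_CANONICAL.get(key, key)
```

For example, a header such as `Physical Address` would resolve to `mac_address`, while an unmapped header like `pid` is returned unchanged.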
Frontend
New component — frontend/src/components/NetworkPicture.tsx:
- Hunt selector dropdown → loads network picture from backend
- MUI Table with sortable columns: Hostname, IPs, Users, OS, MAC, Connections, Ports
- ChipList sub-component: shows first 5 values inline as coloured Chips; "+N more" badge expands to show all (no data hidden, just visual overflow control)
- Click any row → expand panel showing: remote targets, all open ports, protocols, MAC addresses, dataset sources, time range
- Search bar filters by hostname, IP, user, OS, or MAC
- Summary stat cards: total hosts, unique IPs, connections, datasets scanned
- Colour palette matches Network Map: IP blue, User green, OS amber, MAC purple, Port rose
App integration — frontend/src/App.tsx:
- New nav item: Net Picture with `DevicesIcon`, route `/netpicture`
- Import and route for `NetworkPicture` component
API client — frontend/src/api/client.ts:
- Added `HostEntry`, `PictureSummary`, `NetworkPictureResponse` interfaces
- Added `network.picture(huntId)` method
Files Modified
| File | Change |
|---|---|
| `backend/app/services/normalizer.py` | Added `mac_address` + `connection_state` mappings |
| `backend/app/services/network_inventory.py` | New: host aggregation service |
| `backend/app/api/routes/network.py` | New: `/api/network/picture` endpoint |
| `backend/app/main.py` | Registered network router |
| `frontend/src/api/client.ts` | Added network picture types + API method |
| `frontend/src/components/NetworkPicture.tsx` | New: host inventory table component |
| `frontend/src/App.tsx` | Added Nav item + route for Net Picture |
10. Test Data — Velociraptor Mock Network CSVs (February 20, 2026)
Request: "Create 10-15 different CSV files you'd expect from Velociraptor for a mock network with 50-100 hosts and random network traffic with a few keywords that would trigger the AUP."
What Was Created
12 realistic Velociraptor-style CSV files in backend/tests/test_csvs/, generated by a Python script (generate_test_csvs.py) with deterministic seed for reproducibility.
Mock Network:
- 82 hosts: 75 workstations (IT-WS, HR-WS, FIN-WS, SLS-WS, ENG-WS, LEG-WS, MKT-WS, EXEC-WS) + 7 servers (DC-01, DC-02, FILE-01, EXCH-01, WEB-01, SQL-01, PROXY-01)
- 14 users: 10 named users (jsmith, agarcia, bwilson, etc.) + 4 service accounts
- 3 subnets: `10.10.1.x`, `10.10.2.x`, `10.10.3.x`
- Domain: `acme.local`
- Time range: Feb 10–20, 2026
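The "deterministic seed" idea can be illustrated with a small generator. This is not `generate_test_csvs.py`: the function names, the reduced host list and the three columns are hypothetical, chosen only to show that a seeded `random.Random` instance reproduces the same CSV on every run.

```python
import csv
import io
import random


def generate_netstat_rows(seed: int, n_rows: int):
    """Build mock netstat-style rows; the same seed always yields the same rows."""
    rng = random.Random(seed)  # local RNG, independent of global random state
    hosts = [f"IT-WS-{i:03d}" for i in range(1, 6)] + ["DC-01", "FILE-01"]
    rows = []
    for _ in range(n_rows):
        rows.append({
            "hostname": rng.choice(hosts),
            "src_ip": f"10.10.{rng.randint(1, 3)}.{rng.randint(2, 254)}",
            "dst_port": rng.choice([53, 80, 443, 445, 3389]),
        })
    return rows


def to_csv(rows) -> str:
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Using a dedicated `random.Random(seed)` rather than the module-level functions means other code calling `random` cannot disturb the sequence.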
Files
| # | File | Rows | Velociraptor Artifact |
|---|---|---|---|
| 1 | `01_netstat_connections.csv` | 2,012 | Windows.Network.Netstat |
| 2 | `02_dns_queries.csv` | 2,964 | Windows.Network.DNS |
| 3 | `03_process_listing.csv` | 1,896 | Windows.System.Pslist |
| 4 | `04_network_interfaces.csv` | 94 | Windows.Network.Interfaces |
| 5 | `05_logged_in_users.csv` | 171 | Windows.Sys.Users |
| 6 | `06_scheduled_tasks.csv` | 410 | Windows.System.TaskScheduler |
| 7 | `07_browser_history.csv` | 1,586 | Windows.Application.Chrome.History |
| 8 | `08_sysmon_network.csv` | 2,517 | Sysmon Event ID 3 |
| 9 | `09_autoruns.csv` | 342 | Windows.Sys.AutoRuns |
| 10 | `10_logon_events.csv` | 1,604 | Windows Security (4624/4625) |
| 11 | `11_proxy_logs.csv` | 2,605 | Web proxy / filter |
| 12 | `12_file_listing.csv` | 421 | Windows.Search.FileFinder |
AUP Triggers Embedded
| Category | Examples |
|---|---|
| Gambling | bet365.com, pokerstars.com, draftkings.com |
| Gaming | steam.exe, discord.exe, steamcommunity.com, epicgameslauncher.exe |
| Streaming | netflix.com, hulu.com, spotify.exe, open.spotify.com |
| Downloads / Piracy | thepiratebay.org, 1337x.to, utorrent.exe, qbittorrent.exe, free_movie_2026.torrent |
| Adult Content | pornhub.com, onlyfans.com, xvideos.com |
| Social Media | facebook.com, tiktok.com, reddit.com |
| Job Search | indeed.com, glassdoor.com, linkedin.com/jobs |
| Shopping | amazon.com, ebay.com, shein.com |
AUP keywords appear in: DNS queries (~12%), browser history (~15%), proxy logs (~10%), process listing (~15% of hosts), autoruns (~20% of workstations), sysmon network (~10%), file listing (crack_photoshop.exe, keygen_v2.exe, free_movie_2026.torrent).
Files Added
| File | Purpose |
|---|---|
| `backend/tests/generate_test_csvs.py` | New: deterministic CSV generator script |
| `backend/tests/test_csvs/*.csv` (×12) | New: mock Velociraptor test data |
11. Phase 8 — Analyzer Framework & Alerts (February 20, 2026)
What It Does
Automated alerting engine with 6 built-in analyzers that scan dataset rows for suspicious activity and produce scored, MITRE-tagged alerts. Full alert lifecycle (new → acknowledged → in-progress → resolved / false-positive), bulk operations, and configurable alert rules.
Backend
New service — backend/app/services/analyzers.py (~350 lines):
- `BaseAnalyzer` ABC with `name`, `description`, and `async analyze(rows)` method
- 6 built-in analyzers:
  - EntropyAnalyzer: Shannon entropy on command_line/url/path fields, threshold 4.5, maps to T1027 (Obfuscated Files or Information)
  - SuspiciousCommandAnalyzer: 19 regex patterns (mimikatz, encoded PowerShell, schtasks, psexec, vssadmin, etc.) with per-pattern MITRE mappings
  - NetworkAnomalyAnalyzer: Beaconing detection (dst IP frequency), suspicious ports (4444, 5555, etc.), large transfers (>10 MB), maps to T1071/T1048/T1571
  - FrequencyAnomalyAnalyzer: Flags values <1% occurrence in process_name/username/event_type fields
  - AuthAnomalyAnalyzer: Brute force (>5 failed logins per user), unusual logon types (3=network, 10=RDP), maps to T1110/T1021
  - PersistenceAnalyzer: Registry Run keys, services, Winlogon, IFEO patterns, maps to T1547/T1543/T1546
- `run_all_analyzers(rows, analyzers?)`: runs all or selected analyzers, returns sorted `AlertCandidate` list
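As an illustration of the entropy check, here is a minimal Shannon-entropy sketch. The 4.5 threshold, the field names and the T1027 mapping come from the list above; the function names, the strict `>` comparison and the alert dict shape are assumptions, and the real `EntropyAnalyzer` is an async class rather than a plain function.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 4.5  # bits per character, per the analyzer description


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of a string in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def flag_high_entropy(rows, fields=("command_line", "url", "path")):
    """Return one candidate per field value whose entropy exceeds the threshold."""
    candidates = []
    for row in rows:
        for field_name in fields:
            value = row.get(field_name)
            if value and shannon_entropy(value) > ENTROPY_THRESHOLD:
                candidates.append({
                    "field": field_name,
                    "value": value,
                    "mitre_technique": "T1027",
                })
    return candidates
```

Short strings cannot exceed log2 of their length, so an ordinary command such as `cmd.exe /c dir` stays well under 4.5, while long Base64-style blobs tend to score above it.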
New API route — backend/app/api/routes/alerts.py (~300 lines):
- `GET /api/alerts`: list with filters (status, severity, analyzer, hunt_id, dataset_id)
- `GET /api/alerts/stats`: severity/status/analyzer/MITRE breakdowns
- `GET /api/alerts/{id}`, `PUT /api/alerts/{id}`, `DELETE /api/alerts/{id}`
- `POST /api/alerts/bulk-update`: bulk status changes on selected alert IDs
- `GET /api/alerts/analyzers/list`: available analyzer metadata
- `POST /api/alerts/analyze`: run analyzers on a dataset/hunt, auto-creates Alert records
- Alert Rules CRUD: `GET /api/alerts/rules/list`, `POST /api/alerts/rules`, `PUT /api/alerts/rules/{id}`, `DELETE /api/alerts/rules/{id}`
New ORM models — backend/app/db/models.py:
- `Alert`: 18 fields (id, title, description, severity, status, analyzer, score, evidence JSON, mitre_technique, tags JSON, hunt_id FK, dataset_id FK, case_id FK, assignee, acknowledged_at, resolved_at, timestamps; indexes on severity/status/hunt/dataset)
- `AlertRule`: 10 fields (id, name, description, analyzer, config JSON, severity_override, enabled, hunt_id FK, timestamps; index on analyzer)
Alembic migration — backend/alembic/versions/b4c2d3e5f6a7_add_alerts_and_alert_rules.py
Frontend
New component — frontend/src/components/AlertPanel.tsx (~350 lines):
- Alerts tab: MUI DataGrid with severity/status chips, score, MITRE technique; checkbox selection for bulk actions (acknowledge / resolve / false-positive); click → detail dialog with evidence JSON viewer
- Stats tab: Cards for total, per-severity counts, status/analyzer breakdowns, top MITRE techniques
- Rules tab: Rule cards with enable/disable switch, delete; create rule dialog with analyzer selector
- Selector bar: Hunt + Dataset pickers, "Run Analyzers" button, severity/status filters
API client — frontend/src/api/client.ts:
- Added `AlertData`, `AlertStats`, `AlertRuleData`, `AnalyzerInfo`, `AnalyzeResult` interfaces
- Added `alerts` namespace with full CRUD + analyze + rules methods
App integration — frontend/src/App.tsx:
- New nav item: Alerts with `NotificationsActiveIcon`, route `/alerts`
Files
| File | Change |
|---|---|
| `backend/app/services/analyzers.py` | New: 6 analyzers + runner |
| `backend/app/api/routes/alerts.py` | New: alerts API (CRUD, stats, bulk, rules) |
| `backend/app/db/models.py` | Added Alert + AlertRule models |
| `backend/alembic/versions/b4c2d3e5f6a7_...py` | New: alerts migration |
| `frontend/src/components/AlertPanel.tsx` | New: alerts dashboard |
| `frontend/src/api/client.ts` | Added alert types + alerts namespace |
| `frontend/src/App.tsx` | Added Alerts nav + route |
12. Phase 9 — Investigation Notebooks & Playbooks (February 20, 2026)
What It Does
Cell-based investigation notebooks (markdown / query / code) for documenting hunts, plus a playbook engine with 4 built-in response templates and step-by-step execution tracking.
Backend
New service — backend/app/services/playbook.py (~250 lines):
- `NotebookCell` dataclass + `validate_notebook_cells()` helper
- 4 built-in playbook templates:
  - Suspicious Process Investigation (6 steps): identify → process tree → network → analyzers → MITRE → document
  - Lateral Movement Hunt (5 steps): remote tools → auth anomaly → network anomaly → knowledge graph → escalate
  - Data Exfiltration Check (5 steps): large transfers → DNS → timeline → correlate → MITRE + document
  - Ransomware Triage (5 steps): indicators search → all analyzers → persistence → LLM deep → critical case
- Each step: order, title, description, action, action_config, expected_outcome
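Step-by-step execution tracking might look roughly like this. The step fields mirror the list above; the `PlaybookRun` methods and the status strings (`in_progress`, `completed`, `aborted`) are hypothetical stand-ins, not the actual service or ORM model.

```python
from dataclasses import dataclass, field


@dataclass
class PlaybookStep:
    order: int
    title: str
    description: str = ""
    action: str = ""
    action_config: dict = field(default_factory=dict)
    expected_outcome: str = ""


@dataclass
class PlaybookRun:
    playbook_name: str
    steps: list
    current_step: int = 0
    status: str = "in_progress"  # assumed status value
    step_results: list = field(default_factory=list)

    def complete_step(self, notes: str = "") -> None:
        """Record the current step's result and advance; finish after the last step."""
        if self.status != "in_progress":
            raise ValueError(f"run is {self.status}")
        step = self.steps[self.current_step]
        self.step_results.append({"order": step.order, "notes": notes})
        self.current_step += 1
        if self.current_step == len(self.steps):
            self.status = "completed"

    def abort(self) -> None:
        """Stop an active run; finished runs are left untouched."""
        if self.status == "in_progress":
            self.status = "aborted"
```

Keeping `current_step` and `total_steps` style counters on the run is what lets a UI render a progress bar without re-reading the template.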
New API route — backend/app/api/routes/notebooks.py (~280 lines):
- Notebook CRUD: `GET /api/notebooks`, `GET /api/notebooks/{id}`, `POST /api/notebooks`, `PUT /api/notebooks/{id}`, `DELETE /api/notebooks/{id}`
- Cell operations: `POST /api/notebooks/{id}/cells/upsert`, `DELETE /api/notebooks/{id}/cells/{cell_id}`
- Playbook endpoints: `GET /api/notebooks/playbooks/templates`, `GET /api/notebooks/playbooks/templates/{name}`, `POST /api/notebooks/playbooks/start`, `GET /api/notebooks/playbooks/runs`, `GET /api/notebooks/playbooks/runs/{id}`, `POST /api/notebooks/playbooks/runs/{id}/complete-step`, `POST /api/notebooks/playbooks/runs/{id}/abort`
New ORM models — backend/app/db/models.py:
- `Notebook`: 10 fields (id, title, description, cells JSON, hunt_id FK, case_id FK, owner_id FK, tags JSON, timestamps; index on hunt_id)
- `PlaybookRun`: 12 fields (id, playbook_name, status, current_step, total_steps, step_results JSON, hunt_id FK, case_id FK, started_by, timestamps, completed_at; indexes on hunt_id/status)
Alembic migration — backend/alembic/versions/c5d3e4f6a7b8_add_notebooks_and_playbook_runs.py
Frontend
New component — frontend/src/components/InvestigationNotebook.tsx (~280 lines):
- List view: Grid of notebook cards with title, description, cell count, tags, open/delete
- Detail view: Cell-based editor with markdown/query/code cell types
- Markdown cells render via ReactMarkdown + remarkGfm; code/query cells as monospace pre blocks
- Edit mode: multiline TextField with Ctrl+S save shortcut
- Cell operations: add (markdown/query/code), edit, save, delete, move up/down
New component — frontend/src/components/PlaybookManager.tsx (~280 lines):
- Templates tab: Grid of template cards with category chips, tags, step count; view detail (MUI Stepper showing all steps) or start new run
- Runs tab: List of active/completed runs with progress bars
- Active run view: Vertical MUI Stepper with step completion, notes field, skip/abort, LinearProgress
API client — frontend/src/api/client.ts:
- Added `NotebookCell`, `NotebookData`, `PlaybookTemplate`, `PlaybookStep`, `PlaybookTemplateDetail`, `PlaybookRunData` interfaces
- Added `notebooks` namespace (CRUD + cell ops) and `playbooks` namespace (templates, runs, step-complete, abort)
App integration — frontend/src/App.tsx:
- New nav items: Notebooks (`MenuBookIcon`, `/notebooks`) and Playbooks (`PlaylistPlayIcon`, `/playbooks`)
Files
| File | Change |
|---|---|
| `backend/app/services/playbook.py` | New: 4 playbook templates + cell validation |
| `backend/app/api/routes/notebooks.py` | New: notebooks + playbooks API |
| `backend/app/db/models.py` | Added Notebook + PlaybookRun models |
| `backend/alembic/versions/c5d3e4f6a7b8_...py` | New: notebooks migration |
| `frontend/src/components/InvestigationNotebook.tsx` | New: cell-based notebook editor |
| `frontend/src/components/PlaybookManager.tsx` | New: playbook template browser + run stepper |
| `frontend/src/api/client.ts` | Added notebook + playbook types + namespaces |
| `frontend/src/App.tsx` | Added Notebooks + Playbooks nav + routes |
13. Docker Build Fixes (February 20, 2026)
Several issues were resolved to get the full 22-page frontend and updated backend deployed in Docker Compose.
npm / TypeScript Fixes
| Issue | Fix |
|---|---|
| `npm ci` failing: "Missing: yaml@2.8.2 from lock file" | Installed `yaml@2`; `postcss-load-config` (from tailwindcss) required `^2.4.2` but only 1.10.2 was present |
| `GridRowSelectionModel` cast to `string[]` | MUI X DataGrid v8 changed to `{ type, ids: Set }`; fixed with `Array.from(model.ids)` |
| Recharts formatter type mismatch | Changed explicit `number`/`string` params to `any` in `Dashboard.tsx` |
| Missing declarations for `cytoscape-dagre` / `cytoscape-cola` | Created `frontend/src/declarations.d.ts` |
| Missing `HuntOut` type export | Added `export type HuntOut = Hunt` alias in `client.ts` |
| `cytoscape.Stylesheet` no longer exists | Renamed to `cytoscape.StylesheetStyle` in ProcessTree, StorylineGraph, KnowledgeGraph |
Backend Startup Fixes
| Issue | Fix |
|---|---|
| `init_db()` crash: "table cases already exists" | Rewrote to inspect existing tables and only create missing ones |
| Alembic migration replay on startup | DB had all tables (from init_db()) but Alembic was stamped at 98ab619418bc — ran alembic stamp head to sync to c5d3e4f6a7b8 |
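The "only create missing tables" approach from the `init_db()` row can be sketched as below. The notes say the real fix uses an inspector in `backend/app/db/engine.py`; this version swaps in the standard-library `sqlite3` module and a hypothetical two-table `SCHEMA` so it runs without dependencies, and is not the project's implementation.

```python
import sqlite3

# Hypothetical schema: table name -> DDL. The real models live in models.py.
SCHEMA = {
    "cases": "CREATE TABLE cases (id INTEGER PRIMARY KEY, title TEXT)",
    "alerts": "CREATE TABLE alerts (id INTEGER PRIMARY KEY, title TEXT)",
}


def existing_tables(conn: sqlite3.Connection) -> set:
    """Return the names of tables already present in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return {r[0] for r in rows}


def init_db(conn: sqlite3.Connection) -> list:
    """Create only the tables that are missing; return the names created."""
    present = existing_tables(conn)
    created = []
    for name, ddl in SCHEMA.items():
        if name not in present:
            conn.execute(ddl)
            created.append(name)
    conn.commit()
    return created
```

Running it twice in a row creates nothing the second time, which is the property that avoids the "table cases already exists" crash.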
Files Modified
| File | Change |
|---|---|
| `frontend/package.json` / `package-lock.json` | Added `yaml@2` dependency |
| `frontend/src/declarations.d.ts` | New: ambient module declarations |
| `frontend/src/api/client.ts` | Added `HuntOut` type alias |
| `frontend/src/components/AlertPanel.tsx` | Fixed `GridRowSelectionModel` handling |
| `frontend/src/components/Dashboard.tsx` | Fixed Recharts formatter types |
| `frontend/src/components/ProcessTree.tsx` | `Stylesheet` → `StylesheetStyle` |
| `frontend/src/components/StorylineGraph.tsx` | `Stylesheet` → `StylesheetStyle` |
| `frontend/src/components/KnowledgeGraph.tsx` | `Stylesheet` → `StylesheetStyle` |
| `backend/app/db/engine.py` | Safe `init_db()` with inspector |
Current Navigation (22 pages)
Dashboard · Hunts · Datasets · Upload · Agent · Analysis · Annotations · Hypotheses · Correlation · Network Map · Net Picture · Proc Tree · Storyline · Timeline · Search · MITRE Map · Knowledge · Cases · Alerts · Notebooks · Playbooks · AUP Scanner
Deployment
docker compose build
docker compose up -d
# Backend: http://localhost:8000 (healthy)
# Frontend: http://localhost:3000 (200 OK)
Alembic Migration Chain
9790f482da06 (initial schema)
↓
98ab619418bc (keyword themes + keywords)
↓
a3b1c2d4e5f6 (cases + activity logs)
↓
b4c2d3e5f6a7 (alerts + alert rules)
↓
c5d3e4f6a7b8 (notebooks + playbook runs) ← HEAD
14. Process Tree — HTTP 500 Fix (February 20, 2026)
Request: "process tree: HTTP 500"
Root Cause: MissingGreenlet error in _fetch_rows — async SQLAlchemy lazy-loading the dataset relationship on DatasetRow outside a greenlet context.
Fix (backend/app/services/process_tree.py):
- Added `selectinload(DatasetRow.dataset)` to the `_fetch_rows` query so the relationship is eagerly loaded within the async session
- Endpoint now returns data correctly (verified: 3,908 processes for full hunt)
15. Process Tree — UX Rewrite (February 20, 2026)
Request: "proctree view is horrible… there are 3908 processes and I can't see / zoom in to see and when I do click on one the resolution changes to show the data on the right. Ideally I'd like another dropdown to pick a host."
Complete rewrite of frontend/src/components/ProcessTree.tsx (~520 lines):
- Host dropdown: Autocomplete populated by extracting unique hostnames from tree data (82 hosts)
- Server-side hostname filtering: API `?hostname=` param reduces returned data per view
- Grid layout fallback: When edges < 10% of nodes (e.g. test data with no parent-child relationships), uses grid layout instead of dagre
- Overlay detail panel: Absolute-positioned panel on click — no longer reflows the graph
- ResizeObserver: Keeps Cytoscape canvas in sync with container size changes
- Search/highlight: Filter processes by name or PID
- Zoom controls: Fit, zoom-in, zoom-out buttons
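The grid-layout fallback rule from the list above reduces to a small decision function. The component is TypeScript and Cytoscape ships its own `grid` and `dagre` layouts, so this Python sketch only illustrates the 10% threshold; `choose_layout`, `grid_positions` and the 60-unit spacing are hypothetical.

```python
import math


def choose_layout(node_count: int, edge_count: int, ratio: float = 0.10) -> str:
    """Pick 'grid' when the graph is too sparse for a tree layout to be useful."""
    if node_count == 0:
        return "grid"
    return "grid" if edge_count < ratio * node_count else "dagre"


def grid_positions(node_count: int, spacing: int = 60):
    """Lay nodes out row by row in a roughly square grid of (x, y) positions."""
    cols = max(1, math.ceil(math.sqrt(node_count)))
    return [((i % cols) * spacing, (i // cols) * spacing) for i in range(node_count)]
```

With the mock data's 3,908 processes and no parent-child edges, the rule selects the grid, which avoids dagre stacking thousands of unconnected roots in a single rank.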
16. Process Tree — White Screen Crash Fix (February 20, 2026)
Request: "I picked a different host and the screen went white"
Root Cause: Cytoscape's destroy() removed <canvas> elements that were children of the same DOM node React was managing, causing NotFoundError: Failed to execute 'removeChild' on 'Node' when React tried to reconcile.
Fix (frontend/src/components/ProcessTree.tsx):
- Separated Cytoscape container into a plain `<div>` with zero React children
- Loading spinner and placeholder text are now sibling overlays (not children of the Cytoscape div)
- Added explicit `cyRef.current = null` after `destroy()`
- Cleanup on unmount prevents stale references
- Verified: switching between hosts (DC-01 → EXEC-WS-039 → ENG-WS-052) works with 0 console errors
17. LLM Analysis — Response Parsing Fix (February 20, 2026)
Request: "llm analysis: HTTP 500 — searching for evidence of adult site access"
Root Cause Chain:
- `AttributeError: 'dict' object has no attribute 'strip'`: `OllamaProvider.generate()` returns a dict (`{"response": "<llm text>", "model": ..., ...}`), but `_parse_analysis` expected a string.
- Wrong dict key extraction: Initial dict-handling fix looked for `raw.get("analysis")` but Ollama's `/api/generate` endpoint puts the LLM text in the `"response"` key.
- JSON parsing failures: The 70B model sometimes produces slightly malformed JSON (trailing commas, etc.) that `json.loads` rejects.
Fixes (backend/app/services/llm_analysis.py):
| Change | Detail |
|---|---|
| Dict response extraction | raw.get("response") instead of raw.get("analysis") — correctly extracts LLM text from Ollama API dict |
| Robust JSON parsing | New _extract_json_candidates() generator: tries full text, then outermost {…} block, then trailing-comma-fixed variant |
| Reduced prompt size | max_sample 50 → 20 rows, max_chars 8000 → 6000 |
| Reduced row limit | /llm-analyze endpoint loads 2000 rows (was 5000) |
| Timeout | asyncio.wait_for(..., timeout=300) wraps the LLM call — returns graceful "timed out" message instead of hanging |
| Debug logging | _parse_analysis logs extraction source, text length, parse success/failure with error details |
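The extraction order in the "Robust JSON parsing" row (full text, then outermost `{…}` block, then a trailing-comma-fixed variant) can be sketched like this. It is an approximation of the described behaviour, not the code in `llm_analysis.py`; the function names drop the leading underscore to mark them as illustrative, and the `summary` key used for the plain-text fallback is hypothetical.

```python
import json
import re


def extract_json_candidates(text: str):
    """Yield progressively more forgiving candidate strings to try parsing."""
    yield text.strip()  # 1. the full text
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        block = text[start:end + 1]
        yield block  # 2. outermost {...} block
        # 3. same block with commas before a closing brace/bracket removed
        yield re.sub(r",\s*([}\]])", r"\1", block)


def parse_analysis(raw):
    """Accept an Ollama-style dict or a plain string; return a dict either way."""
    text = raw.get("response", "") if isinstance(raw, dict) else raw
    for candidate in extract_json_candidates(text):
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return {"summary": text.strip()}  # plain-text fallback (assumed shape)
```

The trailing-comma regex is deliberately simple and could alter a string value that happens to contain `,}`; that trade-off seems acceptable for a last-resort candidate but is worth keeping in mind.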
Results:
- Quick mode (llama3.1:latest on roadrunner): 18s, confidence 0.85, risk score 65, 3 key findings — all structured fields populated
- Deep mode (llama3.1:70b on wile): 128–185s, full analysis returned — JSON parsing succeeds when model produces valid JSON, falls back to plain text otherwise
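The timeout wrapper listed in the fixes table can be illustrated with `asyncio.wait_for`. The helper name, the stand-in coroutines and the shape of the "timed out" result are assumptions; the notes only state that the call is wrapped with `timeout=300` and returns a graceful message.

```python
import asyncio


async def call_llm_with_timeout(llm_call, timeout: float):
    """Await llm_call(), returning a graceful message instead of hanging on timeout."""
    try:
        return await asyncio.wait_for(llm_call(), timeout=timeout)
    except asyncio.TimeoutError:
        return {"response": "LLM analysis timed out", "timed_out": True}


async def slow_llm():
    # Stand-in for a deep-mode call that takes too long.
    await asyncio.sleep(0.5)
    return {"response": "done"}


async def fast_llm():
    # Stand-in for a quick-mode call that returns immediately.
    return {"response": "done"}


slow_result = asyncio.run(call_llm_with_timeout(slow_llm, timeout=0.05))
fast_result = asyncio.run(call_llm_with_timeout(fast_llm, timeout=1.0))
```

`wait_for` cancels the inner task when the deadline passes, so the request handler is freed even if the model is still generating.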
18. Version Bump — 0.3.0 → 0.4.0 (February 20, 2026)
| File | Change |
|---|---|
| `backend/app/config.py` | `APP_VERSION` 0.3.0 → 0.4.0 |
| `frontend/package.json` | `version` 0.1.0 → 0.4.0 |
Files Modified This Session (Sections 14–18)
| File | Change |
|---|---|
| `backend/app/services/process_tree.py` | Added `selectinload(DatasetRow.dataset)` to `_fetch_rows` |
| `frontend/src/components/ProcessTree.tsx` | Complete rewrite: host dropdown, grid layout, overlay panel, Cytoscape DOM separation |
| `backend/app/services/llm_analysis.py` | Fixed dict response extraction, robust JSON parsing, reduced prompt, added timeout |
| `backend/app/api/routes/analysis.py` | Row limit 5000 → 2000 |
| `backend/app/config.py` | Version 0.3.0 → 0.4.0 |
| `frontend/package.json` | Version 0.1.0 → 0.4.0 |
Deployment
docker compose up -d --build backend
# Backend: http://localhost:8000 (healthy, v0.4.0)
# Frontend: http://localhost:3000 (200 OK)