ThreatHunt Analyst-Assist Agent Implementation
Overview
This implementation adds an analyst-assist agent to ThreatHunt that provides read-only guidance on CSV artifact data, analytical pivots, and hypotheses. The agent strictly adheres to the governance principles defined in goose-core/governance/AGENT_POLICY.md.
Architecture
Backend Stack
- Framework: FastAPI (Python 3.11)
- Agent Module:
  - backend/app/agents/core.py: ThreatHuntAgent class with guidance logic
  - backend/app/agents/providers.py: Pluggable LLM provider interface
  - backend/app/agents/config.py: Configuration management
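The pluggable interface in providers.py might look like the following sketch. The method names and the EchoProvider stand-in are illustrative assumptions, not the actual module API:

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Hypothetical base class each provider (local, networked, online)
    would implement."""

    @abstractmethod
    def is_available(self) -> bool:
        """Return True if this provider can currently serve requests."""

    @abstractmethod
    def complete(self, prompt: str, max_tokens: int = 1024) -> str:
        """Return the model's completion for the given prompt."""

class EchoProvider(LLMProvider):
    """Trivial stand-in useful for testing the pipeline without a model."""

    def is_available(self) -> bool:
        return True

    def complete(self, prompt: str, max_tokens: int = 1024) -> str:
        return f"[advisory] {prompt[:max_tokens]}"
```

A concrete local provider would implement the same two methods around llama-cpp-python, Ollama, or vLLM.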
Frontend Stack
- Framework: React with TypeScript
- Components: AgentPanel chat interface
- Styling: CSS with responsive design
API Endpoint
- POST /api/agent/assist: Request analyst guidance
- GET /api/agent/health: Check agent availability
LLM Provider Architecture
The agent supports three provider types, selectable via configuration:
1. Local Provider
Use Case: On-device or on-premise models
Environment variables:
THREAT_HUNT_AGENT_PROVIDER=local
THREAT_HUNT_LOCAL_MODEL_PATH=/path/to/model.gguf
Supported frameworks:
- llama-cpp-python (GGML models)
- Ollama API
- vLLM
- Other local inference engines
2. Networked Provider
Use Case: Shared internal inference services
Environment variables:
THREAT_HUNT_AGENT_PROVIDER=networked
THREAT_HUNT_NETWORKED_ENDPOINT=http://inference-service:5000
THREAT_HUNT_NETWORKED_KEY=api-key-here
Supported architectures:
- Internal inference service API
- LLM inference container clusters
- Enterprise inference gateways
3. Online Provider
Use Case: External hosted APIs
Environment variables:
THREAT_HUNT_AGENT_PROVIDER=online
THREAT_HUNT_ONLINE_API_KEY=sk-your-api-key
THREAT_HUNT_ONLINE_PROVIDER=openai
THREAT_HUNT_ONLINE_MODEL=gpt-3.5-turbo
Supported providers:
- OpenAI (GPT-3.5, GPT-4)
- Anthropic Claude
- Google Gemini
- Other hosted LLM services
Auto Provider Selection
Set THREAT_HUNT_AGENT_PROVIDER=auto to automatically use the first available provider:
1. Local (if the model path exists)
2. Networked (if an endpoint is configured)
3. Online (if an API key is set)
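The fallback order can be sketched as a small pure function; the function name and dict-based interface here are illustrative, not the actual config.py API:

```python
import os

def select_provider(env: dict) -> str:
    """Pick the first available provider, mirroring the 'auto'
    precedence described above: local, then networked, then online."""
    choice = env.get("THREAT_HUNT_AGENT_PROVIDER", "auto")
    if choice != "auto":
        return choice
    if os.path.exists(env.get("THREAT_HUNT_LOCAL_MODEL_PATH", "")):
        return "local"
    if env.get("THREAT_HUNT_NETWORKED_ENDPOINT"):
        return "networked"
    if env.get("THREAT_HUNT_ONLINE_API_KEY"):
        return "online"
    raise RuntimeError("no agent provider available")

# Example: only a networked endpoint is configured
print(select_provider({"THREAT_HUNT_NETWORKED_ENDPOINT": "http://svc:5000"}))
# -> networked
```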
Backend Implementation
Agent Request/Response Flow
Request (AgentContext):
{
"query": "What patterns suggest suspicious file modifications?",
"dataset_name": "FileList-2025-12-26",
"artifact_type": "FileList",
"host_identifier": "DESKTOP-ABC123",
"data_summary": "File listing from system scan",
"conversation_history": [...]
}
Response (AgentResponse):
{
"guidance": "Based on the files listed, ...",
"confidence": 0.8,
"suggested_pivots": ["Analyze temporal patterns", "Cross-reference with IOCs"],
"suggested_filters": ["Filter by modification time", "Sort by file size"],
"caveats": "Guidance is based on available data context...",
"reasoning": "Analysis generated based on patterns..."
}
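The two JSON shapes above map naturally onto typed models. The backend likely defines them as Pydantic models for FastAPI validation; the stdlib dataclass sketch below just mirrors the field names and defaults for illustration:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class AgentContext:
    """Request payload for POST /api/agent/assist."""
    query: str
    dataset_name: Optional[str] = None
    artifact_type: Optional[str] = None
    host_identifier: Optional[str] = None
    data_summary: Optional[str] = None
    conversation_history: List[dict] = field(default_factory=list)

@dataclass
class AgentResponse:
    """Advisory-only response; every answer carries caveats."""
    guidance: str
    confidence: float
    suggested_pivots: List[str] = field(default_factory=list)
    suggested_filters: List[str] = field(default_factory=list)
    caveats: str = ""
    reasoning: str = ""
```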
Governance Enforcement
The agent is designed with hard constraints to ensure compliance:
- Read-Only: The agent accepts context data but cannot:
  - Execute tools or actions
  - Modify the database or schema
  - Escalate findings to alerts
  - Access external systems
- Advisory Only: All guidance is clearly marked as:
  - Suggestions, not directives
  - Confidence-rated
  - Accompanied by caveats
  - Attributed to the agent
- Analyst Control: The UI emphasizes that:
  - The agent provides guidance only
  - Analysts retain all decision-making authority
  - All next steps require analyst action
Frontend Implementation
AgentPanel Component
Located in frontend/src/components/AgentPanel.tsx:
Features:
- Chat-style interface for analyst questions
- Context display showing current dataset/host/artifact
- Rich response formatting with:
- Main guidance text
- Suggested analytical pivots (clickable)
- Suggested data filters
- Confidence scores
- Caveats and assumptions
- Reasoning explanation
- Conversation history for context
- Responsive design (desktop and mobile)
- Loading states and error handling
Props:
interface AgentPanelProps {
dataset_name?: string;
artifact_type?: string;
host_identifier?: string;
data_summary?: string;
onAnalysisAction?: (action: string) => void;
}
Integration in Main UI
The agent panel is integrated into the main ThreatHunt dashboard as a sidebar component. In App.tsx:
- Main analysis view occupies left side
- Agent panel occupies right sidebar
- Context automatically updated when analyst switches datasets/hosts
- Responsive layout: stacks vertically on mobile
Configuration
Environment Variables
# Provider selection
THREAT_HUNT_AGENT_PROVIDER=auto # auto, local, networked, or online
# Local provider
THREAT_HUNT_LOCAL_MODEL_PATH=/models/model.gguf
# Networked provider
THREAT_HUNT_NETWORKED_ENDPOINT=http://service:5000
THREAT_HUNT_NETWORKED_KEY=api-key
# Online provider
THREAT_HUNT_ONLINE_API_KEY=sk-key
THREAT_HUNT_ONLINE_PROVIDER=openai
THREAT_HUNT_ONLINE_MODEL=gpt-3.5-turbo
# Agent behavior
THREAT_HUNT_AGENT_MAX_TOKENS=1024
THREAT_HUNT_AGENT_REASONING=true
THREAT_HUNT_AGENT_HISTORY_LENGTH=10
THREAT_HUNT_AGENT_FILTER_SENSITIVE=true
# Frontend
REACT_APP_API_URL=http://localhost:8000
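The agent-behavior settings above can be read with a loader like the following minimal sketch; the real config.py may use Pydantic settings or another mechanism instead:

```python
import os

def load_agent_config() -> dict:
    """Read agent-behavior settings from the environment,
    falling back to the documented defaults."""
    env = os.environ
    return {
        "provider": env.get("THREAT_HUNT_AGENT_PROVIDER", "auto"),
        "max_tokens": int(env.get("THREAT_HUNT_AGENT_MAX_TOKENS", "1024")),
        "reasoning": env.get("THREAT_HUNT_AGENT_REASONING", "true").lower() == "true",
        "history_length": int(env.get("THREAT_HUNT_AGENT_HISTORY_LENGTH", "10")),
        "filter_sensitive": env.get("THREAT_HUNT_AGENT_FILTER_SENSITIVE", "true").lower() == "true",
    }
```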
Docker Deployment
Use docker-compose.yml for full stack deployment:
# Build and start services
docker-compose up -d
# Verify health
curl http://localhost:8000/api/agent/health
curl http://localhost:3000
# View logs
docker-compose logs -f backend
docker-compose logs -f frontend
# Stop services
docker-compose down
Security Considerations
- API Access: Backend should be protected with authentication in production
- LLM Privacy: Sensitive data (IPs, usernames) should be filtered before sending to online providers
- Error Messages: Production should use generic error messages, not expose internal details
- Rate Limiting: Implement rate limiting on agent endpoints
- Conversation History: Consider data retention policies for conversation logs
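The LLM-privacy point above (and the THREAT_HUNT_AGENT_FILTER_SENSITIVE flag) implies a redaction pass before context leaves the backend. A minimal illustrative sketch, covering only IPv4 addresses; a real filter would also handle usernames, hostnames, and other identifiers:

```python
import re

# Naive IPv4 matcher; does not validate octet ranges.
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

def scrub(text: str) -> str:
    """Mask IPv4 addresses before context is sent to an online provider."""
    return IP_RE.sub("[REDACTED_IP]", text)

print(scrub("beacon to 10.1.2.3 from HOST123"))
# -> beacon to [REDACTED_IP] from HOST123
```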
Testing
Manual Testing
1. Agent Health:
   curl http://localhost:8000/api/agent/health
2. Agent Assistance (without the frontend):
   curl -X POST http://localhost:8000/api/agent/assist \
     -H "Content-Type: application/json" \
     -d '{
       "query": "What suspicious patterns do you see?",
       "dataset_name": "FileList",
       "artifact_type": "FileList",
       "host_identifier": "HOST123"
     }'
3. Frontend UI:
   - Navigate to http://localhost:3000
   - Type a question in the agent panel
   - Verify the response displays correctly
Future Enhancements
- Structured Output: Use LLM JSON mode or function calling for more reliable parsing
- Context Filtering: Automatically filter sensitive data before sending to LLM
- Multi-Modal: Support image uploads (binary analysis, network diagrams)
- Caching: Cache common agent responses to reduce latency
- Feedback Loop: Capture analyst feedback on guidance quality
- Integration: Connect agent to actual CVE databases, threat feeds
- Custom Models: Support fine-tuned models for threat hunting domain
- Audit Trail: Comprehensive logging of all agent interactions
Governance Compliance
This implementation strictly follows:
- goose-core/governance/AGENT_POLICY.md - Agent boundaries and allowed functions
- goose-core/governance/AI_RULES.md - AI system rules
- goose-core/governance/SCOPE.md - Shared vs. application-specific responsibility
- ThreatHunt/THREATHUNT_INTENT.md - Agent role in threat hunting
Key Principles:
- ✅ Agents assist analysts, never act autonomously
- ✅ No execution without explicit analyst approval
- ✅ No database or schema changes
- ✅ No alert escalation
- ✅ Read-only guidance
- ✅ Transparent reasoning and caveats
- ✅ Analyst retains all authority
Troubleshooting
Agent Unavailable (503)
- Check environment variables for provider configuration
- Verify LLM provider is accessible
- Review backend logs:
docker-compose logs backend
Slow Responses
- Check LLM provider latency
- Reduce MAX_TOKENS if appropriate
- Consider local provider for latency-sensitive deployments
No Responses from Frontend
- Verify backend health: curl http://localhost:8000/api/agent/health
- Check the browser console for errors
- Verify REACT_APP_API_URL in the frontend environment
- Check CORS configuration if the frontend is hosted separately
File Structure
ThreatHunt/
├── backend/
│ ├── app/
│ │ ├── agents/ # Agent module
│ │ │ ├── __init__.py
│ │ │ ├── core.py # ThreatHuntAgent class
│ │ │ ├── providers.py # LLM provider interface
│ │ │ └── config.py # Agent configuration
│ │ ├── api/
│ │ │ ├── routes/
│ │ │ │ ├── __init__.py
│ │ │ │ └── agent.py # /api/agent/* endpoints
│ │ ├── __init__.py
│ │ └── main.py # FastAPI app
│ ├── requirements.txt
│ └── run.py
├── frontend/
│ ├── src/
│ │ ├── components/
│ │ │ ├── AgentPanel.tsx # Agent chat component
│ │ │ └── AgentPanel.css
│ │ ├── utils/
│ │ │ └── agentApi.ts # API communication
│ │ ├── App.tsx # Main app with agent
│ │ ├── App.css
│ │ ├── index.tsx
│ ├── public/
│ │ └── index.html
│ ├── package.json
│ └── tsconfig.json
├── Dockerfile.backend
├── Dockerfile.frontend
├── docker-compose.yml
├── .env.example
├── AGENT_IMPLEMENTATION.md # This file
├── README.md
└── THREATHUNT_INTENT.md