Add ThreatHunt agent backend/frontend scaffolding

This commit is contained in:
2025-12-29 10:22:57 -05:00
parent dc2dcd02c1
commit d0c9f88268
35 changed files with 21929 additions and 42 deletions

342
AGENT_IMPLEMENTATION.md Normal file
View File

@@ -0,0 +1,342 @@
# ThreatHunt Analyst-Assist Agent Implementation
## Overview
This implementation adds an analyst-assist agent to ThreatHunt that provides read-only guidance on CSV artifact data, analytical pivots, and hypotheses. The agent strictly adheres to the governance principles defined in `goose-core/governance/AGENT_POLICY.md`.
## Architecture
### Backend Stack
- **Framework**: FastAPI (Python 3.11)
- **Agent Module**: `backend/app/agents/`
- `core.py`: ThreatHuntAgent class with guidance logic
- `providers.py`: Pluggable LLM provider interface
- `config.py`: Configuration management
### Frontend Stack
- **Framework**: React with TypeScript
- **Components**: AgentPanel chat interface
- **Styling**: CSS with responsive design
### API Endpoint
- **POST /api/agent/assist**: Request analyst guidance
- **GET /api/agent/health**: Check agent availability
## LLM Provider Architecture
The agent supports three provider types, selectable via configuration:
### 1. Local Provider
**Use Case**: On-device or on-premise models
Environment variables:
```bash
THREAT_HUNT_AGENT_PROVIDER=local
THREAT_HUNT_LOCAL_MODEL_PATH=/path/to/model.gguf
```
Supported frameworks:
- llama-cpp-python (GGML models)
- Ollama API
- vLLM
- Other local inference engines
### 2. Networked Provider
**Use Case**: Shared internal inference services
Environment variables:
```bash
THREAT_HUNT_AGENT_PROVIDER=networked
THREAT_HUNT_NETWORKED_ENDPOINT=http://inference-service:5000
THREAT_HUNT_NETWORKED_KEY=api-key-here
```
Supported architectures:
- Internal inference service API
- LLM inference container clusters
- Enterprise inference gateways
### 3. Online Provider
**Use Case**: External hosted APIs
Environment variables:
```bash
THREAT_HUNT_AGENT_PROVIDER=online
THREAT_HUNT_ONLINE_API_KEY=sk-your-api-key
THREAT_HUNT_ONLINE_PROVIDER=openai
THREAT_HUNT_ONLINE_MODEL=gpt-3.5-turbo
```
Supported providers:
- OpenAI (GPT-3.5, GPT-4)
- Anthropic Claude
- Google Gemini
- Other hosted LLM services
### Auto Provider Selection
Set `THREAT_HUNT_AGENT_PROVIDER=auto` to automatically use the first available provider:
1. Local (if model path exists)
2. Networked (if endpoint is configured)
3. Online (if API key is set)
## Backend Implementation
### Agent Request/Response Flow
**Request** (AgentContext):
```python
{
"query": "What patterns suggest suspicious file modifications?",
"dataset_name": "FileList-2025-12-26",
"artifact_type": "FileList",
"host_identifier": "DESKTOP-ABC123",
"data_summary": "File listing from system scan",
"conversation_history": [...]
}
```
**Response** (AgentResponse):
```python
{
"guidance": "Based on the files listed, ...",
"confidence": 0.8,
"suggested_pivots": ["Analyze temporal patterns", "Cross-reference with IOCs"],
"suggested_filters": ["Filter by modification time", "Sort by file size"],
"caveats": "Guidance is based on available data context...",
"reasoning": "Analysis generated based on patterns..."
}
```
### Governance Enforcement
The agent is designed with hard constraints to ensure compliance:
1. **Read-Only**: Agent accepts context data but cannot:
- Execute tools or actions
- Modify database or schema
- Escalate findings to alerts
- Access external systems
2. **Advisory Only**: All guidance is clearly marked as:
- Suggestions, not directives
- Confidence-rated
- Accompanied by caveats
- Attributed to the agent
3. **Analyst Control**: The UI emphasizes:
- Agent provides guidance only
- Analysts retain all decision-making authority
- All next steps require analyst action
## Frontend Implementation
### AgentPanel Component
Located in `frontend/src/components/AgentPanel.tsx`:
**Features**:
- Chat-style interface for analyst questions
- Context display showing current dataset/host/artifact
- Rich response formatting with:
- Main guidance text
- Suggested analytical pivots (clickable)
- Suggested data filters
- Confidence scores
- Caveats and assumptions
- Reasoning explanation
- Conversation history for context
- Responsive design (desktop and mobile)
- Loading states and error handling
**Props**:
```typescript
interface AgentPanelProps {
dataset_name?: string;
artifact_type?: string;
host_identifier?: string;
data_summary?: string;
onAnalysisAction?: (action: string) => void;
}
```
### Integration in Main UI
The agent panel is integrated into the main ThreatHunt dashboard as a sidebar component. In `App.tsx`:
1. Main analysis view occupies left side
2. Agent panel occupies right sidebar
3. Context automatically updated when analyst switches datasets/hosts
4. Responsive layout: stacks vertically on mobile
## Configuration
### Environment Variables
```bash
# Provider selection
THREAT_HUNT_AGENT_PROVIDER=auto # auto, local, networked, or online
# Local provider
THREAT_HUNT_LOCAL_MODEL_PATH=/models/model.gguf
# Networked provider
THREAT_HUNT_NETWORKED_ENDPOINT=http://service:5000
THREAT_HUNT_NETWORKED_KEY=api-key
# Online provider
THREAT_HUNT_ONLINE_API_KEY=sk-key
THREAT_HUNT_ONLINE_PROVIDER=openai
THREAT_HUNT_ONLINE_MODEL=gpt-3.5-turbo
# Agent behavior
THREAT_HUNT_AGENT_MAX_TOKENS=1024
THREAT_HUNT_AGENT_REASONING=true
THREAT_HUNT_AGENT_HISTORY_LENGTH=10
THREAT_HUNT_AGENT_FILTER_SENSITIVE=true
# Frontend
REACT_APP_API_URL=http://localhost:8000
```
### Docker Deployment
Use `docker-compose.yml` for full stack deployment:
```bash
# Build and start services
docker-compose up -d
# Verify health
curl http://localhost:8000/api/agent/health
curl http://localhost:3000
# View logs
docker-compose logs -f backend
docker-compose logs -f frontend
# Stop services
docker-compose down
```
## Security Considerations
1. **API Access**: Backend should be protected with authentication in production
2. **LLM Privacy**: Sensitive data (IPs, usernames) should be filtered before sending to online providers
3. **Error Messages**: Production should use generic error messages, not expose internal details
4. **Rate Limiting**: Implement rate limiting on agent endpoints
5. **Conversation History**: Consider data retention policies for conversation logs
## Testing
### Manual Testing
1. **Agent Health**:
```bash
curl http://localhost:8000/api/agent/health
```
2. **Agent Assistance** (without frontend):
```bash
curl -X POST http://localhost:8000/api/agent/assist \
-H "Content-Type: application/json" \
-d '{
"query": "What suspicious patterns do you see?",
"dataset_name": "FileList",
"artifact_type": "FileList",
"host_identifier": "HOST123"
}'
```
3. **Frontend UI**:
- Navigate to http://localhost:3000
- Type question in agent panel
- Verify response displays correctly
## Future Enhancements
1. **Structured Output**: Use LLM JSON mode or function calling for more reliable parsing
2. **Context Filtering**: Automatically filter sensitive data before sending to LLM
3. **Multi-Modal**: Support image uploads (binary analysis, network diagrams)
4. **Caching**: Cache common agent responses to reduce latency
5. **Feedback Loop**: Capture analyst feedback on guidance quality
6. **Integration**: Connect agent to actual CVE databases, threat feeds
7. **Custom Models**: Support fine-tuned models for threat hunting domain
8. **Audit Trail**: Comprehensive logging of all agent interactions
## Governance Compliance
This implementation strictly follows:
- `goose-core/governance/AGENT_POLICY.md` - Agent boundaries and allowed functions
- `goose-core/governance/AI_RULES.md` - AI system rules
- `goose-core/governance/SCOPE.md` - Shared vs application-specific responsibility
- `ThreatHunt/THREATHUNT_INTENT.md` - Agent role in threat hunting
**Key Principles**:
- ✅ Agents assist analysts, never act autonomously
- ✅ No execution without explicit analyst approval
- ✅ No database or schema changes
- ✅ No alert escalation
- ✅ Read-only guidance
- ✅ Transparent reasoning and caveats
- ✅ Analyst retains all authority
## Troubleshooting
### Agent Unavailable (503)
- Check environment variables for provider configuration
- Verify LLM provider is accessible
- Review backend logs: `docker-compose logs backend`
### Slow Responses
- Check LLM provider latency
- Reduce MAX_TOKENS if appropriate
- Consider local provider for latency-sensitive deployments
### No Responses from Frontend
- Verify backend health: `curl http://localhost:8000/api/agent/health`
- Check browser console for errors
- Verify REACT_APP_API_URL in frontend environment
- Check CORS configuration if frontend hosted separately
## File Structure
```
ThreatHunt/
├── backend/
│ ├── app/
│ │ ├── agents/ # Agent module
│ │ │ ├── __init__.py
│ │ │ ├── core.py # ThreatHuntAgent class
│ │ │ ├── providers.py # LLM provider interface
│ │ │ └── config.py # Agent configuration
│ │ ├── api/
│ │ │ ├── routes/
│ │ │ │ ├── __init__.py
│ │ │ │ └── agent.py # /api/agent/* endpoints
│ │ ├── __init__.py
│ │ └── main.py # FastAPI app
│ ├── requirements.txt
│ └── run.py
├── frontend/
│ ├── src/
│ │ ├── components/
│ │ │ ├── AgentPanel.tsx # Agent chat component
│ │ │ └── AgentPanel.css
│ │ ├── utils/
│ │ │ └── agentApi.ts # API communication
│ │ ├── App.tsx # Main app with agent
│ │ ├── App.css
│ │ ├── index.tsx
│ ├── public/
│ │ └── index.html
│ ├── package.json
│ └── tsconfig.json
├── Dockerfile.backend
├── Dockerfile.frontend
├── docker-compose.yml
├── .env.example
├── AGENT_IMPLEMENTATION.md # This file
├── README.md
└── THREATHUNT_INTENT.md
```