Files
Dashboard/docs/MONITORING.md

6.5 KiB

Monitoring & Maintenance Guide

Container Health Monitoring

Check Container Status

# SSH into Atlas server
ssh soadmin@100.104.196.38

# Check if running
docker-compose -C /opt/dashboard ps

# View live logs
docker-compose -C /opt/dashboard logs -f

# Check resource usage
docker stats atlas-dashboard

Health Check

The container includes a health check that runs every 30 seconds:

# Check health status
docker inspect atlas-dashboard | grep -A 5 Health

Performance Monitoring

Memory & CPU Usage

# Monitor in real-time
docker stats atlas-dashboard

# View historical stats
docker stats atlas-dashboard --no-stream

Recommended limits:

  • CPU: 1 core
  • Memory: 512MB
  • The container typically uses 200-300MB at runtime

Log Analysis

# View recent errors
docker-compose -C /opt/dashboard logs --tail=50 | grep -i error

# Follow logs in real-time
docker-compose -C /opt/dashboard logs -f

Backup & Recovery

Database Backups

Since this dashboard doesn't use a persistent database (it's stateless), no database backups are needed.

Configuration Backup

# Backup .env.local
ssh soadmin@100.104.196.38 "cp /opt/dashboard/.env.local /opt/dashboard/.env.local.backup"

# Backup compose file
ssh soadmin@100.104.196.38 "cp /opt/dashboard/docker-compose.yml /opt/dashboard/docker-compose.yml.backup"

Container Image Backup

# Save Docker image locally
ssh soadmin@100.104.196.38 "docker save atlas-dashboard:latest | gzip > atlas-dashboard-backup.tar.gz"

# Download backup
scp soadmin@100.104.196.38:/home/soadmin/atlas-dashboard-backup.tar.gz .

Maintenance Tasks

Weekly Tasks

  • Check container logs for errors
  • Verify all widgets are loading correctly
  • Monitor memory/CPU usage
  • Test external service connectivity

Monthly Tasks

  • Update base Docker image: docker-compose build --pull
  • Check for upstream code updates: git fetch && git log --oneline -5 origin/main
  • Review and test backup procedures

Quarterly Tasks

  • Update Node.js base image version
  • Review and update dependencies
  • Security audit of credentials/config
  • Performance review and optimization

Updating the Dashboard

Automated Updates (GitHub Actions)

Pushes to main branch automatically deploy to Atlas server if GitHub Actions secrets are configured:

  1. Set up GitHub Actions secrets:

    • ATLAS_HOST - 100.104.196.38
    • ATLAS_USER - soadmin
    • ATLAS_SSH_KEY - SSH private key for automated access
  2. Push to main:

    git push origin main
    

Manual Updates

ssh soadmin@100.104.196.38

cd /opt/dashboard

# Pull latest code
git pull origin main

# Rebuild and restart
docker-compose build
docker-compose up -d

# Verify
docker-compose ps

Scaling Considerations

The dashboard is designed as a single-instance application. For high-availability setups:

Load Balancing

Add a reverse proxy (Traefik, Nginx):

upstream dashboard {
    server 100.104.196.38:3001;
}

server {
    listen 80;
    server_name dashboard.yourdomain.com;
    
    location / {
        proxy_pass http://dashboard;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Multiple Instances

To run multiple instances:

docker-compose -p dashboard-1 up -d
docker-compose -p dashboard-2 up -d

# Use different ports
# Modify docker-compose.yml to use different port mappings

Disaster Recovery

Complete Loss Scenario

If the container is completely lost:

# 1. SSH into server
ssh soadmin@100.104.196.38

# 2. Restore from backup
cd /opt/dashboard
git clone https://github.com/mblanke/Dashboard.git .
cp /opt/dashboard/.env.local.backup .env.local

# 3. Redeploy
docker-compose build
docker-compose up -d

# 4. Verify
docker-compose ps
curl http://localhost:3001

Server Loss Scenario

To migrate to a new server:

# 1. On new server
ssh soadmin@NEW_IP

# 2. Set up (same as initial deployment)
mkdir -p /opt/dashboard
cd /opt/dashboard
git clone https://github.com/mblanke/Dashboard.git .
cp .env.local.backup .env.local  # Use backed up config

# 3. Deploy
docker-compose build
docker-compose up -d

# 4. Update DNS or references to point to new IP

Troubleshooting Common Issues

Container keeps restarting

# Check logs for errors
docker-compose logs dashboard

# Common causes:
# - Missing .env.local file
# - Invalid environment variables
# - Port 3001 already in use
# - Out of disk space

Memory leaks

# Monitor memory over time
while true; do
  echo "$(date): $(docker stats atlas-dashboard --no-stream | tail -1)"
  sleep 60
done

# If memory usage keeps growing, restart container
docker-compose restart dashboard

API connection failures

# Check Docker API
curl http://100.104.196.38:2375/containers/json

# Check UniFi
curl -k https://UNIFI_IP:8443/api/login -X POST \
  -d '{"username":"admin","password":"password"}'

# Check Synology
curl -k https://SYNOLOGY_IP:5001/webapi/auth.cgi

Performance Optimization

Caching

The dashboard auto-refreshes every 10 seconds. To optimize:

# Increase refresh interval in src/app/page.tsx
const interval = setInterval(fetchContainers, 30000); // 30 seconds

Database Queries

External API calls are read-only and lightweight. No optimization needed unless:

  • API responses are very large (>5MB)
  • Network latency is high (>1000ms)

Then consider adding response caching in API routes:

// Add to route handlers
res.setHeader('Cache-Control', 'max-age=10, s-maxage=60');

Support & Debugging

Collecting Debug Information

For troubleshooting, gather:

# System info
docker --version
docker-compose --version
uname -a

# Container info
docker inspect atlas-dashboard

# Recent logs (last 100 lines)
docker-compose logs --tail=100

# Resource usage
docker stats atlas-dashboard --no-stream

# Network connectivity
curl -v http://100.104.196.38:2375/containers/json

Getting Help

When reporting issues, include:

  1. Output from above debug commands
  2. Exact error messages from logs
  3. Steps to reproduce
  4. Environment configuration (without passwords)
  5. Timeline of when issue started