Administrator Guide¶
This guide covers operations and maintenance of a deployed SPOT platform. The install procedure itself (prerequisites, DNS, secrets, admin bootstrap, plugin installation) is documented separately, in a single place.
This page picks up after the stack is running. For configuration options that the deploy .env does not cover, see the Configuration reference.
Hardening¶
SSL/TLS Configuration¶
Configure reverse proxy (nginx) for HTTPS:
server {
    listen 443 ssl http2;
    server_name spot-api.company.com;
    ssl_certificate /etc/ssl/certs/spot-api.crt;
    ssl_certificate_key /etc/ssl/private/spot-api.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    # AES-256-GCM suites are paired with SHA384 (there is no SHA512 variant)
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;
    location / {
        proxy_pass http://localhost:8001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
Firewall Configuration¶
# Allow HTTPS
ufw allow 443/tcp
# Allow SSH (restrict to management IPs)
ufw allow from 192.168.1.0/24 to any port 22 proto tcp
# Block direct access to internal ports
# (ports published by Docker bypass ufw, so also keep them unpublished or bound to 127.0.0.1)
ufw deny 8001/tcp # API Gateway
ufw deny 5432/tcp # PostgreSQL
ufw deny 6379/tcp # Redis
ufw deny 5672/tcp # RabbitMQ
# Enable firewall
ufw enable
Monitoring¶
Health Checks¶
# Overall platform health
curl https://spot-api.company.com/health
# Service status
docker compose ps
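After a restart the health endpoint may take a few seconds to answer, so a single curl call can fail spuriously. The following is a minimal sketch of a retry wrapper; retry is a hypothetical helper, not part of SPOT, and it assumes that /health returns a non-2xx status while the platform is unhealthy.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS runs out.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only when another attempt follows.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

Usage, polling up to 10 times with 3 seconds between attempts: retry 10 3 curl -fsS -o /dev/null https://spot-api.company.com/health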
Key Metrics¶
Monitor these metrics for production:
API Metrics:
- Response time: p95 < 2s, p99 < 5s
- Error rate: < 1%
- Request rate: baseline + anomaly alerts
- Authentication success: > 99%
Infrastructure Metrics:
- CPU usage: < 70% average
- Memory usage: < 80% average
- Disk usage: < 80%
- Network latency: < 100ms
Application Metrics:
- Analysis queue depth
- Analysis success rate: > 95%
- Average analysis time: < 30s
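As an illustration of turning one of these thresholds into a check, the sketch below tests disk usage against the 80% limit. The helper names are hypothetical, and a production setup would more likely feed these values into a monitoring system than into a shell script.

```shell
# disk_usage_pct PATH: print the used-space percentage (digits only) of the
# filesystem that holds PATH, using the portable df -P layout.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# exceeds VALUE LIMIT: succeed when integer VALUE is at or above LIMIT.
exceeds() {
  [ "$1" -ge "$2" ]
}

# check_disk PATH LIMIT: print a status line; return 1 when usage has reached
# LIMIT and 2 when usage could not be determined.
check_disk() {
  local pct
  pct=$(disk_usage_pct "$1") || return 2
  [ -n "$pct" ] || return 2
  if exceeds "$pct" "$2"; then
    echo "WARN disk usage on $1 is ${pct}% (limit ${2}%)"
    return 1
  fi
  echo "OK disk usage on $1 is ${pct}%"
}
```

Usage: check_disk /var/lib/docker 80 (the path is an assumption; use whichever filesystem holds the Docker volumes).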
Logging¶
View logs:
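A minimal sketch, using the docker compose logs commands from the operator commands reference in this guide. The count_errors helper is hypothetical, and it assumes that failures are logged on lines containing the word "error".

```shell
# Follow logs for every service, or for a single service:
#   docker compose logs -f
#   docker compose logs -f api-gateway

# count_errors: read log lines on stdin and print how many contain "error"
# in any letter case. Prints 0 when nothing matches.
count_errors() {
  grep -ci 'error' || true
}
```

Usage, counting error lines from the last hour: docker compose logs --since 1h api-gateway | count_errors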
Log configuration (docker-compose.prod.yml):
Knowledge Store¶
SPOT's RAG layer is a dedicated microservice (spot-knowledge) with its own Postgres table (knowledge_documents, migration 008) and pgvector HNSW index. See the Knowledge Store guide for concepts; this section covers the day-to-day admin surface.
Dashboard¶
Admins get a Knowledge top-level nav item:
- /knowledge: total doc count, distinct sources, tag breakdown, semantic search, and a filterable, paginated browse table with inline delete.
- /knowledge/{id}: single document inspector.
From any context-provider detail page (/plugins/context_provider/{id}) operators can:
- Trigger a one-shot sync (Sync now button).
- Jump to the Knowledge Store filtered by this provider's documents.
- See the last-run timestamp, doc count, and next scheduled run.
Editing ingestion schedules¶
Go to /config/plugin/context_provider/{id} → Scheduling card. Edit the 5-field cron expression (or an @hourly/@daily/@weekly/@monthly alias) and the timeout (ms), then Save. The running scheduler reconciles immediately, so no process restart is required. Leaving the schedule empty makes the provider manual-only.
Invalid cron expressions are rejected by the scheduler and surfaced on the same page ("Invalid cron expression") so they're visible before the next firing.
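When schedules are prepared outside the UI, a rough shape check can catch obvious mistakes early. The sketch below is not the scheduler's own validator: looks_like_schedule is a hypothetical helper that only checks for one of the four documented aliases or exactly five fields, and it does not validate field ranges.

```shell
# looks_like_schedule EXPR: succeed when EXPR is a documented alias or has
# exactly five whitespace-separated fields. An empty EXPR (manual-only in the
# UI) is reported as a failure here.
looks_like_schedule() {
  case "$1" in
    @hourly|@daily|@weekly|@monthly) return 0 ;;
  esac
  # Split in a subshell; set -f keeps "*" from expanding to file names.
  ( set -f; set -- $1; [ "$#" -eq 5 ] )
}
```

Usage: looks_like_schedule "*/15 * * * *" && echo "shape ok"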
Required environment¶
Both the knowledge service and the api-gateway need SPOT_INTERNAL_API_KEY set in .env. The installer stamps the same key on every installed plugin so providers can write to the store. If the key is lost, every sync returns 401. After rotating it, recreate the services with docker compose up -d; a plain docker compose restart does not pick up changed values from .env.
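A quick way to confirm the key is present without printing it is sketched below. Both helpers are hypothetical, and whether SPOT accepts a 64-character hex string as the key format is an assumption.

```shell
# env_has_value FILE KEY: succeed when FILE contains a line "KEY=<something>".
# The value is never printed. A quoted empty value (KEY="") still counts as set.
env_has_value() {
  grep -Eq "^$2=.+" "$1"
}

# new_key: print 32 random bytes as a 64-character hex string.
new_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

Usage: env_has_value .env SPOT_INTERNAL_API_KEY || echo "SPOT_INTERNAL_API_KEY is missing or empty"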
Backup and Recovery¶
Easy Database Backup¶
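Create an automated backup script:
#!/bin/bash
# Abort on errors, unset variables, and failures inside pipelines
set -euo pipefail
BACKUP_DIR="/var/backups/spot"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"
# PostgreSQL backup
docker exec spot-postgres pg_dump -U spot spot | gzip > "$BACKUP_DIR/spot_db_$DATE.sql.gz"
# Redis backup (written to /tmp so the server's own dump file in /data is left alone)
docker exec spot-redis redis-cli --rdb /tmp/spot_backup.rdb
docker cp spot-redis:/tmp/spot_backup.rdb "$BACKUP_DIR/spot_redis_$DATE.rdb"
# Delete backups older than 7 days
find "$BACKUP_DIR" -name "spot_db_*.sql.gz" -mtime +7 -delete
find "$BACKUP_DIR" -name "spot_redis_*.rdb" -mtime +7 -delete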
Schedule with cron:
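A minimal sketch of such a schedule, assuming the backup script is saved as /usr/local/bin/spot-backup.sh and should run daily at 02:00. The script path, the log path, and both helper names are placeholders.

```shell
# backup_cron_line SCRIPT LOG: print a crontab entry that runs SCRIPT every day
# at 02:00 and appends its output to LOG.
backup_cron_line() {
  printf '0 2 * * * %s >> %s 2>&1\n' "$1" "$2"
}

# install_cron_line LINE: add LINE to the current user's crontab unless an
# identical entry already exists.
install_cron_line() {
  local current
  current=$(crontab -l 2>/dev/null || true)
  if printf '%s\n' "$current" | grep -Fxq -- "$1"; then
    return 0
  fi
  {
    [ -n "$current" ] && printf '%s\n' "$current"
    printf '%s\n' "$1"
  } | crontab -
}
```

Usage: install_cron_line "$(backup_cron_line /usr/local/bin/spot-backup.sh /var/log/spot-backup.log)"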
Easy Recovery¶
Restore database:
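# Stop services (containers are kept; docker compose down would remove spot-postgres)
docker compose stop
# Start only the database container and wait until it accepts connections
docker start spot-postgres
until docker exec spot-postgres pg_isready -U spot; do sleep 1; done
# Restore PostgreSQL (plain-SQL dump made without --clean: objects that already exist cause errors)
gunzip -c /var/backups/spot/spot_db_20240101_120000.sql.gz | \
docker exec -i spot-postgres psql -U spot -d spot
# Restart services
docker compose up -d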
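Before stopping anything, it is worth confirming that the chosen dump is intact. The sketch below picks the newest file produced by the backup script and tests its gzip integrity; both helper names are hypothetical. It relies on the %Y%m%d_%H%M%S timestamp in the file name sorting chronologically.

```shell
# latest_backup DIR: print the newest spot_db_*.sql.gz in DIR (by file name),
# or return 1 when DIR holds no such file.
latest_backup() {
  local newest
  newest=$(ls -1 "$1"/spot_db_*.sql.gz 2>/dev/null | sort | tail -n 1) || true
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# verify_backup FILE: succeed when FILE is non-empty and passes gzip's
# integrity test. This does not check that the SQL inside is complete.
verify_backup() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}
```

Usage: f=$(latest_backup /var/backups/spot) && verify_backup "$f" && echo "restoring from $f"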
Security¶
Best Practices¶
- Use strong passwords (32+ characters)
- Enable HTTPS only (no HTTP)
- Implement token rotation
- Run containers as non-root
- Apply security updates regularly
- Monitor authentication logs
Access Control¶
Configure OAuth2 properly:
- Use strong SECRET_KEY
- Set appropriate token expiry
- Implement rate limiting
- Log all authentication attempts
Network Security¶
- Use firewall rules (see above)
- Implement network segmentation
- Use private Docker networks
- Restrict management access
Troubleshooting¶
High Response Times¶
# Check resources
docker stats
# Check database
docker exec spot-postgres psql -U spot -d spot -c \
"SELECT * FROM pg_stat_activity WHERE state = 'active';"
# Check that all services are running (docker compose ps does not report queue depth)
docker compose ps
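If the analysis queue is held in RabbitMQ, its depth can be read with rabbitmqctl. This is a sketch under two assumptions that this guide does not confirm: that the broker container is named spot-rabbitmq (following the spot-postgres and spot-redis pattern), and that the queue lives in RabbitMQ at all. The total_queue_depth helper is hypothetical.

```shell
# total_queue_depth: read tab-separated "name<TAB>messages" rows on stdin and
# print the sum of the message counts. Rows whose second column is not a
# number, such as a header row, are ignored.
total_queue_depth() {
  awk -F '\t' '$2 ~ /^[0-9]+$/ { sum += $2 } END { print sum + 0 }'
}
```

Usage: docker exec spot-rabbitmq rabbitmqctl -q list_queues name messages | total_queue_depth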
Service Unavailable¶
# Check status
docker compose ps
# Check connectivity
telnet localhost 5432 # PostgreSQL
telnet localhost 6379 # Redis
telnet localhost 5672 # RabbitMQ
# Restart services
docker compose restart
Authentication Failures¶
# Check API Gateway logs
docker compose logs api-gateway | grep -i auth
# Verify that SECRET_KEY is set, without printing its value (prints 1 when set)
grep -c '^SECRET_KEY=.' .env
# Test authentication
curl -X POST http://localhost:8001/auth/token \
-d "username=test&password=test&grant_type=password"
Database Issues¶
# Check connections
docker exec spot-postgres psql -U spot -d spot -c \
"SELECT count(*) FROM pg_stat_activity;"
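# Check slow queries (needs the pg_stat_statements extension; on PostgreSQL 12 and older the column is mean_time)
docker exec spot-postgres psql -U spot -d spot -c \
"SELECT query, mean_exec_time, calls FROM pg_stat_statements
ORDER BY mean_exec_time DESC LIMIT 10;"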
Maintenance¶
Updates¶
Update SPOT Platform:
# Pull the latest deploy revision (image tags are pinned in .env)
git pull origin main
# Pull the matching service images
docker compose pull
# Recreate services with the new images
docker compose up -d
# Verify health
docker compose ps
Scaling¶
Increase worker processes:
# Edit .env
UVICORN_WORKERS=4
DATABASE_POOL_SIZE=20
REDIS_CONNECTION_POOL_SIZE=50
# Recreate containers so the new values apply (docker compose restart does not re-read .env)
docker compose up -d
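Worker count and pool size multiply, so it helps to check the product against the database connection limit before raising either value. The sketch below assumes each worker process keeps its own pool of DATABASE_POOL_SIZE connections, which is common but not confirmed for SPOT; the helper names are hypothetical.

```shell
# max_db_connections WORKERS POOL_SIZE [REPLICAS]: print the upper bound on
# open database connections, assuming one pool per worker process.
max_db_connections() {
  echo $(( $1 * $2 * ${3:-1} ))
}

# fits_connection_limit WORKERS POOL_SIZE LIMIT: succeed when the bound stays
# at or below LIMIT, for example PostgreSQL's max_connections.
fits_connection_limit() {
  [ "$(max_db_connections "$1" "$2")" -le "$3" ]
}
```

With the values above, max_db_connections 4 20 prints 80, which fits under PostgreSQL's default max_connections of 100 but leaves little headroom for other clients. The configured limit can be read with docker exec spot-postgres psql -U spot -d spot -c "SHOW max_connections;".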
Environment Management¶
Switch environments:
See Environment Configuration for details.
Operator commands reference¶
Day-to-day operations on a deployed stack go through docker compose from the deploy checkout, plus the operator spot-cli for things that hit the API:
# Stack lifecycle
docker compose up -d # Start (or recreate) all services
docker compose down # Stop and remove containers
docker compose restart # Restart services
docker compose ps # Show container status
# Logs
docker compose logs -f # All services, follow
docker compose logs -f api-gateway # One service, follow
# Updates
docker compose pull && docker compose up -d # Pull new image tags and recreate changed services
# Operator CLI (auth, plugins, workflows, analyze, config, benchmark)
spot-cli --help
The operator CLI is documented in the CLI user guide.
Support¶
- Documentation: https://spot-project.codeberg.page/documentation/
- Issues: https://codeberg.org/SPOT_Project/core/-/issues
- Email: spot@sonn.lu