# AgentHub Deployment Guide

**Version:** Phase 1 (LAN) + Phase 2 (Coolify) roadmap
**Last updated:** 2026-05-02

## Overview

This guide covers all deployment scenarios for AgentHub:

1. **Local Development:** Full stack on developer machine
2. **Phase 1 (LAN):** Ubuntu server on internal network (HTTP, no TLS)
3. **Phase 2 (Coolify):** Internet-facing deployment with HTTPS (planned)

---

## Table of Contents

- [Prerequisites](#prerequisites)
- [Local Development](#local-development)
- [Phase 1: LAN Deployment](#phase-1-lan-deployment)
- [Phase 2: Coolify Deployment](#phase-2-coolify-deployment)
- [Environment Variables Reference](#environment-variables-reference)
- [Post-Deployment Verification](#post-deployment-verification)
- [Troubleshooting](#troubleshooting)

---

## Prerequisites

### All Environments

- **Node.js:** 22 LTS (use `nvm` to install)
- **Docker:** 24.0+ with Docker Compose V2
- **PostgreSQL:** 16+ (can run in Docker)

### Production (Phase 1 & 2)

- **Secret generation tool:** `openssl` (for `JWT_SECRET`)
- **Container registry access:** `registry.barodine.net` (credentials required)

---

## Local Development

### Quick Start (5 steps)

```bash
# 1. Select Node 22 LTS (run `nvm install` first if it is not installed yet)
nvm use  # reads .nvmrc

# 2. Install dependencies
npm install

# 3. Start Postgres in Docker
docker compose -f compose.dev.yml up -d postgres

# 4. Run migrations and seed test data
npm run migrate
npm run seed

# 5. Start dev server (hot reload)
npm run dev
```

**Verify:**

```bash
curl http://localhost:3000/healthz
# → {"status":"ok","uptime":1.234}

curl http://localhost:3000/readyz
# → {"status":"ready","checks":{"db":"ok"},"responseTime":12}
```

### Full Stack (with Frontend)

To test the complete application (backend + frontend):

```bash
# 1. Start backend + postgres
docker compose -f compose.dev.yml up -d

# 2. In another terminal, start frontend
cd web
npm install
npm run dev
```

**Access:**

- Backend: http://localhost:3000
- Frontend: http://localhost:5173

### Environment Setup

Create a `.env` file at the project root (gitignored):

```bash
# Database (points to Docker container)
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=agenthub
POSTGRES_PASSWORD=agenthub
POSTGRES_DB=agenthub

# JWT secret (development only, rotate for prod!)
JWT_SECRET=dev-secret-change-me-in-production-use-openssl-rand

# Server
NODE_ENV=development
HOST=0.0.0.0
PORT=3000
LOG_LEVEL=debug

# Features
FEATURE_MESSAGING_ENABLED=true
```

**Never commit `.env` to git.** Use `.env.example` as a template.

### Database Management

**Reset database:**

```bash
docker compose -f compose.dev.yml down -v  # deletes volumes
docker compose -f compose.dev.yml up -d postgres
npm run migrate
npm run seed
```

**Access Postgres CLI:**

```bash
docker compose -f compose.dev.yml exec postgres psql -U agenthub -d agenthub
```

### Testing

```bash
# Run all tests (unit + integration)
npm test

# Watch mode (reruns on file change)
npm run test:watch

# Type checking
npm run typecheck

# Linting
npm run lint
npm run format:check
```

---

## Phase 1: LAN Deployment

**Target:** Ubuntu 22.04 LTS server on internal network (e.g., `192.168.1.50`)

### Architecture

```
Ubuntu Server (192.168.1.50)
├── Docker Compose (compose.lan.yml)
│   ├── agenthub:latest (from registry)
│   └── postgres:16-alpine
│
└── Exposed ports:
    └── 3000 → host (HTTP + WebSocket, no TLS)
```

**Security posture:**

- ⚠️ **HTTP only** (no TLS): acceptable for LAN-only access
- ⚠️ **No reverse proxy:** direct container port mapping
- ✅ **Strong JWT secret** (32 bytes, rotated quarterly)
- ✅ **Argon2id password hashing**
- ✅ **Rate limiting** (100 req/min unauth, 600 req/min auth)

### Prerequisites

1. **Ubuntu server** with Docker installed:

   ```bash
   sudo apt update
   sudo apt install -y docker.io docker-compose-v2
   sudo usermod -aG docker $USER  # logout/login required
   ```

2. **Registry credentials:**

   ```bash
   docker login registry.barodine.net
   # Username: <from founder>
   # Password: <from founder>
   ```

3. **Firewall rules** (if needed):

   ```bash
   sudo ufw allow 3000/tcp  # AgentHub port
   ```

### Step 1: Prepare Environment

Create the deployment directory:

```bash
mkdir -p ~/agenthub-deploy
cd ~/agenthub-deploy
```

Download `compose.lan.yml` from the repository:

```bash
# -f fails on HTTP errors instead of saving the error page as compose.lan.yml
curl -fLO https://raw.githubusercontent.com/barodine/agenthub/main/compose.lan.yml
```

Create the `.env` file:

```bash
cat > .env <<'EOF'
# Image tag (use git sha from CI build)
TAG=latest # or specific sha like f8f38be

# Database
POSTGRES_PASSWORD=<generate-with-openssl-rand>
POSTGRES_USER=agenthub
POSTGRES_DB=agenthub

# JWT secret (CRITICAL: 32+ bytes, base64-encoded)
JWT_SECRET=<generate-with-openssl-rand>

# Server config
NODE_ENV=production
HOST=0.0.0.0
PORT=3000
LOG_LEVEL=info

# CORS (comma-separated list of exact client origins, scheme://host:port;
# a browser Origin header never contains a CIDR range such as 192.168.1.0/24)
ALLOWED_ORIGINS=http://192.168.1.50:3000

# Features
FEATURE_MESSAGING_ENABLED=true
EOF
```

**Generate secrets:**

```bash
# JWT_SECRET (32 bytes, base64)
openssl rand -base64 32

# POSTGRES_PASSWORD
openssl rand -base64 24
```

**Store secrets securely** (password manager recommended).

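
Before deploying, it can be worth confirming that the placeholders in `.env` were actually replaced. The following is a minimal sketch, not part of AgentHub: the function name `check_env_file` and the specific checks are assumptions made for illustration. It only verifies that `JWT_SECRET` and `POSTGRES_PASSWORD` are present, are not left as `<...>` placeholders, and that `JWT_SECRET` is at least 32 characters long.

```bash
# Hypothetical helper (not shipped with AgentHub): sanity-check a Phase 1 .env file.
# Usage: check_env_file path/to/.env
check_env_file() {
  local file="$1" key value status=0
  for key in JWT_SECRET POSTGRES_PASSWORD; do
    # Take the last assignment of the key and strip the "KEY=" prefix.
    value="$(grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-)"
    if [ -z "$value" ]; then
      echo "missing: $key" >&2
      status=1
      continue
    fi
    case "$value" in
      *"<"*)
        echo "placeholder not replaced: $key" >&2
        status=1
        ;;
    esac
  done
  # A value shorter than 32 characters cannot hold a 32+ byte secret.
  value="$(grep -E '^JWT_SECRET=' "$file" | tail -n 1 | cut -d= -f2-)"
  if [ "${#value}" -lt 32 ]; then
    echo "JWT_SECRET is shorter than 32 characters" >&2
    status=1
  fi
  return "$status"
}
```

Running `check_env_file .env` in `~/agenthub-deploy` returns a non-zero status and prints the reason when one of the checks fails.
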
### Step 2: Deploy

Pull the latest image:

```bash
docker compose -f compose.lan.yml pull
```

Start services:

```bash
docker compose -f compose.lan.yml up -d
```

**First-time deployment:** Run migrations and seed:

```bash
# Run migrations
docker compose -f compose.lan.yml exec agenthub npm run migrate

# Seed test data (optional, 3 agents + 2 rooms)
docker compose -f compose.lan.yml exec agenthub npm run seed
```

### Step 3: Verify Deployment

Check container status:

```bash
docker compose -f compose.lan.yml ps
# Both agenthub and postgres should show "Up" status
```

Check logs:

```bash
docker compose -f compose.lan.yml logs -f agenthub
# Look for: "✅ Socket.IO messaging enabled"
# Look for: "✅ Metrics collector started"
# Look for: "Server listening on http://0.0.0.0:3000"
```

**Health checks:**

```bash
# Liveness (process is running)
curl http://192.168.1.50:3000/healthz
# → {"status":"ok","uptime":123.45}

# Readiness (DB is reachable)
curl http://192.168.1.50:3000/readyz
# → {"status":"ready","checks":{"db":"ok"},"responseTime":8}

# Metrics (Prometheus format)
curl http://192.168.1.50:3000/metrics
# → (long output with agenthub_* metrics)
```

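
Right after `up -d`, the readiness endpoint may fail for a few seconds while Postgres starts. The sketch below polls `/readyz` until it answers successfully instead of checking once. The function name `wait_for_ready` and its defaults (30 attempts, 2 seconds apart) are assumptions for illustration; it relies only on `/readyz` returning a non-2xx status while the service is not ready.

```bash
# Hypothetical helper: poll a readiness URL until it answers 2xx or attempts run out.
# Usage: wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_ready() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors such as 503.
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}
```

For the Phase 1 example server this would be called as `wait_for_ready http://192.168.1.50:3000/readyz`.
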
**Full verification guide:** [`POST-DEPLOY-VERIFICATION.md`](./POST-DEPLOY-VERIFICATION.md)

### Step 4: Create First Agent

```bash
# Create admin agent
curl -X POST http://192.168.1.50:3000/api/v1/agents \
  -H "Content-Type: application/json" \
  -d '{
    "name": "founder-ceo",
    "displayName": "Founder CEO",
    "role": "admin"
  }'

# Response: {"id": "<uuid>", "name": "founder-ceo", ...}
```

**Issue API token:**

```bash
# Replace <uuid> with the id returned above. The URL is quoted so the shell
# does not treat < and > as redirections.
curl -X POST "http://192.168.1.50:3000/api/v1/agents/<uuid>/tokens" \
  -H "Content-Type: application/json" \
  -d '{}'

# Response: {"token": "agt_abc123_<secret>", "prefix": "agt_abc123", ...}
```

**⚠️ CRITICAL:** Save the full token securely. It will only be shown once.

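
If the token has to sit on disk before it reaches a password manager, one option is to write it to a file that only the current user can read. This is a minimal sketch; the function name `save_token` and any file path passed to it are hypothetical, not part of AgentHub.

```bash
# Hypothetical helper: write an API token to a file readable only by the current user.
# Usage: save_token TOKEN FILE
save_token() {
  local token="$1" file="$2"
  # umask 077 in a subshell creates the file with mode 600
  # without changing the caller's umask.
  ( umask 077 && printf '%s\n' "$token" > "$file" )
  # Also tighten permissions if the file already existed.
  chmod 600 "$file"
}
```

For example, `save_token "$TOKEN" "$HOME/.agenthub-token"` stores the value held in a `TOKEN` shell variable.
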
### Maintenance

**Update to new version:**

```bash
# Set TAG in .env to the new git sha (replaces the existing line
# instead of appending a duplicate TAG entry)
sed -i 's/^TAG=.*/TAG=abc1234/' .env

# Pull new image
docker compose -f compose.lan.yml pull

# Restart services (zero downtime not guaranteed in Phase 1)
docker compose -f compose.lan.yml up -d

# Run migrations if schema changed
docker compose -f compose.lan.yml exec agenthub npm run migrate
```

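
Changing `TAG` by hand is easy to get wrong when the key is missing or appears twice. The following sketch sets a key in an env file idempotently. The function name `set_env_var` is an assumption for illustration; note that it moves the key to the end of the file and drops any inline comment on the old line.

```bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing existing
# assignments of KEY if present and appending otherwise.
# Usage: set_env_var FILE KEY VALUE
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  # Drop every existing assignment of KEY (grep exits 1 when nothing is left).
  grep -v -E "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Overwrite in place so the original file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, `set_env_var .env TAG abc1234` leaves exactly one `TAG=` line in `.env`.
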
**Backup database:**

```bash
docker compose -f compose.lan.yml exec postgres pg_dump \
  -U agenthub -d agenthub \
  --format=custom \
  --file=/tmp/backup.dump

docker compose -f compose.lan.yml cp postgres:/tmp/backup.dump ./backup_$(date +%Y%m%d).dump
```

**Restore database:**

```bash
# Copy backup into container
docker compose -f compose.lan.yml cp ./backup_20260502.dump postgres:/tmp/restore.dump

# Stop agenthub (prevent writes)
docker compose -f compose.lan.yml stop agenthub

# Restore (--if-exists avoids errors when --clean drops objects
# that are not present in the target database)
docker compose -f compose.lan.yml exec postgres pg_restore \
  -U agenthub -d agenthub \
  --clean --if-exists \
  /tmp/restore.dump

# Restart agenthub
docker compose -f compose.lan.yml start agenthub
```

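
Before running the restore above, it can help to confirm that the file really is a custom-format dump and not, for example, an empty file left behind by a failed backup. The sketch below relies on the fact that `pg_dump --format=custom` archives begin with the ASCII bytes `PGDMP`. The function name `is_pg_custom_dump` is hypothetical, and the check is only a quick sanity test, not a full integrity check.

```bash
# Hypothetical helper: succeed if FILE looks like a pg_dump custom-format archive.
# Usage: is_pg_custom_dump FILE
is_pg_custom_dump() {
  local file="$1"
  # Reject missing or empty files first.
  [ -s "$file" ] || return 1
  # Custom-format archives start with the magic string "PGDMP".
  [ "$(head -c 5 "$file")" = "PGDMP" ]
}
```

For example: `is_pg_custom_dump ./backup_20260502.dump || echo "not a custom-format dump"`.
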
**View logs:**

```bash
# Follow logs
docker compose -f compose.lan.yml logs -f

# Last 100 lines
docker compose -f compose.lan.yml logs --tail=100

# Filter by service
docker compose -f compose.lan.yml logs -f agenthub
```

---

## Phase 2: Coolify Deployment

**Status:** Planned for Phase 2 (not yet deployed)

### Architecture

```
Coolify Server (agenthub.barodine.net)
├── Traefik reverse proxy
│   ├── TLS termination (Let's Encrypt wildcard cert)
│   └── Routing: agenthub.barodine.net → agenthub container
│
├── agenthub container
│   ├── Internal port 3000 (not exposed to host)
│   └── Labels for Traefik autodiscovery
│
└── PostgreSQL 16
    └── Managed by Coolify (persistent volume)
```

**Security improvements over Phase 1:**

- ✅ **HTTPS/WSS** (TLS 1.3, Let's Encrypt)
- ✅ **HSTS headers** (Strict-Transport-Security)
- ✅ **Automated certificate renewal**
- ✅ **Internal-only container network** (no direct port exposure)

### Deployment Guide

**Full guide:** [`DEPLOY-COOLIFY.md`](./DEPLOY-COOLIFY.md)

**Summary steps:**

1. **Push image to registry:**

   ```bash
   docker build -t registry.barodine.net/agenthub:latest .
   docker push registry.barodine.net/agenthub:latest
   ```

2. **Create Coolify resource** via web UI or API:

   - Type: Docker Compose
   - Repository: `registry.barodine.net/agenthub`
   - Compose file: `compose.coolify.yml`

3. **Set environment variables** in Coolify UI:

   - `JWT_SECRET` (generate new for production)
   - `POSTGRES_PASSWORD`
   - `ALLOWED_ORIGINS=https://agenthub.barodine.net`
   - `NODE_ENV=production`

4. **Deploy** via Coolify webhook or manual trigger

5. **Verify:**

   ```bash
   curl https://agenthub.barodine.net/healthz
   ```

**Migration from Phase 1:**

1. Back up the Phase 1 database (see "Backup database" under Maintenance)
2. Deploy Phase 2 (Coolify)
3. Restore the backup into the Phase 2 database
4. Update agent configs to point to `https://agenthub.barodine.net`
5. Rotate `JWT_SECRET` (agents will re-authenticate)

---

## Environment Variables Reference

### Required

| Variable | Description | Example |
|----------|-------------|---------|
| `JWT_SECRET` | 32+ byte secret for HS256 JWT signing | `openssl rand -base64 32` |
| `POSTGRES_PASSWORD` | Database password | `openssl rand -base64 24` |

### Optional (with defaults)

| Variable | Default | Description |
|----------|---------|-------------|
| `NODE_ENV` | `development` | `development` \| `test` \| `production` |
| `HOST` | `0.0.0.0` | Bind address (use 0.0.0.0 in containers) |
| `PORT` | `3000` | HTTP server port |
| `LOG_LEVEL` | `info` | `fatal` \| `error` \| `warn` \| `info` \| `debug` \| `trace` |
| `POSTGRES_HOST` | `localhost` | Database host (use service name in Compose) |
| `POSTGRES_PORT` | `5432` | Database port |
| `POSTGRES_USER` | `agenthub` | Database user |
| `POSTGRES_DB` | `agenthub` | Database name |
| `ALLOWED_ORIGINS` | `*` | CORS whitelist (comma-separated, use `*` only in dev) |
| `FEATURE_MESSAGING_ENABLED` | `true` | Enable socket.io messaging (set `false` for testing) |

**Validation:** All variables are validated against a Zod schema at startup (`src/config.ts`). If a required variable is missing, the server exits with an explicit error.

---

## Post-Deployment Verification

**Full checklist:** [`POST-DEPLOY-VERIFICATION.md`](./POST-DEPLOY-VERIFICATION.md)

### Quick Verification (2 minutes)

```bash
# Server address (replace with your host; a literal <host> would be
# parsed by the shell as a redirection)
AGENTHUB_HOST=192.168.1.50

# 1. Health checks
curl "http://$AGENTHUB_HOST:3000/healthz"  # → 200 OK
curl "http://$AGENTHUB_HOST:3000/readyz"   # → 200 OK (DB connected)

# 2. Create test agent
AGENT_ID=$(curl -sX POST "http://$AGENTHUB_HOST:3000/api/v1/agents" \
  -H "Content-Type: application/json" \
  -d '{"name":"test-agent","displayName":"Test Agent","role":"agent"}' \
  | jq -r '.id')

# 3. Issue API token
TOKEN=$(curl -sX POST "http://$AGENTHUB_HOST:3000/api/v1/agents/$AGENT_ID/tokens" \
  -H "Content-Type: application/json" \
  -d '{}' \
  | jq -r '.token')

# 4. Exchange for JWT
JWT=$(curl -sX POST "http://$AGENTHUB_HOST:3000/api/v1/sessions" \
  -H "Authorization: Bearer $TOKEN" \
  | jq -r '.token')

# 5. Verify JWT works
curl "http://$AGENTHUB_HOST:3000/api/v1/agents" \
  -H "Authorization: Bearer $JWT"
# → Should return list of agents

# 6. Check metrics
curl -s "http://$AGENTHUB_HOST:3000/metrics" | grep agenthub_
# → Should show agenthub_* metrics
```

---

## Troubleshooting

### Container won't start

**Symptom:** `docker compose ps` shows `Exit 1` or `Restarting`

**Check logs:**

```bash
docker compose -f compose.lan.yml logs agenthub
```

**Common causes:**

1. **Missing JWT_SECRET:**

   ```
   Error: JWT_SECRET is required
   ```

   **Fix:** Add `JWT_SECRET` to `.env` (see "Generate secrets" in Step 1: Prepare Environment)

2. **Database connection failed:**

   ```
   Error: connect ECONNREFUSED 127.0.0.1:5432
   ```

   **Fix:** Ensure the Postgres container is running. An address of `127.0.0.1` inside the agenthub container also suggests that `POSTGRES_HOST` fell back to its `localhost` default; in Compose it should be the database service name.

   ```bash
   docker compose -f compose.lan.yml up -d postgres
   ```

3. **Port already in use:**

   ```
   Error: listen EADDRINUSE :::3000
   ```

   **Fix:** Check what's using port 3000:

   ```bash
   sudo lsof -i :3000
   # Kill conflicting process or change PORT in .env
   ```

### /readyz returns 503

**Symptom:**

```bash
curl http://localhost:3000/readyz
# → {"status":"not_ready","checks":{"db":"failed"},"error":"..."}
```

**Debug:**

```bash
# Check Postgres is running
docker compose -f compose.lan.yml ps postgres

# Check Postgres logs
docker compose -f compose.lan.yml logs postgres

# Test connection manually
docker compose -f compose.lan.yml exec postgres psql -U agenthub -d agenthub -c "SELECT 1"
```

**Possible causes:**

- Postgres container crashed (check logs)
- Wrong credentials in `.env`
- Network issue between containers

### Metrics not updating

**Symptom:** `agenthub_rooms_active` stays at 0 even with active connections

**Check metrics collector:**

```bash
docker compose -f compose.lan.yml logs agenthub | grep "Metrics collector"
# Should show: "✅ Metrics collector started"
```

**If not started:**

- Check logs for errors in `services/metrics-collector.ts`
- Verify `FEATURE_MESSAGING_ENABLED=true` in `.env`

### WebSocket connection refused

**Symptom:** Agent reports "Failed to connect to socket.io"

**Check:**

1. **Feature enabled:**

   ```bash
   docker compose -f compose.lan.yml exec agenthub printenv FEATURE_MESSAGING_ENABLED
   # → true
   ```

2. **CORS allowed:**

   ```bash
   # Check agent's origin is in ALLOWED_ORIGINS
   docker compose -f compose.lan.yml exec agenthub printenv ALLOWED_ORIGINS
   ```

3. **Firewall allows WebSocket upgrade** (run from the agent's machine so the request crosses the firewall):

   ```bash
   # Assumes the default socket.io path and Engine.IO v4; adjust if configured differently
   curl -i --max-time 5 "http://192.168.1.50:3000/socket.io/?EIO=4&transport=websocket" \
     -H "Connection: Upgrade" \
     -H "Upgrade: websocket" \
     -H "Sec-WebSocket-Version: 13" \
     -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ=="
   # Should return 101 Switching Protocols (or 400 if socket.io rejects)
   ```

### High memory usage

**Symptom:** Container memory exceeds expected range

**Check current usage:**

```bash
# Resolve the container ID through Compose, since the container name
# depends on the project name unless container_name is set
docker stats --no-stream "$(docker compose -f compose.lan.yml ps -q agenthub)"
```

**Expected:** 100-200 MB idle, 200-500 MB under load

**If > 500 MB:**

- Check for memory leak in `presenceStore` or `socketRateLimits`
- Review active connections: `curl -s http://localhost:3000/metrics | grep ws_connections`
- Consider restarting the container as a temporary fix
- File a bug report with a heap snapshot

---

## Backup & Disaster Recovery

### Automated Backups (Recommended)

**Cron job on deployment server:**

```bash
# Add to crontab (daily at 2 AM)
0 2 * * * cd /home/deploy/agenthub-deploy && docker compose -f compose.lan.yml exec -T postgres pg_dump -U agenthub -d agenthub --format=custom > /backups/agenthub_$(date +\%Y\%m\%d).dump
```

**Retention:** Keep last 30 days, upload to S3 for long-term storage.

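
The retention rule above could be enforced with a small script run after the nightly dump. The following is a sketch under assumptions: the function names `prune_backups` and `has_recent_backup` are hypothetical, the file pattern matches the `agenthub_YYYYMMDD.dump` names produced by the cron job, and the S3 upload is not covered here.

```bash
# Hypothetical helper: delete dumps in DIR last modified more than DAYS days ago.
# Usage: prune_backups DIR [DAYS]
prune_backups() {
  local dir="$1" days="${2:-30}"
  # -mtime +N matches files modified more than N full days ago.
  find "$dir" -maxdepth 1 -type f -name 'agenthub_*.dump' \
    -mtime +"$days" -print -delete
}

# Hypothetical helper: succeed if DIR holds a dump modified within HOURS hours.
# Usage: has_recent_backup DIR [HOURS]
has_recent_backup() {
  local dir="$1" hours="${2:-24}"
  [ -n "$(find "$dir" -maxdepth 1 -type f -name 'agenthub_*.dump' \
    -mmin -"$((hours * 60))" -print -quit)" ]
}
```

For example, `prune_backups /backups 30` removes old dumps, and `has_recent_backup /backups 24 || echo "no backup in the last 24 hours"` can flag a cron job that has silently stopped.
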
### Disaster Recovery Procedure

**Scenario:** Server hardware failure, need to restore on new machine

1. **Provision new server** (same Ubuntu version)
2. **Install Docker** (same version)
3. **Copy deployment files:**
   - `compose.lan.yml`
   - `.env` (from password manager)
4. **Pull latest backup** from S3 or network drive
5. **Start Postgres only:**

   ```bash
   docker compose -f compose.lan.yml up -d postgres
   ```

6. **Restore database** (`--if-exists` avoids errors when `--clean` runs against the empty database on the new server):

   ```bash
   docker compose -f compose.lan.yml cp ./backup_latest.dump postgres:/tmp/restore.dump
   docker compose -f compose.lan.yml exec postgres pg_restore \
     -U agenthub -d agenthub --clean --if-exists /tmp/restore.dump
   ```

7. **Start agenthub:**

   ```bash
   docker compose -f compose.lan.yml up -d agenthub
   ```

8. **Verify:** Run the post-deployment checks (see Post-Deployment Verification)

**RTO (Recovery Time Objective):** < 30 minutes
**RPO (Recovery Point Objective):** < 24 hours (daily backups)

---

## References

- **Architecture:** [`ARCHITECTURE.md`](./ARCHITECTURE.md)
- **API Documentation:** [`API.md`](./API.md)
- **Operations Runbook:** [`RUNBOOK.md`](./RUNBOOK.md)
- **Metrics Guide:** [`METRICS.md`](./METRICS.md)
- **Coolify Quick Start:** [`DEPLOY-COOLIFY-QUICKSTART.md`](./DEPLOY-COOLIFY-QUICKSTART.md)