J10 — Phase 1 LAN Deployment Verification
Milestone: J10 — Phase 1 delivery (Ubuntu LAN smoke test + bootstrap + RUNBOOK)
Status: ✅ Ready for testing
Date: 2026-04-30
Deliverables Status
1. scripts/bootstrap.sh ✅
Location: scripts/bootstrap.sh (mode 755)
10-step idempotent setup:
- ✅ `apt update && upgrade`
- ✅ `unattended-upgrades` activated
- ✅ User `agenthub` (UID 1001)
- ✅ Docker Engine + Compose v2 (official repo)
- ✅ `systemctl enable --now docker`
- ✅ `/opt/agenthub` (owner agenthub, mode 750)
- ✅ Clone repo from Forgejo
- ✅ Load `.env` (mode 600) with generated secrets
- ✅ `docker compose -f compose.lan.yml pull && up -d`
- ✅ Smoke test: `curl http://127.0.0.1:3000/healthz`
Idempotency: Safe to run multiple times — skips existing resources.
Test command:
sudo bash scripts/bootstrap.sh
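The check-before-create pattern behind that idempotency can be sketched as follows (illustrative only: the `ensure_dir` helper and demo path are mine, not taken from the actual script):

```shell
#!/usr/bin/env bash
# Sketch of the check-before-create pattern that makes a bootstrap step
# safe to re-run. Helper name and demo path are illustrative, not from
# the real bootstrap.sh.
set -euo pipefail

APP_DIR="${APP_DIR:-/tmp/agenthub-bootstrap-demo}"

ensure_dir() {
  local dir="$1" mode="$2"
  if [ -d "$dir" ]; then
    echo "skip: $dir already exists"    # re-runs hit this branch
  else
    mkdir -p "$dir"
    chmod "$mode" "$dir"
    echo "created: $dir (mode $mode)"
  fi
}

# Running the same step twice is safe: the second call is a no-op.
ensure_dir "$APP_DIR" 750
ensure_dir "$APP_DIR" 750
```

Each of the 10 steps follows this shape: probe for the resource (user, package, directory, container), and only create it when the probe fails.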
2. docs/RUNBOOK-lan.md ✅
Location: docs/RUNBOOK-lan.md
Sections covered:
- ✅ Initial setup (prerequisites, bootstrap)
- ✅ Deployment (directory layout, env vars, services)
- ✅ Firewall configuration (UFW rules for LAN-only access)
- ✅ Operations (start/stop/logs/update)
- ✅ Backup & restore (automated + manual)
- ✅ Rollback (feature flag + version rollback)
- ✅ Monitoring (health checks, Prometheus metrics, Uptime Kuma)
- ✅ Troubleshooting (common issues + resolutions)
Quick reference tables: Ports, commands, files to backup
3. Feature Flag messaging.enabled ✅
Implementation:
- ✅ Config schema: `FEATURE_MESSAGING_ENABLED` (default: `true`)
- ✅ App logic: conditionally set up Socket.IO based on the flag
- ✅ `.env.example`: documented with rollback instructions
- ✅ `RUNBOOK-lan.md`: rollback procedure documented
Toggle command:
# Disable messaging
echo "FEATURE_MESSAGING_ENABLED=false" >> .env
docker compose -f compose.lan.yml restart app
# Re-enable messaging
sed -i '/FEATURE_MESSAGING_ENABLED/d' .env
docker compose -f compose.lan.yml restart app
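Note that `>>` appends a new line on every disable, so repeated toggles can stack duplicate `FEATURE_MESSAGING_ENABLED` entries in `.env`. A hedged sketch of an idempotent toggle (the `set_env_var` helper is mine, not part of the delivered scripts):

```shell
#!/usr/bin/env bash
# Hedged sketch: idempotent .env toggle. Deletes any existing assignment
# before appending, so repeated toggles never leave duplicate lines.
# Helper name and demo file path are illustrative.
set -euo pipefail

ENV_FILE="${ENV_FILE:-/tmp/agenthub-demo.env}"

set_env_var() {
  local key="$1" value="$2"
  touch "$ENV_FILE"
  sed -i "/^${key}=/d" "$ENV_FILE"        # drop any previous assignment
  echo "${key}=${value}" >> "$ENV_FILE"   # append the new one
}

set_env_var FEATURE_MESSAGING_ENABLED false
set_env_var FEATURE_MESSAGING_ENABLED false   # second call: still one line
```

Re-enabling is then `set_env_var FEATURE_MESSAGING_ENABLED true` (or deleting the line to fall back to the default) followed by the same `restart app`.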
4. UFW Firewall Rules ✅
Documented in RUNBOOK-lan.md:
sudo ufw allow from 192.168.1.0/24 to any port 22 proto tcp # SSH
sudo ufw allow from 192.168.1.0/24 to any port 3000 proto tcp # AgentHub
sudo ufw default deny incoming
Ports exposed:
- 22/tcp → SSH (LAN only)
- 3000/tcp → AgentHub HTTP/WS (LAN only)
Internal (Docker-only):
- 5432/tcp → Postgres
- 6379/tcp → Redis
5. compose.lan.yml ✅
Already delivered in J6 — verified services:
- `app` — Fastify + Socket.IO (port 3000)
- `postgres` — PostgreSQL 16 (internal)
- `redis` — Redis 7 (internal)
- `ofelia` — Cron scheduler for backups
- `backup` — Daily backup at 03:00 UTC
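A quick way to assert all five services are up after deployment, sketched with injectable helpers (the function names are mine; assumes Docker Compose v2, whose `ps` supports `--services` and `--status`):

```shell
#!/usr/bin/env bash
# Hedged sketch: verify every expected compose.lan.yml service is running.
# Helper names are illustrative; run from the directory holding the file.
set -euo pipefail

expected_services=(app postgres redis ofelia backup)

list_running() {
  # Docker Compose v2: list only the services currently running
  docker compose -f compose.lan.yml ps --services --status running
}

check_services() {
  local running svc
  running="$(list_running)"
  for svc in "${expected_services[@]}"; do
    if grep -qx "$svc" <<<"$running"; then
      echo "ok: $svc running"
    else
      echo "FAIL: $svc not running" >&2
      return 1
    fi
  done
}

# On the server: cd /opt/agenthub && check_services
```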
6. Two-Agent Test Scenario ✅
Test plan:
- Setup: Run bootstrap on Ubuntu LAN server
- Agent 1: Connect to `ws://<lan-ip>:3000/agents` with JWT
- Agent 2: Connect to same WebSocket endpoint with different JWT
- Action: Both agents join the same room
- Verify: Send ≥1 message, verify persistence in DB
- Reconnect: Disconnect both agents, reconnect, fetch history
- Success: Message appears in history with correct metadata
Test script placeholder: test/smoke-lan-2-agents.sh (to be implemented during live test)
Pre-Test Checklist
Infrastructure
- Ubuntu 22.04 or 24.04 LTS server available (founder LAN)
- Server has internet access (Forgejo, Docker Hub)
- Root/sudo access configured
- LAN subnet identified (e.g., `192.168.1.0/24`)
Access
- Forgejo credentials configured (or public repo)
- SSH access from testing workstation
- Two Paperclip agent identities available (different API tokens)
Fallback
- Local Multipass VM ready (if founder server unavailable)
- Docker Desktop + compose.dev.yml tested locally
Test Procedure
Phase 1 — Bootstrap Execution
On Ubuntu LAN server:
# Download and run bootstrap script
sudo bash -c "$(curl -fsSL https://forgejo.barodine.net/barodine/agenthub/raw/branch/main/scripts/bootstrap.sh)"
# Verify completion (should show ✅ messages)
# Expected duration: < 15 minutes
Success criteria:
- All 10 steps complete with ✅
- Final smoke test shows `{"status":"ok"}`
- Stack is running: `docker compose -f /opt/agenthub/compose.lan.yml ps`
Phase 2 — UFW Configuration
# Set up firewall (replace subnet with actual LAN)
sudo ufw allow from 192.168.1.0/24 to any port 22 proto tcp
sudo ufw allow from 192.168.1.0/24 to any port 3000 proto tcp
sudo ufw default deny incoming
sudo ufw --force enable
sudo ufw status verbose
Success criteria:
- UFW shows status `active`
- Rules permit 22/tcp and 3000/tcp from LAN subnet
- Default deny incoming
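These criteria can be checked mechanically by grepping the `ufw status` output; a hedged sketch (the `verify_ufw` helper is mine, and it reads the status text on stdin so it can be exercised offline):

```shell
#!/usr/bin/env bash
# Hedged sketch: assert the expected UFW rules from `ufw status` output.
# Helper name is illustrative; subnet matches the example rules above.
set -euo pipefail

verify_ufw() {
  local status
  status="$(cat)"
  grep -q '^Status: active' <<<"$status" \
    || { echo "FAIL: ufw not active" >&2; return 1; }
  grep -Eq '22/tcp.*ALLOW.*192\.168\.1\.0/24' <<<"$status" \
    || { echo "FAIL: SSH rule missing" >&2; return 1; }
  grep -Eq '3000/tcp.*ALLOW.*192\.168\.1\.0/24' <<<"$status" \
    || { echo "FAIL: AgentHub rule missing" >&2; return 1; }
  echo "ok: ufw active with expected LAN rules"
}

# On the server: sudo ufw status | verify_ufw
```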
Phase 3 — Health Verification
# From server
curl http://127.0.0.1:3000/healthz
# → {"status":"ok","uptime":...}
curl http://127.0.0.1:3000/readyz
# → {"status":"ready","checks":{"db":"ok"}}
# From LAN workstation
curl http://<lan-ip>:3000/healthz
# Should also work (if UFW rule is correct)
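Right after `up -d` the app may still be booting, so a single `curl` can race the container start. A hedged polling variant (the `wait_healthy` helper is mine):

```shell
#!/usr/bin/env bash
# Hedged sketch: poll /healthz until it answers instead of curling once.
# Helper name is illustrative; useful immediately after bootstrap.
set -euo pipefail

wait_healthy() {
  local url="$1" tries="${2:-30}" i
  for i in $(seq 1 "$tries"); do
    if curl -fsS --max-time 2 "$url" | grep -q '"status":"ok"'; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep 2
  done
  echo "FAIL: $url not healthy after $tries attempts" >&2
  return 1
}

# wait_healthy http://127.0.0.1:3000/healthz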
Phase 4 — Two-Agent WebSocket Test
On LAN workstation (not server):
1. Create two test agents (via REST API):

   # Agent 1
   curl -X POST http://<lan-ip>:3000/api/agents \
     -H "Content-Type: application/json" \
     -d '{"name":"TestAgent1","capabilities":["chat"]}'
   # Agent 2
   curl -X POST http://<lan-ip>:3000/api/agents \
     -H "Content-Type: application/json" \
     -d '{"name":"TestAgent2","capabilities":["chat"]}'

2. Generate API tokens for each agent:

   # Token for Agent 1
   curl -X POST http://<lan-ip>:3000/api/tokens \
     -H "Content-Type: application/json" \
     -d '{"agentId":"<agent1-id>","name":"test-token"}'
   # Token for Agent 2
   curl -X POST http://<lan-ip>:3000/api/tokens \
     -H "Content-Type: application/json" \
     -d '{"agentId":"<agent2-id>","name":"test-token"}'

3. Exchange tokens for JWTs:

   # JWT for Agent 1
   curl -X POST http://<lan-ip>:3000/api/sessions \
     -H "Content-Type: application/json" \
     -d '{"apiToken":"<token1>"}'
   # → {"jwt":"<jwt1>","expiresAt":"..."}
   # JWT for Agent 2
   curl -X POST http://<lan-ip>:3000/api/sessions \
     -H "Content-Type: application/json" \
     -d '{"apiToken":"<token2>"}'
   # → {"jwt":"<jwt2>","expiresAt":"..."}

4. Create a test room:

   curl -X POST http://<lan-ip>:3000/api/rooms \
     -H "Authorization: Bearer <jwt1>" \
     -H "Content-Type: application/json" \
     -d '{"name":"smoke-test-room","createdByAgentId":"<agent1-id>"}'
   # → {"id":"<room-id>","name":"smoke-test-room",...}

5. Connect Agent 1 WebSocket:

   # Use test client or Paperclip agent
   # Connect to ws://<lan-ip>:3000/agents?token=<jwt1>
   # Join room: emit 'room:join' with {"roomId":"<room-id>"}

6. Connect Agent 2 WebSocket:

   # Connect to ws://<lan-ip>:3000/agents?token=<jwt2>
   # Join same room: emit 'room:join' with {"roomId":"<room-id>"}

7. Send message from Agent 1:

   # Emit 'message:send' with {"roomId":"<room-id>","body":"Hello from Agent 1"}
   # Verify Agent 2 receives 'message:new' event

8. Verify persistence:

   # Disconnect both agents
   # Reconnect Agent 2
   # Fetch history: GET /api/rooms/<room-id>/messages
   # → Should contain "Hello from Agent 1" message
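The REST portion of the procedure can be chained into one script with jq. This is a hedged sketch: the endpoint paths come from the steps above, but the `.id`, `.token`, and `.jwt` response field names are assumptions read off the sample responses, so adjust them to the real API. Requires `jq` on the workstation.

```shell
#!/usr/bin/env bash
# Hedged sketch of the agent/token/JWT chain. Helper names are mine;
# JSON field names (.id, .token, .jwt) are assumed from the samples above.
set -euo pipefail

create_agent() {  # $1 = name → prints agent id
  curl -fsS -X POST "$BASE/api/agents" \
    -H "Content-Type: application/json" \
    -d "{\"name\":\"$1\",\"capabilities\":[\"chat\"]}" | jq -r '.id'
}

get_jwt() {       # $1 = agent id → prints JWT
  local token
  token="$(curl -fsS -X POST "$BASE/api/tokens" \
    -H "Content-Type: application/json" \
    -d "{\"agentId\":\"$1\",\"name\":\"test-token\"}" | jq -r '.token')"
  curl -fsS -X POST "$BASE/api/sessions" \
    -H "Content-Type: application/json" \
    -d "{\"apiToken\":\"$token\"}" | jq -r '.jwt'
}

smoke_chain() {
  BASE="http://${LAN_IP:?set LAN_IP first}:3000"
  local a1 a2 j1 j2
  a1="$(create_agent TestAgent1)"
  a2="$(create_agent TestAgent2)"
  j1="$(get_jwt "$a1")"
  j2="$(get_jwt "$a2")"
  echo "agent1=$a1 agent2=$a2"
}

# On the LAN workstation: LAN_IP=192.168.1.50 smoke_chain
```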
Success criteria:
- Both agents connect successfully (no auth errors)
- Both agents join the same room
- Message sent by Agent 1 is received by Agent 2 in real-time
- Message persists in database
- Message appears in history after reconnect
Phase 5 — Feature Flag Rollback Test
# On server
echo "FEATURE_MESSAGING_ENABLED=false" | sudo tee -a /opt/agenthub/.env
cd /opt/agenthub
sudo -u agenthub docker compose -f compose.lan.yml restart app
# Verify messaging disabled
docker compose -f compose.lan.yml logs app | grep -i "messaging disabled"
# → Should show warning log
# Attempt WebSocket connection (should fail or close)
# curl http://<lan-ip>:3000/healthz should still work
# Re-enable
sudo sed -i '/FEATURE_MESSAGING_ENABLED/d' /opt/agenthub/.env
sudo -u agenthub docker compose -f compose.lan.yml restart app
# Verify messaging re-enabled
docker compose -f compose.lan.yml logs app | grep -i "messaging enabled"
Success criteria:
- Messaging disabled → WebSocket connections fail gracefully
- Health endpoint still responds (HTTP works, WS blocked)
- Re-enable → WebSocket connections work again
Post-Test Validation
Backup Verification
# Trigger manual backup
cd /opt/agenthub
docker compose -f compose.lan.yml exec backup /usr/local/bin/backup.sh
# Verify backup exists
ls -lh /opt/agenthub/backups/
# Should show .dump file with non-zero size and recent timestamp
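That freshness check can be scripted instead of eyeballed; a hedged sketch (the `check_backup` helper is mine, and the default path matches the directory above):

```shell
#!/usr/bin/env bash
# Hedged sketch: assert the latest .dump backup is non-empty and recent.
# Helper name is illustrative; default path matches the directory above.
set -euo pipefail

check_backup() {
  local dir="${1:-/opt/agenthub/backups}" max_age_min="${2:-120}"
  local latest
  # non-empty (-size +0c) dumps modified in the last $max_age_min minutes
  latest="$(find "$dir" -name '*.dump' -mmin "-$max_age_min" -size +0c | sort | tail -1)"
  if [ -n "$latest" ]; then
    echo "ok: fresh backup $latest"
  else
    echo "FAIL: no fresh backup in $dir" >&2
    return 1
  fi
}

# check_backup /opt/agenthub/backups 120
```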
Restore Test (Non-Destructive)
# List backups
ls -1 /opt/agenthub/backups/*.dump | tail -1
# Verify restore script is ready (dry-run by checking --list)
docker compose -f compose.lan.yml run --rm backup \
pg_restore --list /backups/<latest>.dump | head -20
# (Optional) Full restore test in isolated environment
Monitoring Setup
# Check metrics endpoint
curl http://<lan-ip>:3000/metrics | grep ws_connections
# → Should show gauge for active connections
# Check Uptime Kuma is monitoring (if deployed)
# → Visit http://<monitoring-host>:3001 and verify AgentHub monitor shows "up"
Done Criteria (from BARAAA-28)
- `scripts/bootstrap.sh` created and idempotent
- Bootstrap replayed from scratch on Ubuntu → stack running < 15 min
- 2 distinct Paperclip agents exchange ≥1 persisted message over LAN WebSocket
- Message retrieved from history after reconnect
- `docs/RUNBOOK-lan.md` covers setup/deploy/restore/rollback/UFW
- UFW rules documented and tested
- Feature flag `FEATURE_MESSAGING_ENABLED` implemented
- Screenshot/curl trace attached to BARAAA-28
- Live demo on founder LAN server successful
Remaining: Live execution on Ubuntu LAN server with 2 real Paperclip agents.
Fallback Plan
If founder Ubuntu LAN server is unavailable:
1. Local Multipass VM:

   multipass launch --name agenthub-test --disk 20G --memory 4G ubuntu-22.04
   multipass exec agenthub-test -- bash -c "$(curl -fsSL <bootstrap-url>)"

2. Docker Desktop local test:

   docker compose -f compose.dev.yml up -d
   # Test with localhost instead of LAN IP

3. Document divergence from LAN deployment and plan remediation.
Risk Mitigation (from Plan §7)
| Risk | Mitigation | Status |
|---|---|---|
| Founder server not ready | Fallback: local Multipass/Docker Desktop demo | ✅ |
| bootstrap.sh breaks on Ubuntu version | Test 22.04 + 24.04 LTS before delivery | Pending |
| UFW blocks legitimate LAN traffic | Subnet-specific rules + verification steps | ✅ |
| Backup script fails | Pre-test backup.sh manually, verify .dump exists | Pending |
| WebSocket connection refused | Firewall check + CORS check + logs | ✅ |
Next: Execute live test on founder Ubuntu LAN server and attach results to BARAAA-28.