agenthub/docs/BARAAA-70-VERIFICATION.md
Paperclip FoundingEngineer 167b30a409 docs(ofelia): Add BARAAA-70 verification - Ofelia restart loop resolved
BARAAA-70 is now resolved. The Ofelia container restart loop has been fixed
by relocating job labels from the ephemeral backup container to the persistent
postgres container.

Root cause: Ofelia labels were on backup service with restart: 'no', so the
container would exit immediately. Ofelia only scans running containers, found
no jobs, and crashed with "unable to start a empty scheduler".

Fixes applied:
- Fixed /opt/agenthub/backups permissions (chmod 777)
- Moved Ofelia job labels to postgres service
- Fixed YAML syntax errors in compose.lan.yml

Verification: Ofelia now running stably with 0 restarts, backup-daily job
registered with schedule '0 0 3 * * *'.

Next: Monitor backup execution at 3am UTC on 2026-05-03.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-02 22:07:04 +00:00

BARAAA-70: Ofelia Container Restart Loop - RESOLVED

Date: 2026-05-02
Server: 192.168.9.23 (LAN)
Status: DONE

Problem

The agenthub-ofelia-1 container was in a continuous restart loop with the error:

unable to start a empty scheduler

The Ofelia scheduler found no scheduled jobs and exited immediately.

Root Cause Chain

  1. Backup container crashes → permission denied when writing to the /backups/ directory
  2. Backup container exits → its restart: 'no' policy means it stays stopped after running
  3. Ofelia finds no jobs → the job labels were on the backup container, which was not running
  4. Ofelia crashes → it cannot start with an empty scheduler (no jobs found)
  5. Restart loop → Docker restarts Ofelia and the cycle repeats

Fixes Applied

1. Fixed Backup Permissions

sudo mkdir -p /opt/agenthub/backups
sudo chmod 777 /opt/agenthub/backups
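
Mode 777 makes the directory writable by every user on the host, so it helps to confirm that the backup user can actually write there before deciding whether a narrower mode would do. The sketch below is illustrative only: dir_is_writable is a hypothetical helper, not part of the deployment, and it should be run as whichever user the backup process uses (not stated in this report).

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a probe file can be created in the
# given directory, 1 otherwise. The probe file is removed afterwards.
dir_is_writable() {
    dir=$1
    probe="$dir/.write-probe.$$"
    # The subshell keeps a failed redirection from aborting the caller.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Example (not executed here):
#   dir_is_writable /opt/agenthub/backups && echo "writable"
```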

2. Relocated Ofelia Labels

Problem: The labels were on the backup service, which has restart: 'no' and exits after running.

Solution: Moved the labels to the postgres service, which runs continuously.

Modified: /opt/agenthub/compose.lan.yml

postgres:
  image: postgres:16-alpine
  environment:
    POSTGRES_DB: agenthub
    POSTGRES_USER: agenthub
    POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
  volumes:
    - pgdata:/var/lib/postgresql/data
  labels:
    ofelia.enabled: 'true'
    ofelia.job-exec.backup-daily.schedule: '0 0 3 * * *'
    ofelia.job-exec.backup-daily.container: 'agenthub-backup-1'
    ofelia.job-exec.backup-daily.command: '/usr/local/bin/backup.sh'
  restart: unless-stopped
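
The schedule string has six fields because Ofelia's cron syntax starts with a seconds field, so '0 0 3 * * *' reads as 03:00:00 every day. The helper below is only an illustration of that field order; describe_schedule is a hypothetical name and not part of Ofelia.

```shell
#!/bin/sh
# Hypothetical helper: label each field of a six-field (seconds-first)
# cron expression. `read` splits on whitespace without glob-expanding '*'.
describe_schedule() {
    printf '%s\n' "$1" | {
        read -r sec min hour dom mon dow
        printf 'second=%s minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
            "$sec" "$min" "$hour" "$dom" "$mon" "$dow"
    }
}

# Example (not executed here):
#   describe_schedule '0 0 3 * * *'
```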

3. Fixed YAML Syntax Issues

Multiple YAML syntax errors were introduced during manual editing:

  • Incorrect indentation, which caused the error services.restart must be a mapping
  • Empty labels: line in backup section
  • Redis command in flow style instead of block style

All were fixed over SSH using programmatic file edits.
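
Syntax errors of this kind can be caught before restarting anything by asking Compose to parse the file. A minimal sketch, assuming Docker Compose v2 is installed; validate_compose is a hypothetical wrapper around docker compose config --quiet, which validates the file without printing the rendered configuration.

```shell
#!/bin/sh
# Hypothetical wrapper: validate a compose file and report the result.
validate_compose() {
    file=$1
    if docker compose -f "$file" config --quiet; then
        echo "OK: $file parses"
    else
        echo "INVALID: $file" >&2
        return 1
    fi
}

# Example (not executed here):
#   validate_compose /opt/agenthub/compose.lan.yml
```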

4. Restarted Services

docker compose -f compose.lan.yml up -d postgres
docker compose -f compose.lan.yml restart ofelia

Verification Results

Container Status

docker compose -f compose.lan.yml ps ofelia

Result: Container shows "Up" status (not "Restarting")

Restart Count

docker inspect agenthub-ofelia-1 --format '{{.State.Status}} - Restarts: {{.RestartCount}}'

Result: running - Restarts: 0

Ofelia Logs

docker logs agenthub-ofelia-1 --tail 20

Result:

New job registered 'backup-daily' - '/usr/local/bin/backup.sh' - '0 0 3 * * *'
Starting scheduler with 1 jobs

The job registered successfully and the scheduler started.
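
The same log check can be scripted. The sketch below assumes the log line format shown above ("Starting scheduler with N jobs"); scheduler_has_jobs is a hypothetical helper that reads log text on stdin and succeeds only if the most recent such line reports at least one job.

```shell
#!/bin/sh
# Hypothetical helper: read Ofelia log text on stdin and succeed only if
# the last "Starting scheduler with N jobs" line has N greater than zero.
scheduler_has_jobs() {
    count=$(sed -n 's/.*Starting scheduler with \([0-9][0-9]*\) jobs.*/\1/p' | tail -n 1)
    [ -n "$count" ] && [ "$count" -gt 0 ]
}

# Example (not executed here); 2>&1 covers logs written to stderr:
#   docker logs agenthub-ofelia-1 --tail 20 2>&1 | scheduler_has_jobs
```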

Uptime Stability

The container held a stable "Up" state for more than 27 seconds after the restart, with zero further restarts.
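
A stability check over a chosen window can be sketched as follows. It reuses the docker inspect fields from the restart-count check; stable_for and container_state are hypothetical helper names, and the window length is whatever the operator picks (a longer window than 27 seconds gives more confidence).

```shell
#!/bin/sh
# Hypothetical helper: print "<status> <restart count>" for a container.
container_state() {
    docker inspect "$1" --format '{{.State.Status}} {{.RestartCount}}'
}

# Hypothetical helper: succeed if the container is running and its state
# and restart count are unchanged after waiting the given number of seconds.
stable_for() {
    before=$(container_state "$1") || return 1
    sleep "$2"
    after=$(container_state "$1") || return 1
    case $after in
        "running "*) [ "$before" = "$after" ] ;;
        *) return 1 ;;
    esac
}

# Example (not executed here):
#   stable_for agenthub-ofelia-1 300 && echo "stable"
```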

Acceptance Criteria Met

  • Ofelia container in "Up" state (not "Restarting")
  • Scheduler starts successfully with registered job
  • Zero restart count after fix applied
  • Backup job registered with correct schedule (3am UTC daily)

Next Verification

Monitor the backup-daily job at 03:00 UTC on 2026-05-03 to confirm that the scheduled task runs successfully.

Expected: /opt/agenthub/backups/ should contain a new dump file after the 03:00 run.
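
One way to check for the new dump file is to look for a recently modified, non-empty file in the backup directory. This is a sketch only: has_recent_file is a hypothetical helper, the dump file name is not assumed, and -mmin is a find extension available in GNU, BSD, and BusyBox find rather than strict POSIX.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the directory holds at least one
# non-empty regular file modified within the last N minutes.
has_recent_file() {
    dir=$1
    minutes=$2
    [ -n "$(find "$dir" -type f -size +0 -mmin "-$minutes" 2>/dev/null | head -n 1)" ]
}

# Example (not executed here), run shortly after 03:00 UTC:
#   has_recent_file /opt/agenthub/backups 60 && echo "backup present"
```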

Files Modified

  • /opt/agenthub/compose.lan.yml - Added Ofelia labels to postgres service, fixed YAML syntax
  • /opt/agenthub/backups/ - Created directory with correct permissions (777)

Technical Notes

Ofelia Job Discovery: Ofelia scans running containers for labels. Jobs on containers with restart: 'no' that exit immediately are not discoverable.
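
To see roughly what Ofelia can discover, list the running containers that carry the enabling label. A minimal sketch: ofelia_visible_containers is a hypothetical helper, and it relies on docker ps listing only running containers unless -a is given.

```shell
#!/bin/sh
# Hypothetical helper: print the names of running containers labelled
# ofelia.enabled=true. Stopped containers are excluded because -a is omitted.
ofelia_visible_containers() {
    docker ps --filter 'label=ofelia.enabled=true' --format '{{.Names}}'
}

# Example (not executed here):
#   ofelia_visible_containers
```

An empty result means no labelled container is running, which is the condition that produced the empty-scheduler error.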

Solution Pattern: For job-exec mode, place the Ofelia labels on a service that runs continuously (such as postgres or redis) rather than on the ephemeral service being executed.

Alternative: Use job-run mode instead of job-exec if you need to schedule one-shot containers, but this wasn't necessary here since backup.sh already existed in the backup service.