Rollback & Recovery
This guide explains how to revert the Tenant Plane to a previous version after a failed or problematic upgrade. The rollback process restores the container image only; database schema changes are not reversed by default.
When to roll back
Roll back when:
- The health endpoint returns errors after upgrading
- Users cannot log in or encounter broken functionality
- An integration stopped working and you suspect the upgrade is the cause
- A critical performance regression is observed after upgrading
How rollback works
The upgrade script records the previous version before every upgrade. During a rollback:
- The script restores the docker-compose.yml image tag to the previous version
- It restarts the service with the old image
- It polls the health endpoint to confirm the rollback succeeded
- Database schema changes are NOT reversed: the schema stays at the migrated state
The last point is important: if the new version added a database column, that column remains after rollback. The previous version will simply ignore it. This is safe because AtlasAI migrations are purely additive — they never drop or rename columns that the old code expects.
If you need to restore the database to its pre-upgrade state, you must restore from a database backup.
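The health-polling step described above can be sketched as a small shell function. This is an illustration under stated assumptions, not the code in the actual scripts: the function name wait_for_health, the 5-second interval, and the per-request timeout are hypothetical, while the URL and the 60-second ceiling are taken from elsewhere in this guide.

```shell
# Sketch: poll a health URL until it answers with a successful HTTP status
# or the timeout expires. Name, interval, and per-request timeout are assumptions.
wait_for_health() {
  local url="${1:-http://localhost:3000/api/health}"
  local timeout="${2:-60}"   # total seconds to keep trying
  local interval="${3:-5}"   # seconds to wait between attempts
  local elapsed=0

  while [ "$elapsed" -lt "$timeout" ]; do
    # -f: exit non-zero on HTTP errors; -s: no progress output.
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

# Usage:
#   wait_for_health http://localhost:3000/api/health 60 5 \
#     && echo "rollback healthy" || echo "still unhealthy"
```

The function only checks the HTTP status, so a service that answers 200 with an error body would still pass; inspecting the response body is a reasonable extra step.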
Option 1: Automatic rollback during upgrade
If you used upgrade-tp.sh and the health check failed, the script automatically rolled back for you. Check the upgrade log to confirm:
# View the most recent upgrade log
ls -lt /var/log/atlas/upgrade-*.log | head -1
cat /var/log/atlas/upgrade-<timestamp>.log
# Check upgrade history
cat ./data/.atlas-upgrade-history
# Output example:
# 20260326T100000 1.2.2 -> 1.3.0 SUCCESS log:/var/log/atlas/upgrade-20260326T100000.log
# 20260326T120000 ROLLBACK 1.3.0 -> 1.2.2 log:/var/log/atlas/rollback-20260326T120000.log
Option 2: Manual rollback (Docker Compose)
Use this when the upgrade script auto-rollback did not run, or when you decide to roll back after the upgrade initially seemed fine.
Step 1: Identify the previous version
# The upgrade script records this automatically
cat ./data/.atlas-previous-version
# Output: 1.2.2
If the file is missing, check the upgrade history:
cat ./data/.atlas-upgrade-history
Or check Docker image history:
docker images atlasai/tenant-plane --format "table {{.Tag}}\t{{.CreatedAt}}" | sort
Step 2: Run the rollback script
# Auto-detect previous version
bash scripts/rollback-tp.sh
# Or specify the target version explicitly
ATLAS_VERSION=1.2.2 bash scripts/rollback-tp.sh
The rollback script:
- Detects the previous version from .atlas-previous-version (or uses ATLAS_VERSION)
- Updates the image tag in docker-compose.yml
- Checks if the old image is available locally (if not, pulls it)
- Restarts the service
- Polls the health endpoint for up to 60 seconds
- Records the rollback in .atlas-upgrade-history
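The version-detection step in the list above might look roughly like the sketch below. It is an assumption about the behavior, not the script's actual code: the function name resolve_previous_version is hypothetical, the precedence order (ATLAS_VERSION, then the previous-version file, then the history file) is inferred from Step 1, and the history line format is taken from the example output shown earlier.

```shell
# Sketch: choose the rollback target version.
# Assumed precedence: ATLAS_VERSION, then .atlas-previous-version,
# then the most recent SUCCESS line in .atlas-upgrade-history.
resolve_previous_version() {
  local data_dir="${1:-./data}"

  if [ -n "${ATLAS_VERSION:-}" ]; then
    printf '%s\n' "$ATLAS_VERSION"
    return 0
  fi

  if [ -s "$data_dir/.atlas-previous-version" ]; then
    # Strip whitespace in case the file ends with a newline or CR.
    tr -d '[:space:]' < "$data_dir/.atlas-previous-version"
    printf '\n'
    return 0
  fi

  if [ -f "$data_dir/.atlas-upgrade-history" ]; then
    # Assumed line format: "<timestamp> <from> -> <to> SUCCESS log:<path>"
    awk '$3 == "->" && $5 == "SUCCESS" { prev = $2 }
         END { if (prev != "") print prev; else exit 1 }' \
      "$data_dir/.atlas-upgrade-history"
    return $?
  fi

  return 1
}

# Usage:
#   target=$(resolve_previous_version ./data) || echo "no previous version found"
```

The history fallback returns the "from" version of the last successful upgrade, which may not be what you want if a rollback has already been recorded since; in that case setting ATLAS_VERSION explicitly is the safer choice.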
Step 3: Verify
# Check version
docker compose exec tenant-plane cat /app/.atlas-version
# Should show the previous version
# Check health
curl -s http://localhost:3000/api/health
Option 3: Kubernetes rollback
Kubernetes retains previous Deployment revisions, so rolling back is a single command and usually completes in under a minute.
Instant rollback (< 1 minute)
# Roll back to the immediately previous revision
kubectl rollout undo deployment/atlasai-tp -n atlasai
# Monitor the rollback
kubectl rollout status deployment/atlasai-tp -n atlasai
# Verify
kubectl get pods -n atlasai -l app=atlasai-tp
Roll back to a specific revision
# List available revisions
kubectl rollout history deployment/atlasai-tp -n atlasai
# Roll back to revision 3
kubectl rollout undo deployment/atlasai-tp --to-revision=3 -n atlasai
Roll back via Helm
# List Helm release history
helm history atlasai-tp -n atlasai
# Roll back to the previous Helm release
helm rollback atlasai-tp -n atlasai
# Roll back to a specific Helm revision number
helm rollback atlasai-tp 2 -n atlasai --wait
Restoring from a database backup
Roll back the database only if:
- The upgrade introduced a data corruption issue
- The new schema changed data in a way that breaks the old code
- You need to fully reproduce the pre-upgrade state for debugging
Warning: restoring the database overwrites all data written after the backup was taken. Any incidents, runbook executions, or configuration changes made after the upgrade will be lost.
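Because a restore discards everything written after the backup, it may be worth dumping the current database first so that post-upgrade data can still be inspected later. The sketch below is one way to do that with pg_dump. The function name snapshot_db and the db-pre-restore- file prefix are hypothetical and are not something the AtlasAI scripts are known to produce.

```shell
# Sketch: dump the current database to a timestamped, gzipped SQL file
# before overwriting it. Function name and file prefix are assumptions.
snapshot_db() {
  local db_url="$1"
  local out_dir="${2:-/var/log/atlas}"
  local out="$out_dir/db-pre-restore-$(date +%Y%m%dT%H%M%S).sql"

  # pg_dump accepts a connection URI in place of a database name.
  # Dump to a plain file first so a pg_dump failure is not hidden by a pipe.
  pg_dump "$db_url" > "$out" || { rm -f "$out"; return 1; }
  gzip "$out" || return 1
  printf '%s\n' "$out.gz"
}

# Usage:
#   snapshot_db postgresql://atlasusr:password@localhost:5432/atlas /var/log/atlas
```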
Step 1: Stop the Tenant Plane
# Docker Compose
docker compose stop tenant-plane
# Kubernetes
kubectl scale deployment/atlasai-tp --replicas=0 -n atlasai
Step 2: Find your backup
The upgrade script creates a backup before every upgrade:
ls -lh /var/log/atlas/db-backup-*.sql.gz
# Output:
# -rw-r--r-- 1 user user 45M Mar 26 10:00 db-backup-1.2.2-20260326T100000.sql.gz
Step 3: Restore
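Before running a restore, it can be worth confirming that the chosen backup exists and that the gzip archive is intact. The helpers below are a sketch: the names newest_backup and verify_backup are hypothetical, and the file-name pattern is assumed from the example listing in Step 2.

```shell
# Sketch: locate the newest backup for a version and sanity-check it.
# Helper names are assumptions; the file pattern follows the Step 2 example.
newest_backup() {
  local version="$1"
  local dir="${2:-/var/log/atlas}"
  # Timestamps like 20260326T100000 sort lexicographically,
  # so the last entry after sorting is the most recent backup.
  ls "$dir"/db-backup-"$version"-*.sql.gz 2>/dev/null | sort | tail -n 1
}

verify_backup() {
  local file="$1"
  # -s: file exists and is non-empty; gzip -t: test archive integrity.
  [ -s "$file" ] && gzip -t "$file"
}

# Usage:
#   backup=$(newest_backup 1.2.2)
#   [ -n "$backup" ] && verify_backup "$backup" && echo "OK: $backup"
```

gzip -t only proves that the archive decompresses cleanly; it does not validate the SQL inside it.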
# Using the rollback script with restore flag
RESTORE_DB=1 \
DB_BACKUP_FILE=/var/log/atlas/db-backup-1.2.2-20260326T100000.sql.gz \
DB_URL=postgresql://atlasusr:password@localhost:5432/atlas \
bash scripts/rollback-tp.sh
# Manual restore (alternative)
zcat /var/log/atlas/db-backup-1.2.2-20260326T100000.sql.gz \
| psql postgresql://atlasusr:password@localhost:5432/atlas
Step 4: Restart the Tenant Plane
# Docker Compose
docker compose up -d tenant-plane
# Kubernetes (set --replicas to the count you normally run)
kubectl scale deployment/atlasai-tp --replicas=3 -n atlasai
Recovery checklist
After any rollback:
- Health endpoint returns "status": "ok"
- Correct version is shown in GET /api/admin/license or on the Settings page
- Users can log in
- Edge agents reconnect (check last-seen timestamps in Settings → Edge Agents)
- Review the upgrade log to understand what went wrong before retrying
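The first checklist item can be scripted with a small helper. This is a sketch that assumes the health response is JSON containing a top-level "status": "ok" field, as the checklist suggests; the function name health_is_ok is hypothetical, and the check is a plain text match rather than a real JSON parse.

```shell
# Sketch: succeed only if the health response read from stdin
# contains "status": "ok". Text match only; not a full JSON parser.
health_is_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Usage:
#   curl -s http://localhost:3000/api/health | health_is_ok \
#     && echo "healthy" || echo "NOT healthy"
```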
Getting help
If rollback fails or the system is still not healthy after rollback, contact support@atlasai.com with:
- Your TP version (before and after upgrade attempt)
- The contents of the upgrade log (/var/log/atlas/upgrade-*.log)
- The output of GET /api/health
- Any relevant Docker / Kubernetes logs (docker compose logs tenant-plane or kubectl logs -n atlasai deploy/atlasai-tp)