Add deploy failure diagnostics and safer backend health check.
Production deploy failed with no backend logs before rollback. Print backend and postgres logs on failure, wait longer for JVM startup, and probe /api/payment/swish-info instead of vehicle lookup (no external scrape).

- Document proof-first troubleshooting in README
- No volume reset workflow; fix only after reading job logs
parent d652a5b862
commit db56fc58de
2 changed files with 31 additions and 4 deletions
@@ -64,12 +64,12 @@ jobs:
       - name: Health checks with rollback
         run: |
           echo "Waiting for services to start..."
-          sleep 20
+          sleep 30
 
           BACKEND_OK=false
-          for i in 1 2 3 4 5; do
+          for i in 1 2 3 4 5 6 7 8 9 10; do
             if docker run --rm --network bilhej-prod_default curlimages/curl:8.5.0 \
-              -s http://bilhej-backend-prod:8080/api/vehicles/ABC123 > /dev/null; then
+              -sf http://bilhej-backend-prod:8080/api/payment/swish-info > /dev/null; then
               echo "Backend is healthy"
               BACKEND_OK=true
               break
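The change from `-s` to `-sf` matters because curl without `-f` exits 0 even when the server answers with an HTTP error status, so the old probe could report "healthy" on a 500. A minimal sketch of the same retry-and-probe pattern as a reusable function follows. The names `retry` and `probe_backend` are hypothetical, and the 5-second delay in the usage comment is an assumption, since the delay between attempts is outside the hunk shown above.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
  local attempts=$1
  local delay=$2
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Probe copied from the diff above; -f makes curl exit non-zero on HTTP >= 400.
probe_backend() {
  docker run --rm --network bilhej-prod_default curlimages/curl:8.5.0 \
    -sf http://bilhej-backend-prod:8080/api/payment/swish-info > /dev/null
}

# Usage (delay value is an assumption):
#   BACKEND_OK=false
#   if retry 10 5 probe_backend; then BACKEND_OK=true; fi
```

Keeping the probe in its own function makes it easy to swap the endpoint again without touching the retry logic.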
@@ -93,12 +93,25 @@ jobs:
           if [ "$BACKEND_OK" != "true" ] || [ "$FRONTEND_OK" != "true" ]; then
             echo ""
             echo "═══════════════════════════════════════════════════"
-            echo " HEALTH CHECK FAILED — ROLLING BACK DEPLOYMENT"
+            echo " HEALTH CHECK FAILED — DIAGNOSTICS"
             echo "═══════════════════════════════════════════════════"
             echo ""
+            docker compose -p bilhej-prod -f docker-compose.prod.yml ps
+            echo ""
+            echo "--- Backend logs ---"
+            docker logs bilhej-backend-prod 2>&1 | tail -80 || true
+            echo ""
+            echo "--- Postgres logs ---"
+            docker logs bilhej-postgres-prod 2>&1 | tail -30 || true
+            echo ""
+            echo "═══════════════════════════════════════════════════"
+            echo " ROLLING BACK DEPLOYMENT"
+            echo "═══════════════════════════════════════════════════"
+            echo ""
             docker compose -p bilhej-prod -f docker-compose.prod.yml down
             echo ""
             echo "Rolled back. Containers stopped. DB volume preserved."
+            echo "Read Backend logs above to find the root cause before redeploying."
             exit 1
           fi
 
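The two `docker logs … | tail … || true` lines above follow one pattern: print the tail of a container's output and never let a missing container abort the diagnostics before the rollback runs. A minimal sketch of that pattern as a function is shown here. The name `dump_logs` is hypothetical and not part of the commit; the container names in the usage comment are the ones from the diff.

```shell
# Hypothetical helper: print the last N log lines for each named container,
# continuing even if a container does not exist.
dump_logs() {
  local lines=$1
  shift
  local name
  for name in "$@"; do
    echo "--- ${name} logs (last ${lines} lines) ---"
    # Without pipefail the pipeline's exit status is tail's, so a failing
    # `docker logs` is already ignored; `|| true` also covers shells that
    # run the step with pipefail enabled.
    docker logs "$name" 2>&1 | tail -n "$lines" || true
    echo ""
  done
}

# Usage, matching the line counts in the diff:
#   dump_logs 80 bilhej-backend-prod
#   dump_logs 30 bilhej-postgres-prod
```

Redirecting stderr into the pipe (`2>&1`) means a "no such container" error also appears in the job log, which is itself a useful diagnostic.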
README.md (14 lines changed)
@@ -311,6 +311,20 @@ Before the first deploy, complete these steps on the production server (`srvr.nu
 3. Enter a version tag (e.g., `v0.1.0`).
 4. Click **Run workflow**.
 
+### Deploy failed (backend health check)
+
+If the job passes the frontend check but the backend never becomes healthy:
+
+1. Open the failed job log and read **Backend logs** (printed before rollback).
+2. Match the error to a fix — do not guess:
+   - **`password authentication failed`** — DB credentials in the running stack do not match
+     what Postgres was initialized with; fix credentials or Postgres password to match (only
+     wipe the volume if you accept losing prod data).
+   - **`Production requires ADMIN_EMAIL and ADMIN_PASSWORD`** — add those Forgejo secrets.
+   - **Flyway / migration errors** — fix schema or migration history before redeploying.
+3. **DBeaver from your laptop** — prod Postgres binds to `127.0.0.1:5433` on the server only.
+   Use an SSH tunnel, then host `localhost` port `5433` (not `192.168.0.59` directly).
+
 ### What Happens
 
 | Step | Action |
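The "match the error to a fix" step in the README hunk above can be sketched as a small triage function that reads backend log text on stdin and prints a hint for the first matching pattern. The function name `triage_backend_log` and the hint labels are hypothetical; only the three error patterns and their fixes come from the README text. The third pattern is deliberately broad and may also match routine startup lines, so treat its output as a pointer and still read the log.

```shell
# Hypothetical triage helper: read backend logs on stdin and print a hint
# for the first README pattern that matches. Returns 1 if nothing matches.
triage_backend_log() {
  local log
  log=$(cat)
  case "$log" in
    *"password authentication failed"*)
      echo "db-credentials: make app credentials match what Postgres was initialized with"
      ;;
    *"Production requires ADMIN_EMAIL and ADMIN_PASSWORD"*)
      echo "missing-secrets: add ADMIN_EMAIL and ADMIN_PASSWORD as Forgejo secrets"
      ;;
    *[Ff]lyway*|*[Mm]igration*)
      echo "migration: fix schema or migration history before redeploying"
      ;;
    *)
      echo "unknown: no known pattern found, read the full log"
      return 1
      ;;
  esac
}

# Usage:
#   docker logs bilhej-backend-prod 2>&1 | tail -80 | triage_backend_log
```

The patterns are checked in the README's order, so a credentials error is reported even if migration-related lines appear elsewhere in the same log.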