Project: Automated Database Backups
A backup is only useful if it runs by itself, lives somewhere safe, and actually restores when you need it. In this project you will build a complete backup system for a PostgreSQL database on Ubuntu: a script that dumps the database, compresses it, removes old copies (rotation), and ships a copy offsite to Amazon S3 or another server with rsync. Then you will schedule it and, most importantly, test a real restore. This is the reliability capstone of the DevOps projects — if your server’s disk dies tonight, this is what saves you.
The single most common backup mistake is never testing a restore. A backup you have not restored is not a backup — it is a hope. We test the restore in this project, not as an afterthought.
What we are building
The full pipeline has five stages. Each one is a small, well-understood piece.
| Stage | What it does | Tool |
|---|---|---|
| Dump | Export the database to a file | pg_dump |
| Compress | Shrink the file to save space and bandwidth | gzip |
| Rotate | Delete backups older than N days | find |
| Offsite | Copy a backup off the server | aws s3 or rsync |
| Restore (test) | Rebuild the database from a backup | pg_restore / psql |
This project assumes you already have PostgreSQL installed. If not, see the database install pages in this section first. The same pattern works for MySQL — swap pg_dump for mysqldump and pg_restore for mysql.
Step 1 — Create a backup user and directory
Run backups as a dedicated, low-privilege OS user so a leaked backup key cannot touch the rest of the server. Here we keep it simple and run as postgres, the user PostgreSQL already created.
First make a place to store backups, owned by postgres.
sudo mkdir -p /var/backups/postgres
sudo chown postgres:postgres /var/backups/postgres
sudo chmod 700 /var/backups/postgres
Output:
(no output on success)
chmod 700 means only the owner can read the directory — important, because dumps contain all your data.
Step 2 — Write the backup script
Create the script at /usr/local/bin/pg-backup.sh. It dumps one database, compresses it, copies it offsite, and deletes old local copies.
sudo nano /usr/local/bin/pg-backup.sh
Paste this:
#!/usr/bin/env bash
set -euo pipefail
# --- Config -------------------------------------------------
DB_NAME="appdb"
BACKUP_DIR="/var/backups/postgres"
RETENTION_DAYS=7
S3_BUCKET="s3://my-company-db-backups/postgres"
# ------------------------------------------------------------
TIMESTAMP="$(date +%Y-%m-%d_%H-%M-%S)"
FILE="${BACKUP_DIR}/${DB_NAME}_${TIMESTAMP}.dump.gz"
# 1. Dump in custom format (-Fc) and compress in one stream.
pg_dump -Fc "$DB_NAME" | gzip > "$FILE"
# 2. Verify the file is not empty.
if [ ! -s "$FILE" ]; then
echo "ERROR: backup file is empty" >&2
exit 1
fi
# 3. Copy offsite to S3.
aws s3 cp "$FILE" "${S3_BUCKET}/" --storage-class STANDARD_IA
# 4. Rotate: delete local backups older than RETENTION_DAYS.
find "$BACKUP_DIR" -name "${DB_NAME}_*.dump.gz" -mtime +"$RETENTION_DAYS" -delete
echo "Backup complete: $FILE"
set -euo pipefail makes the script stop immediately on any error instead of silently continuing — essential for backups. -Fc produces PostgreSQL’s custom format, which pg_restore can restore selectively and in parallel.
Make it executable:
sudo chmod 750 /usr/local/bin/pg-backup.sh
sudo chown postgres:postgres /usr/local/bin/pg-backup.sh
Offsite option B — rsync instead of S3
If you prefer to copy to another server you control, replace the aws s3 cp line with rsync over SSH (a secure copy that only transfers changes):
rsync -az -e "ssh -i /home/postgres/.ssh/backup_key" \
"$FILE" [email protected]:/srv/db-backups/
| Offsite target | When to use it | When NOT to use it |
|---|---|---|
| Amazon S3 | You want cheap, durable, hands-off storage; no second server to maintain | You are fully offline / air-gapped, or cannot use a cloud provider |
| rsync to a host | You already own a second machine in another location | You only have one server — that is not “offsite” |
Step 3 — Test the script by hand first
Never schedule something you have not run manually. Run it as postgres:
sudo -u postgres /usr/local/bin/pg-backup.sh
Output:
upload: ../var/backups/postgres/appdb_2026-06-15_02-00-01.dump.gz to s3://my-company-db-backups/postgres/appdb_2026-06-15_02-00-01.dump.gz
Backup complete: /var/backups/postgres/appdb_2026-06-15_02-00-01.dump.gz
If S3 fails, run aws configure as the postgres user and set the access key, secret, and region first.
Step 4 — Schedule it with a systemd timer
You can use cron, but a systemd timer (the modern Ubuntu scheduler) gives you logs in journalctl, automatic catch-up if the server was off, and easy status checks. We will use a timer.
Create the service unit at /etc/systemd/system/pg-backup.service:
[Unit]
Description=Nightly PostgreSQL backup
[Service]
Type=oneshot
User=postgres
ExecStart=/usr/local/bin/pg-backup.sh
Create the timer at /etc/systemd/system/pg-backup.timer:
[Unit]
Description=Run PostgreSQL backup every night at 02:00
[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true
[Install]
WantedBy=timers.target
Persistent=true means if the machine was off at 02:00, the backup runs as soon as it boots. Enable and start the timer:
sudo systemctl daemon-reload
sudo systemctl enable --now pg-backup.timer
sudo systemctl list-timers pg-backup.timer
Output:
NEXT LEFT LAST PASSED UNIT ACTIVATES
Tue 2026-06-16 02:00:00 UTC 21h left - - pg-backup.timer pg-backup.service
To see the logs from the last run:
journalctl -u pg-backup.service --since today
Cron alternative
If you prefer cron, edit the postgres user’s crontab (sudo -u postgres crontab -e) and add:
0 2 * * * /usr/local/bin/pg-backup.sh >> /var/log/pg-backup.log 2>&1
Step 5 — Test a full restore
This is the part everyone skips. Restore the backup into a throwaway database and confirm the data is there. First download a backup from S3 (proving the offsite copy works):
aws s3 cp s3://my-company-db-backups/postgres/appdb_2026-06-15_02-00-01.dump.gz /tmp/
gunzip /tmp/appdb_2026-06-15_02-00-01.dump.gz
Create an empty test database and restore into it:
sudo -u postgres createdb appdb_restore_test
sudo -u postgres pg_restore -d appdb_restore_test /tmp/appdb_2026-06-15_02-00-01.dump
Now verify a known table has rows:
sudo -u postgres psql -d appdb_restore_test -c "SELECT count(*) FROM users;"
Output:
count
-------
1428
(1 row)
If the count looks right, the restore works. Drop the test database to clean up:
sudo -u postgres dropdb appdb_restore_test
Put a restore test on your calendar every quarter. Schema changes, growing data, or a moved S3 bucket can silently break restores. The only way to know it still works is to do it.
Best practices
- Test a restore on a schedule, not just once — a backup proven months ago can break silently.
- Keep at least one copy offsite and one local, so you survive both disk failure and accidental deletion.
- Encrypt backups at rest: enable S3 bucket encryption, or pipe through
gpgbefore upload, since dumps contain all your data. - Monitor that backups actually ran — alert if
pg-backup.servicefails or if no new object lands in S3 for 24 hours. - Set a sensible retention window (7 daily, 4 weekly, 12 monthly is a common pattern) instead of keeping everything forever.
- Use S3 lifecycle rules or
STANDARD_IA/Glacier storage classes to cut cost on older backups automatically. - Never store backup credentials in the script in plain text on a shared machine — use
aws configureprofiles or an SSH key with restricted access.