In continuation of our blog series on effectively implementing backup strategies in a production environment, this article delves into the topic of Seamless Backup Strategies in PostgreSQL: Native Tools That Work. We present the following options for taking backups, exploring each in detail here, while additional strategies will be covered in dedicated posts in upcoming blogs.
Backup in Postgresql:
A backup is a reliable and secure copy of a database cluster that serves multiple purposes, such as establishing a new standby, performing Point-In-Time Recovery (PITR), or recovering from hardware failures. In PostgreSQL, backups can be executed using various strategies to optimize storage utilization, recovery speed, and overall performance.
PostgreSQL employs Write-Ahead Logging (WAL) extensively to enable incremental-like backups. Tools such as pg_basebackup (for physical backups) and pg_dump/pg_dumpall (for logical backups) play a significant role in the backup process. PITR is achieved by combining a full backup with WAL files, effectively supporting incremental recovery.
Backup Option in Postgres for Integrated Choices:
Full Backup
A full backup is a complete copy of the entire PostgreSQL database cluster (including all databases, tables, indexes, and transaction logs).
- It serves as the base for differential and incremental backups.
- Recovery from a full backup alone restores the database to the exact state at the time of the backup.
Example: pg_dumpall or pg_basebackup for physical backups.
Differential Backup
A differential backup captures only the changes made since the last full backup.
- It does not depend on previous differential backups, only on the last full backup.
- Over time, differential backups grow in size but simplify recovery (only the full backup + latest differential backup are needed).
Example: pg_basebackup only takes full backups, but WAL archiving allows differential-like recovery.
Incremental Backup
An incremental backup stores only the changes made since the last backup of any kind (full, differential, or incremental).
- This results in smaller backup sizes but requires a chain of backups for recovery.
- Recovery involves applying all incremental backups in sequence since the last full or differential backup.
Example: Using WAL archiving and replaying logs (pg_wal files) from a specific point in time.
Integrated Choices:
The two initiatives that come with PostgreSQL serve as a base for the integrated choices.
- pgdump / pgdumpall (logical)
- Pg_basebackup (physical)
Starting with the Integrated choices we have two options (pgdump / pgdumpall) and pg_basebackup respectively.
pgdump / pgdumpall (logical):
pgdump works on a single database, while pg_dumpall backs up all databases within a PostgreSQL cluster.Restoring requires re-executing the SQL scripts, which is slower but allows fine-grained control over the data.
pg_dump dumps a database as a text file or to other formats.
For this example we are using sample database dvdrental.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 | [postgres@mysql-02 ~]$ pg_dump -U postgres -p 5432 -d dvdrental -f /var/lib/pgsql/dvdrental_t1.sql pg_dumpall extracts a PostgreSQL database cluster into an SQL script file. Usage: pg_dumpall [OPTION]... General options: -f, --file=FILENAME output file name -v, --verbose verbose mode -V, --version output version information, then exit --lock-wait-timeout=TIMEOUT fail after waiting TIMEOUT for a table lock -?, --help show this help, then exit Options controlling the output content: -a, --data-only dump only the data, not the schema -c, --clean clean (drop) databases before recreating -E, --encoding=ENCODING dump the data in encoding ENCODING -g, --globals-only dump only global objects, no databases -O, --no-owner skip restoration of object ownership -r, --roles-only dump only roles, no databases or tablespaces -s, --schema-only dump only the schema, no data -S, --superuser=NAME superuser user name to use in the dump -t, --tablespaces-only dump only tablespaces, no databases or roles -x, --no-privileges do not dump privileges (grant/revoke) --binary-upgrade for use by upgrade utilities only --column-inserts dump data as INSERT commands with column names --disable-dollar-quoting disable dollar quoting, use SQL standard quoting --disable-triggers disable triggers during data-only restore --exclude-database=PATTERN exclude databases whose name matches PATTERN --extra-float-digits=NUM override default setting for extra_float_digits --if-exists use IF EXISTS when dropping objects --inserts dump data as INSERT commands, rather than COPY --load-via-partition-root load partitions via the root table --no-comments do not dump comments --no-publications do not dump publications --no-role-passwords do not dump passwords for roles --no-security-labels do not dump security label assignments --no-subscriptions do not dump subscriptions --no-sync do not wait for changes to be written safely to disk --no-table-access-method do not dump table access methods --no-tablespaces do not dump tablespace assignments --no-toast-compression do not dump TOAST compression methods --no-unlogged-table-data do not dump unlogged table data --on-conflict-do-nothing add ON CONFLICT DO NOTHING to INSERT commands --quote-all-identifiers quote all identifiers, even if not key words --rows-per-insert=NROWS number of rows per INSERT; implies --inserts --use-set-session-authorization use SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands to set ownership Connection options: -d, --dbname=CONNSTR connect using connection string -h, --host=HOSTNAME database server host or socket directory -l, --database=DBNAME alternative default database -p, --port=PORT database server port number -U, --username=NAME connect as specified database user -w, --no-password never prompt for password -W, --password force password prompt (should happen automatically) --role=ROLENAME do SET ROLE before dump If -f/--file is not used, then the SQL script will be written to the standard output. [postgres@mysql-02 ~]$ pg_dumpall -U postgres -p 5432 -f /var/lib/pgsql/dvdrental_t2.sql |
To restore the backups use the psql to restore the sql format backups (by default pg_dumpall supports only plain SQL format).
For Single database Restore:
1 | psql -U postgres -p 5432 -d dvdrental -f /var/lib/pgsql/dvdrental_t1.sql |
For Restoring all the databases, roles and globals:
1 | psql -U postgres -f /var/lib/pgsql/dvdrental_t2.sql |
Pg_basebackup (physical):
A physical backup tool that creates a binary copy of the entire PostgreSQL database cluster, including configuration files.Used for setting up replication or disaster recovery since it captures everything in its current state.Restoring is faster, as it simply requires copying files, but it lacks the flexibility of logical backups.
Limitations: In any mode, pg_basebackup ensures the server transitions in and out of backup mode seamlessly. It exclusively performs full backups of the entire database cluster, as backing up specific databases or objects is not supported. For more selective backups, tools like pg_dump should be utilized.
Let’s dive into implementation of pg_basebackup for fullbackup
pg_basebackup takes a base backup of a running PostgreSQL server.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | Usage: pg_basebackup [OPTION]... Options controlling the output: -D, --pgdata=DIRECTORY receive base backup into directory -F, --format=p|t output format (plain (default), tar) -r, --max-rate=RATE maximum transfer rate to transfer data directory (in kB/s, or use suffix "k" or "M") -R, --write-recovery-conf write configuration for replication -t, --target=TARGET[:DETAIL] backup target (if other than client) -T, --tablespace-mapping=OLDDIR=NEWDIR relocate tablespace in OLDDIR to NEWDIR --waldir=WALDIR location for the write-ahead log directory -X, --wal-method=none|fetch|stream include required WAL files with specified method -z, --gzip compress tar output -Z, --compress=[{client|server}-]METHOD[:DETAIL] compress on client or server as specified -Z, --compress=none do not compress tar output General options: -c, --checkpoint=fast|spread set fast or spread checkpointing -C, --create-slot create replication slot -l, --label=LABEL set backup label -n, --no-clean do not clean up after errors -N, --no-sync do not wait for changes to be written safely to disk -P, --progress show progress information -S, --slot=SLOTNAME replication slot to use -v, --verbose output verbose messages -V, --version output version information, then exit --manifest-checksums=SHA{224,256,384,512}|CRC32C|NONE use algorithm for manifest checksums --manifest-force-encode hex encode all file names in manifest --no-estimate-size do not estimate backup size in server side --no-manifest suppress generation of backup manifest --no-slot prevent creation of temporary replication slot --no-verify-checksums do not verify checksums -?, --help show this help, then exit Connection options: -d, --dbname=CONNSTR connection string -h, --host=HOSTNAME database server host or socket directory -p, --port=PORT database server port number -s, --status-interval=INTERVAL time between status packets sent to server (in seconds) -U, --username=NAME connect as specified database user -w, --no-password never prompt for password -W, --password force password prompt (should happen automatically) |
Command for taking fullbackup using pg_basebackup:
1 2 | [postgres@mysql-02 ~]$ pg_basebackup -U postgres -p 5432 -D /var/lib/pgsql/dvdrental_backup --progress --format=tar --gzip 53632/53632 kB (100%), 1/1 tablespace |
Explanation:
-h localhost: Specifies the host; replace
localhost
with the appropriate hostname or IP address.
-U postgres: Uses the
postgres
user for authentication.
-p 5432: Connects to the PostgreSQL server on port 5432.
-D /var/lib/pgsql/dvdrental_backup: Defines the destination directory where the backup will be stored
–progress: Shows real-time progress of the backup.
–format=tar: Saves the backup in tar format.
–gzip: Compresses the tar file to save space.
Restoring from fullbackup:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | [postgres@mysql-02 ~]$ sudo tar -xvzf /var/lib/pgsql/dvdrental_backup/base.tar.gz -C /var/lib/pgsql/extracted_backup [postgres@mysql-02 ~]$ sudo systemctl stop postgresql-16 [postgres@mysql-02 ~]$ sudo cp -r /var/lib/pgsql/extracted_backup /var/lib/pgsql/16/data [postgres@mysql-02 ~]$ sudo chown -R postgres:postgres /var/lib/pgsql/16/data [postgres@mysql-02 ~]$ sudo chmod 700 /var/lib/pgsql/16/data [postgres@mysql-02 ~]$ touch /var/lib/pgsql/16/data/recovery.signal [postgres@mysql-02 ~]$ sudo systemctl start postgresql-16 [postgres@mysql-02 ~]$ sudo systemctl status postgresql-16 ● postgresql-16.service - PostgreSQL 16 database server Loaded: loaded (/usr/lib/systemd/system/postgresql-16.service; enabled; vendor preset: disabled) Active: active (running) since Wed 2025-04-16 16:07:09 EDT; 18s ago Docs: https://www.postgresql.org/docs/16/static/ Process: 6535 ExecStartPre=/usr/pgsql-16/bin/postgresql-16-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS) Main PID: 6540 (postgres) Tasks: 8 (limit: 10938) Memory: 59.6M CGroup: /system.slice/postgresql-16.service |
We will be covering the latest feature of incremental backups using the pg_basebackup in our next blog.