Performing and Managing Incremental Backups Using pg_basebackup in PostgreSQL 17
PostgreSQL 17 adds incremental backups with pg_basebackup and pg_combinebackup, saving time and storage by capturing only changed data since the last backup.
Join the DZone community and get the full member experience.
Join For FreeWith the release of PostgreSQL 17, the introduction of incremental backups using the trusted pg_basebackup
tool marks a significant leap forward in database backup and restoration processes. Traditionally, database backups often involved taking periodic full backups, which, while comprehensive, are time-consuming and storage-intensive.
Incremental backups offer a smarter alternative by capturing and storing only the changes made since the last backup, dramatically reducing both backup size and the time required to create backups.
This innovation is further complemented by the new pg_combinebackup
tool, which simplifies the process of restoring backups. Enabling administrators to seamlessly combine a full backup with a series of incremental backups into a usable state pg_combinebackup
eliminates much of the manual effort involved in restoration. Together, these tools streamline backup management, optimize storage utilization, and accelerate disaster recovery, making them indispensable for modern database environments.
Prerequisites for Incremental Backups
Before initiating incremental backups, certain configurations are necessary:
1. Enable WAL summarization: Incremental backups rely on WAL (Write-Ahead Logs). Ensure the summarize_wal
variable is enabled.
postgres=# show summarize_wal;
summarize_wal
---------------
off
(1 row)
postgres=# alter system set summarize_wal to on;
ALTER SYSTEM
postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
postgres=# show summarize_wal;
summarize_wal
---------------
on
(1 row)
2. Backup manifest file: The backup_manifest
file, generated during full or incremental backups, tracks changes and serves as input for the next incremental backup.
3. Ensure proper configuration: Verify that WAL archiving and sufficient disk space are configured for backups.
Performing a Full Backup
A full backup is the foundation for incremental backups. Without it, incremental backups cannot be created.
1. Prepare your environment: Perform some operations in your database to simulate activity:
postgres=# create table test(id int, name varchar, updated_at timestamptz);
CREATE TABLE
postgres=# insert into test values (1, 'full backup', now());
INSERT 0 1
2. Execute the full backup:
pg_basebackup -D /data/postgresql/full_backup/ -F tar
-D
: Specifies the backup directory.-F tar
: Saves the backup in tar format.
3. Verify the backup:
ls -lh /data/postgresql/full_backup/
The directory includes essential files such as backup_manifest
, which details the backup and serve as input for incremental backups.
Performing Incremental Backups
Backups
Incremental backups store only changes made since the last backup, reducing backup size and time.
1. First incremental backup: After making changes to the database, create the first incremental backup:
pg_basebackup --incremental=/data/postgresql/full_backup/backup_manifest
-D /data/postgresql/incr_backup1/ -F tar
2. Second incremental backup: Perform more changes in the database, then run:
pg_basebackup --incremental=/data/postgresql/incr_backup1/backup_manifest
-D /data/postgresql/incr_backup2/ -F tar
3. Monitor backup sizes:
du -sh /data/pgsql/incr_backup1/ /data/postgresql/incr_backup2/
Combining Backups for Restoration
PostgreSQL 17 introduces pg_combinebackup
to simplify restoration by merging backups into a usable state.
What Is pg_combinebackup?
- Automated process: Combines full and incremental backups into a complete database directory, eliminating manual intervention.
- Data integrity: Ensures consistency by validating the sequence of backups and checking for missing or corrupted files.
- Simplified restoration: Speeds up disaster recovery by consolidating the restoration steps.
Restoration Process Using pg_combinebackup
1. Prepare the environment: Ensure all backups (full and incremental) are available and stored in the correct sequence. Verify that the backup_manifest
files are intact for each backup.
2. Run the Restoration command: The pg_combinebackup
tool consolidates backups into a target directory for restoration. Here’s how to use it:
3. Restore backups:
pg_combinebackup -o /data/postgresql/restored/
/data/popstgresql/full_backup/
/data/postgresql/incr_backup1/
/data/postgresql/incr_backup2/
Command breakdown:
- -o /data/postgresql/restored/: Specifies the directory where the restored database will be constructed.
- Backup Directories: Provide the paths to the full backup and incremental backups in the correct sequence. Each backup directory must include its respective backup_manifest file.
4. Validate the restoration: Once the above command completes. Check the contents of the restored directory:
ls -lh /data/postgresql/restored/
Ensure all files and directories are present and match the original database structure.
5. Start PostgreSQL: After restoration, initialize the database.
Points to Remember
Restore from Backups
1. Full backup first: Always provide backups in sequential order, starting with the full backup.
pg_combinebackup /data/postgresql/full_backup/
/data/postgresql/incr_backup1/ /data/postgresql/incr_backup2/
2. Backup sequence: Always provide backups in chronological order. If backups are out of order, pg_combinebackup
will throw an error, such as:
pg_combinebackup: error: backup at "/data/postgresql/incr_backup2/"
starts at LSN 0/4000028, but expected 0/2000028
3. Error handling: Skipping backups or providing them out of order results in errors:
pg_combinebackup: error: backup at "incr_backup2/"
starts at LSN 0/6000028, but expected 0/4000028
Key Benefits of Incremental Backups
- Efficiency: Incremental backups reduce storage requirements and backup time by focusing only on changed data.
- Reliability: The
backup_manifest
ensures that every backup is trackable and consistent, reducing the risk of data loss. - Streamlined restoration:
pg_combinebackup
Automates restoration from multiple backups, eliminating manual intervention. - Scalability: Ideal for large databases where full backups are resource-intensive.
Conclusion
PostgreSQL 17’s incremental backup support in pg_basebackup
introduces a significant leap forward in backup management. Coupled with this pg_combinebackup
, it simplifies complex backup and restoration processes, making PostgreSQL more efficient and user-friendly for modern database environments.
Opinions expressed by DZone contributors are their own.
Comments