Fix off-by-one possibly leading to skipped XLOG_RUNNING_XACTS records.
authorAndres Freund <andres@anarazel.de>
Sat, 6 May 2017 23:47:40 +0000 (16:47 -0700)
committerAndres Freund <andres@anarazel.de>
Sat, 6 May 2017 23:55:07 +0000 (16:55 -0700)
Since 6ef2eba3f57f1 ("Skip checkpoints, archiving on idle systems."),
GetLastImportantRecPtr() is used to avoid performing superfluous
checkpoints, xlog switches, running-xact records when the system is
idle.  Unfortunately the check concerning running-xact records had a
off-by-one error, leading to such records being potentially skipped
when only a single record has been inserted since the last
running-xact record.

An alternative approach would have been to change
GetLastImportantRecPtr()'s definition to point to the end of records,
but that would make the checkpoint code more complicated.

Author: Andres Freund
Discussion: https://postgr.es/m/20170505012447.wsrympaxnfis6ojt@alap3.anarazel.de
Backpatch: no, code only present in master

src/backend/postmaster/bgwriter.c
src/backend/postmaster/checkpointer.c

index dcb4cf249c507a41bd54f79dcdfe714255066ba0..48efe15e8253386ff9e7bf714581fd0ab46aa911 100644 (file)
@@ -325,10 +325,13 @@ BackgroundWriterMain(void)
 
            /*
             * Only log if enough time has passed and interesting records have
-            * been inserted since the last snapshot.
+            * been inserted since the last snapshot.  Have to compare with <=
+            * instead of < because GetLastImportantRecPtr() points at the
+            * start of a record, whereas last_snapshot_lsn points just past
+            * the end of the record.
             */
            if (now >= timeout &&
-               last_snapshot_lsn < GetLastImportantRecPtr())
+               last_snapshot_lsn <= GetLastImportantRecPtr())
            {
                last_snapshot_lsn = LogStandbySnapshot();
                last_snapshot_ts = now;
index fe9041f748cdbf7bd9be4b85cad2faa6f2ed7e85..d12db0d5a7bd4b7ed1c746ab52be1a239928507a 100644 (file)
@@ -611,7 +611,8 @@ CheckArchiveTimeout(void)
    {
        /*
         * Switch segment only when "important" WAL has been logged since the
-        * last segment switch.
+        * last segment switch (last_switch_lsn points to end of segment
+        * switch occurred in).
         */
        if (GetLastImportantRecPtr() > last_switch_lsn)
        {