Fix cleanup lock acquisition in SPLIT_ALLOCATE_PAGE replay.
author Amit Kapila <akapila@postgresql.org>
Mon, 14 Nov 2022 05:13:33 +0000 (10:43 +0530)
committer Amit Kapila <akapila@postgresql.org>
Mon, 14 Nov 2022 05:13:33 +0000 (10:43 +0530)
During XLOG_HASH_SPLIT_ALLOCATE_PAGE replay, we were checking for a
cleanup lock on the new bucket page after acquiring an exclusive lock on
it, and raising a PANIC error on failure. However, the checkpointer may
acquire a pin on the same page before it acquires a lock on it; the
cleanup-lock check then fails and replay ends in an error. So instead,
acquire the cleanup lock on the new bucket page directly during the
XLOG_HASH_SPLIT_ALLOCATE_PAGE replay operation.

Reported-by: Andres Freund
Author: Robert Haas
Reviewed-By: Amit Kapila, Andres Freund, Vignesh C
Backpatch-through: 11
Discussion: https://postgr.es/m/20220810022617.fvjkjiauaykwrbse@awork3.anarazel.de
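
The race described above can be illustrated with a small toy model. This is
only a sketch under stated assumptions: ToyBuffer and every toy_* function
are hypothetical names invented for illustration, not PostgreSQL APIs, and
the waiting loop is a simplification of what a real cleanup-lock
acquisition does. The model treats "cleanup lock" as "exclusive lock held
while ours is the only pin".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer; NOT PostgreSQL's buffer descriptor. */
typedef struct ToyBuffer
{
	int			pin_count;		/* number of processes holding a pin */
	bool		exclusive_held;	/* content lock held exclusively */
} ToyBuffer;

static void
toy_pin(ToyBuffer *buf)
{
	buf->pin_count++;
}

static void
toy_unpin(ToyBuffer *buf)
{
	assert(buf->pin_count > 0);
	buf->pin_count--;
}

/* Cleanup is OK only if we hold the lock and ours is the sole pin. */
static bool
toy_cleanup_ok(const ToyBuffer *buf)
{
	return buf->exclusive_held && buf->pin_count == 1;
}

/*
 * Old replay behaviour (modelled): pin, take the exclusive lock, then
 * check once.  A false result corresponds to the PANIC in the old code.
 */
static bool
toy_replay_old(ToyBuffer *buf)
{
	toy_pin(buf);
	buf->exclusive_held = true;
	return toy_cleanup_ok(buf);
}

/*
 * New replay behaviour (modelled): pin, wait until all other pins are
 * gone, and only then take the lock.  The callback stands in for other
 * processes making progress while the replay process waits.
 */
static bool
toy_replay_new(ToyBuffer *buf, void (*others_progress) (ToyBuffer *))
{
	toy_pin(buf);
	while (buf->pin_count > 1)
		others_progress(buf);
	buf->exclusive_held = true;
	return toy_cleanup_ok(buf);
}

/* Models the checkpointer finishing with the page and dropping its pin. */
static void
toy_checkpointer_release(ToyBuffer *buf)
{
	toy_unpin(buf);
}
```

In this model, if a second process (standing in for the checkpointer) has
pinned the buffer before replay starts, toy_replay_old() reports failure,
whereas toy_replay_new() waits for the pin to be dropped and then succeeds.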

src/backend/access/hash/hash_xlog.c
src/backend/access/hash/hashpage.c

diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index e88213c74253a8533b462620375feebbcef00d03..a24a1c39081785113e0308f937907bdd42a0abb8 100644
@@ -351,11 +351,10 @@ hash_xlog_split_allocate_page(XLogReaderState *record)
        }
 
        /* replay the record for new bucket */
-       newbuf = XLogInitBufferForRedo(record, 1);
+       XLogReadBufferForRedoExtended(record, 1, RBM_ZERO_AND_CLEANUP_LOCK, true,
+                                                                 &newbuf);
        _hash_initbuf(newbuf, xlrec->new_bucket, xlrec->new_bucket,
                                  xlrec->new_bucket_flag, true);
-       if (!IsBufferCleanupOK(newbuf))
-               elog(PANIC, "hash_xlog_split_allocate_page: failed to acquire cleanup lock");
        MarkBufferDirty(newbuf);
        PageSetLSN(BufferGetPage(newbuf), lsn);
 
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index d2edcd46172fb4af4485cf9853f02726ad03c403..55b2929ad518733b0d2a7c5eb4606d00e292c103 100644
@@ -805,9 +805,13 @@ restart_expand:
        /*
         * Physically allocate the new bucket's primary page.  We want to do this
         * before changing the metapage's mapping info, in case we can't get the
-        * disk space.  Ideally, we don't need to check for cleanup lock on new
-        * bucket as no other backend could find this bucket unless meta page is
-        * updated.  However, it is good to be consistent with old bucket locking.
+        * disk space.
+        *
+        * XXX It doesn't make sense to call _hash_getnewbuf first, zeroing the
+        * buffer, and then only afterwards check whether we have a cleanup lock.
+        * However, since no scan can be accessing the buffer yet, any concurrent
+        * accesses will just be from processes like the bgwriter or checkpointer
+        * which don't care about its contents, so it doesn't really matter.
         */
        buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);
        if (!IsBufferCleanupOK(buf_nblkno))