Fix cleanup lock acquisition in SPLIT_ALLOCATE_PAGE replay.
author    Amit Kapila <akapila@postgresql.org>
          Mon, 14 Nov 2022 05:13:33 +0000 (10:43 +0530)
committer Amit Kapila <akapila@postgresql.org>
          Mon, 14 Nov 2022 05:13:33 +0000 (10:43 +0530)

During XLOG_HASH_SPLIT_ALLOCATE_PAGE replay, we took an exclusive lock on
the new bucket page, only then checked whether we held a cleanup lock on
it, and raised a PANIC if we did not. However, the checkpointer can hold a
pin on the same page, since it acquires a pin before acquiring a lock, and
in that case the check fails and replay errors out. So instead, acquire
the cleanup lock on the new bucket page directly during
XLOG_HASH_SPLIT_ALLOCATE_PAGE replay.

Reported-by: Andres Freund
Author: Robert Haas
Reviewed-By: Amit Kapila, Andres Freund, Vignesh C
Backpatch-through: 11
Discussion: https://postgr.es/m/20220810022617.fvjkjiauaykwrbse@awork3.anarazel.de
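
The race described in the message can be illustrated with a toy model. This is
a sketch only: ToyBuffer and the toy_* functions are hypothetical names invented
here, not PostgreSQL APIs, and "waiting" is simulated by draining the other
pins rather than by real inter-process synchronization. It assumes a cleanup
lock means an exclusive lock held while ours is the only pin on the buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer; NOT PostgreSQL's buffer descriptor. */
typedef struct ToyBuffer
{
    int  pin_count;         /* pins held by all processes, ours included */
    bool exclusive_locked;  /* true once we hold the exclusive lock */
} ToyBuffer;

/* Assumed definition: exclusive lock held and we are the sole pinner. */
bool
toy_cleanup_ok(const ToyBuffer *buf)
{
    return buf->exclusive_locked && buf->pin_count == 1;
}

/*
 * Old replay behaviour: pin, take the exclusive lock, then check once.
 * Returns false in the case where the real code raised a PANIC.
 */
bool
toy_replay_check_after_lock(ToyBuffer *buf)
{
    buf->pin_count++;               /* our own pin */
    buf->exclusive_locked = true;
    return toy_cleanup_ok(buf);     /* fails if anyone else holds a pin */
}

/*
 * New replay behaviour: pin, wait until every other pin is gone, and only
 * then treat the lock as held.  Each loop iteration simulates another
 * process (e.g. the checkpointer) releasing its pin.
 */
bool
toy_replay_wait_for_cleanup(ToyBuffer *buf, int *waits)
{
    buf->pin_count++;               /* our own pin */
    *waits = 0;
    while (buf->pin_count > 1)
    {
        buf->pin_count--;           /* simulated release by the other pinner */
        (*waits)++;
    }
    buf->exclusive_locked = true;
    assert(toy_cleanup_ok(buf));    /* holds by construction after the loop */
    return true;
}
```

Starting from a buffer that another process has already pinned once
(pin_count = 1), toy_replay_check_after_lock reports failure, while
toy_replay_wait_for_cleanup succeeds after one simulated wait.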

src/backend/access/hash/hash_xlog.c
src/backend/access/hash/hashpage.c

index e88213c74253a8533b462620375feebbcef00d03..a24a1c39081785113e0308f937907bdd42a0abb8 100644
@@ -351,11 +351,10 @@ hash_xlog_split_allocate_page(XLogReaderState *record)
    }
 
    /* replay the record for new bucket */
-   newbuf = XLogInitBufferForRedo(record, 1);
+   XLogReadBufferForRedoExtended(record, 1, RBM_ZERO_AND_CLEANUP_LOCK, true,
+                                 &newbuf);
    _hash_initbuf(newbuf, xlrec->new_bucket, xlrec->new_bucket,
                  xlrec->new_bucket_flag, true);
-   if (!IsBufferCleanupOK(newbuf))
-       elog(PANIC, "hash_xlog_split_allocate_page: failed to acquire cleanup lock");
    MarkBufferDirty(newbuf);
    PageSetLSN(BufferGetPage(newbuf), lsn);
 
index d2edcd46172fb4af4485cf9853f02726ad03c403..55b2929ad518733b0d2a7c5eb4606d00e292c103 100644
@@ -805,9 +805,13 @@ restart_expand:
    /*
     * Physically allocate the new bucket's primary page.  We want to do this
     * before changing the metapage's mapping info, in case we can't get the
-    * disk space.  Ideally, we don't need to check for cleanup lock on new
-    * bucket as no other backend could find this bucket unless meta page is
-    * updated.  However, it is good to be consistent with old bucket locking.
+    * disk space.
+    *
+    * XXX It doesn't make sense to call _hash_getnewbuf first, zeroing the
+    * buffer, and then only afterwards check whether we have a cleanup lock.
+    * However, since no scan can be accessing the buffer yet, any concurrent
+    * accesses will just be from processes like the bgwriter or checkpointer
+    * which don't care about its contents, so it doesn't really matter.
     */
    buf_nblkno = _hash_getnewbuf(rel, start_nblkno, MAIN_FORKNUM);
    if (!IsBufferCleanupOK(buf_nblkno))