Handle new HOT chains in index-build table scans
author Alvaro Herrera <alvherre@alvh.no-ip.org>
Thu, 13 Aug 2020 21:33:49 +0000 (17:33 -0400)
committer Alvaro Herrera <alvherre@alvh.no-ip.org>
Thu, 13 Aug 2020 21:33:49 +0000 (17:33 -0400)
When a table is scanned by heapam_index_build_range_scan (née
IndexBuildHeapScan) under a table lock that allows concurrent data
changes, new HOT chains can sprout in a page after the scan has already
collected that page's root tuples.  This leads to an error such
as
  ERROR:  failed to find parent tuple for heap-only tuple at (X,Y) in table "tbl"
because the root tuple was not present when we first obtained the list
of the page's root tuples.  This can be fixed by re-obtaining the list
of root tuples if we see that a heap-only tuple appears to point to a
nonexistent root.
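
The retry idea can be illustrated with a toy model that does not depend
on any PostgreSQL headers.  Everything in it (ToyPage, ToyItem,
toy_get_root_tuples, toy_lookup_root, MAX_ITEMS) is a hypothetical
stand-in invented for this sketch, not code from the patch; it only
mirrors the shape of the fix, and it omits the buffer locking that the
real code needs around the rescan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ITEMS      8   /* hypothetical stand-in for MaxHeapTuplesPerPage */
#define INVALID_OFFSET 0   /* hypothetical stand-in for InvalidOffsetNumber */

typedef uint16_t Offset;   /* 1-based item offset, like OffsetNumber */

/* Toy line pointer; next_in_chain links a tuple to its HOT successor. */
typedef struct ToyItem
{
    bool    used;
    bool    heap_only;      /* true for non-root members of a HOT chain */
    Offset  next_in_chain;  /* INVALID_OFFSET at the end of a chain */
} ToyItem;

typedef struct ToyPage
{
    ToyItem items[MAX_ITEMS];
} ToyPage;

/*
 * Toy analogue of heap_get_root_tuples: after the call, roots[k - 1] is
 * the root offset of item k, or INVALID_OFFSET if no root leads to it.
 */
static void
toy_get_root_tuples(const ToyPage *page, Offset *roots)
{
    memset(roots, 0, MAX_ITEMS * sizeof(Offset));

    for (Offset off = 1; off <= MAX_ITEMS; off++)
    {
        const ToyItem *item = &page->items[off - 1];
        Offset      next;

        if (!item->used || item->heap_only)
            continue;           /* only roots start a chain walk */

        roots[off - 1] = off;   /* a root maps to itself */

        /* Follow the chain; the hop limit guards against cycles. */
        next = item->next_in_chain;
        for (int hops = 0;
             next != INVALID_OFFSET && next <= MAX_ITEMS && hops < MAX_ITEMS;
             hops++)
        {
            roots[next - 1] = off;
            next = page->items[next - 1].next_in_chain;
        }
    }
}

/*
 * Look up the root of a heap-only item.  If the cached map has no entry,
 * rebuild the map once and look again.  A result of INVALID_OFFSET after
 * the rebuild is what the caller would report as corruption.
 */
static Offset
toy_lookup_root(const ToyPage *page, Offset *roots, Offset offnum,
                int *rescans)
{
    assert(offnum >= 1 && offnum <= MAX_ITEMS);

    if (roots[offnum - 1] == INVALID_OFFSET)
    {
        toy_get_root_tuples(page, roots);
        (*rescans)++;
    }
    return roots[offnum - 1];
}
```

A caller would build the map once per page with toy_get_root_tuples and
then use toy_lookup_root for each heap-only item, so the extra scan only
happens for items that were added after the map was built.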

This was reported by Anastasia as occurring for BRIN summarization
(which has existed since 9.5), but I think it could theoretically also
happen with CREATE INDEX CONCURRENTLY (much older) or REINDEX CONCURRENTLY
(very recent).  It seems a happy coincidence that BRIN forces us to
backpatch this all the way to 9.5.

Reported-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Diagnosed-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Co-authored-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/602d8487-f0b2-5486-0088-0f372b2549fa@postgrespro.ru
Backpatch: 9.5 - master

src/backend/access/heap/pruneheap.c
src/backend/catalog/index.c

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 6ff92516eda9e44f82f0be3ff08a90a2e5df6867..62e5b96be9daf84af2e94cb4c9e4ed6992e44bf9 100644
@@ -731,7 +731,7 @@ heap_page_prune_execute(Buffer buffer,
  * root_offsets[k - 1] = j.
  *
  * The passed-in root_offsets array must have MaxHeapTuplesPerPage entries.
- * We zero out all unused entries.
+ * Unused entries are filled with InvalidOffsetNumber (zero).
  *
  * The function must be called with at least share lock on the buffer, to
  * prevent concurrent prune operations.
@@ -746,7 +746,8 @@ heap_get_root_tuples(Page page, OffsetNumber *root_offsets)
    OffsetNumber offnum,
                maxoff;
 
-   MemSet(root_offsets, 0, MaxHeapTuplesPerPage * sizeof(OffsetNumber));
+   MemSet(root_offsets, InvalidOffsetNumber,
+          MaxHeapTuplesPerPage * sizeof(OffsetNumber));
 
    maxoff = PageGetMaxOffsetNumber(page);
    for (offnum = FirstOffsetNumber; offnum <= maxoff; offnum = OffsetNumberNext(offnum))
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f5c12d3d1c9e089768f74291442f3cf1c536a7fc..f1fe529981c0e19cc14f2802720cd28748bbe33e 100644
@@ -2389,6 +2389,12 @@ IndexBuildHeapRangeScan(Relation heapRelation,
         * buffer continuously while visiting the page, so no pruning
         * operation can occur either.
         *
+        * In cases with only ShareUpdateExclusiveLock on the table, it's
+        * possible for some HOT tuples to appear that we didn't know about
+        * when we first read the page.  To handle that case, we re-obtain the
+        * list of root offsets when a HOT tuple points to a root item that we
+        * don't know about.
+        *
         * Also, although our opinions about tuple liveness could change while
         * we scan the page (due to concurrent transaction commits/aborts),
         * the chain root locations won't, so this info doesn't need to be
@@ -2659,6 +2665,20 @@ IndexBuildHeapRangeScan(Relation heapRelation,
            rootTuple = *heapTuple;
            offnum = ItemPointerGetOffsetNumber(&heapTuple->t_self);
 
+           /*
+            * If a HOT tuple points to a root that we don't know
+            * about, obtain root items afresh.  If that still fails,
+            * report it as corruption.
+            */
+           if (root_offsets[offnum - 1] == InvalidOffsetNumber)
+           {
+               Page    page = BufferGetPage(scan->rs_cbuf);
+
+               LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
+               heap_get_root_tuples(page, root_offsets);
+               LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
+           }
+
            if (!OffsetNumberIsValid(root_offsets[offnum - 1]))
                ereport(ERROR,
                        (errcode(ERRCODE_DATA_CORRUPTED),