author    Peter Geoghegan    2022-12-28 16:50:47 +0000
committer Peter Geoghegan    2022-12-28 16:50:47 +0000
commit    1de58df4fec7325d91f5a8345757314be7ac05da (patch)
tree      2f9012b64175aaffb82101aa5bd424ced1e13a26 /src/include
parent    7a05425d96742acff5ebfacf307711385c88429b (diff)
Add page-level freezing to VACUUM.
Teach VACUUM to decide on whether or not to trigger freezing at the level
of whole heap pages.  Individual XIDs and MXIDs fields from tuple headers
now trigger freezing of whole pages, rather than independently triggering
freezing of each individual tuple header field.

Managing the cost of freezing over time now significantly influences when
and how VACUUM freezes.  The overall amount of WAL written is the single
most important freezing related cost, in general.  Freezing each page's
tuples together in batch allows VACUUM to take full advantage of the
freeze plan WAL deduplication optimization added by commit 9e540599.

Also teach VACUUM to trigger page-level freezing whenever it detects that
heap pruning generated an FPI.  We'll have already written a large amount
of WAL just to do that much, so it's very likely a good idea to get
freezing out of the way for the page early.  This only happens in cases
where it will directly lead to marking the page all-frozen in the
visibility map.

In most cases "freezing a page" removes all XIDs < OldestXmin, and all
MXIDs < OldestMxact.  It doesn't quite work that way in certain rare cases
involving MultiXacts, though.  It is convenient to define "freeze the
page" in a way that gives FreezeMultiXactId the leeway to put off the work
of processing an individual tuple's xmax whenever it happens to be a
MultiXactId that would require an expensive second pass to process
aggressively (allocating a new multi is especially worth avoiding here).
FreezeMultiXactId is eager when processing is cheap (as it usually is),
and lazy in the event of an individual multi that happens to require
expensive second pass processing.  This avoids regressions related to
processing of multis that page-level freezing might otherwise cause.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
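The trigger logic the message describes (freeze when some tuple makes it mandatory, or opportunistically when pruning already paid for an FPI and freezing would let the page be marked all-frozen) can be sketched as a small self-contained model. This is illustrative only, not PostgreSQL source; the struct and function names here are invented for the sketch.

```c
#include <stdbool.h>

/* Hypothetical per-page state for the sketch (not PostgreSQL's structs) */
typedef struct
{
    bool freeze_required;   /* some tuple forces freezing for safety */
    bool prune_emitted_fpi; /* heap pruning already wrote a full-page image */
    bool all_frozen_after;  /* freezing would let us set all-frozen in VM */
} PageVacState;

static bool
should_freeze_page(const PageVacState *ps)
{
    if (ps->freeze_required)
        return true;            /* mandatory: needed for relfrozenxid safety */

    /*
     * Opportunistic path: an FPI was already written for pruning, so the
     * marginal WAL cost of freezing now is low -- but only worthwhile when
     * it directly leads to marking the page all-frozen.
     */
    if (ps->prune_emitted_fpi && ps->all_frozen_after)
        return true;

    return false;
}
```

The key design point is that both triggers are page-level: no individual tuple header field decides on its own, which is what lets the freeze plans for a page batch together and deduplicate in WAL.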
Diffstat (limited to 'src/include')
-rw-r--r--   src/include/access/heapam.h   91
1 file changed, 83 insertions(+), 8 deletions(-)
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 53eb011766b..09a1993f4d7 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -113,6 +113,82 @@ typedef struct HeapTupleFreeze
OffsetNumber offset;
} HeapTupleFreeze;
+/*
+ * State used by VACUUM to track the details of freezing all eligible tuples
+ * on a given heap page.
+ *
+ * VACUUM prepares freeze plans for each page via heap_prepare_freeze_tuple
+ * calls (every tuple with storage gets its own call). This page-level freeze
+ * state is updated across each call, which ultimately determines whether or
+ * not freezing the page is required.
+ *
+ * Aside from the basic question of whether or not freezing will go ahead, the
+ * state also tracks the oldest extant XID/MXID in the table as a whole, for
+ * the purposes of advancing relfrozenxid/relminmxid values in pg_class later
+ * on. Each heap_prepare_freeze_tuple call pushes NewRelfrozenXid and/or
+ * NewRelminMxid back as required to avoid unsafe final pg_class values. Any
+ * and all unfrozen XIDs or MXIDs that remain after VACUUM finishes _must_
+ * have values >= the final relfrozenxid/relminmxid values in pg_class. This
+ * includes XIDs that remain as MultiXact members from any tuple's xmax.
+ *
+ * When the 'freeze_required' flag isn't set after all tuples are examined,
+ * final choice on freezing is made by vacuumlazy.c. It can decide to trigger
+ * freezing based on whatever criteria it deems appropriate. However, it is
+ * recommended that vacuumlazy.c avoid early freezing when freezing does not
+ * enable setting the target page all-frozen in the visibility map afterwards.
+ */
+typedef struct HeapPageFreeze
+{
+ /* Is heap_prepare_freeze_tuple caller required to freeze page? */
+ bool freeze_required;
+
+ /*
+ * "Freeze" NewRelfrozenXid/NewRelminMxid trackers.
+ *
+ * Trackers used when heap_freeze_execute_prepared freezes the page, and
+ * when the page is "nominally frozen", which happens with pages where every
+ * call to heap_prepare_freeze_tuple produced no usable freeze plan.
+ *
+ * "Nominal freezing" enables vacuumlazy.c's approach of setting a page
+ * all-frozen in the visibility map when every tuple's 'totally_frozen'
+ * result is true. That always works in the same way, independent of the
+ * need to freeze tuples, and without complicating the general rule around
+ * 'totally_frozen' results (which is that 'totally_frozen' results are
+ * only to be trusted with a page that goes on to be frozen by caller).
+ *
+ * When we freeze a page, we generally freeze all XIDs < OldestXmin, only
+ * leaving behind XIDs that are ineligible for freezing, if any. And so
+ * you might wonder why these trackers are necessary at all; why should
+ * _any_ page that VACUUM freezes _ever_ be left with XIDs/MXIDs that
+ * ratchet back the top-level NewRelfrozenXid/NewRelminMxid trackers?
+ *
+ * It is useful to use a definition of "freeze the page" that does not
+ * overspecify how MultiXacts are affected. heap_prepare_freeze_tuple
+ * generally prefers to remove Multis eagerly, but lazy processing is used
+ * in cases where laziness allows VACUUM to avoid allocating a new Multi.
+ * The "freeze the page" trackers enable this flexibility.
+ */
+ TransactionId FreezePageRelfrozenXid;
+ MultiXactId FreezePageRelminMxid;
+
+ /*
+ * "No freeze" NewRelfrozenXid/NewRelminMxid trackers.
+ *
+ * These trackers are maintained in the same way as the trackers used when
+ * VACUUM scans a page that isn't cleanup locked. Both code paths are
+ * based on the same general idea (do less work for this page during the
+ * ongoing VACUUM, at the cost of having to accept older final values).
+ *
+ * When the vacuumlazy.c caller decides to do "no freeze" processing, it must
+ * not go on to set the page all-frozen (setting the page all-visible
+ * could still be okay). heap_prepare_freeze_tuple's 'totally_frozen'
+ * results can only be used on a page that also gets frozen as instructed.
+ */
+ TransactionId NoFreezePageRelfrozenXid;
+ MultiXactId NoFreezePageRelminMxid;
+
+} HeapPageFreeze;
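The comments above describe two tracker pairs, only one of which ultimately feeds the table-level NewRelfrozenXid/NewRelminMxid. A minimal model of that selection, using only the XID side and plain integer comparison (the real code uses wraparound-aware XID arithmetic; all names and types here are simplified stand-ins, not PostgreSQL's actual code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's TransactionId (illustrative only) */
typedef uint32_t TransactionId;

/* Minimal model of HeapPageFreeze's two tracker pairs (XID side only) */
typedef struct
{
    bool          freeze_required;
    TransactionId FreezePageRelfrozenXid;   /* value if the page is frozen */
    TransactionId NoFreezePageRelfrozenXid; /* value if the page is left as is */
} PageFreezeModel;

/*
 * After all tuples on a page are examined, exactly one tracker pair feeds
 * the table-level NewRelfrozenXid: the "freeze" pair when the page is (or
 * must be) frozen, the "no freeze" pair otherwise.  The table-level value
 * only ever ratchets backwards (toward older XIDs).
 */
static TransactionId
ratchet_new_relfrozenxid(const PageFreezeModel *pagefrz, bool will_freeze,
                         TransactionId NewRelfrozenXid)
{
    TransactionId pageval;

    pageval = (will_freeze || pagefrz->freeze_required) ?
        pagefrz->FreezePageRelfrozenXid :
        pagefrz->NoFreezePageRelfrozenXid;

    return (pageval < NewRelfrozenXid) ? pageval : NewRelfrozenXid;
}
```

The "no freeze" value is generally older, which is the cost/benefit trade-off the comment names: skipping freezing does less work on this page now, but the final pg_class values must then be old enough to cover whatever was left behind.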
+
/* ----------------
* function prototypes for heap access method
*
@@ -180,19 +256,18 @@ extern TM_Result heap_lock_tuple(Relation relation, HeapTuple tuple,
extern void heap_inplace_update(Relation relation, HeapTuple tuple);
extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple,
const struct VacuumCutoffs *cutoffs,
- HeapTupleFreeze *frz, bool *totally_frozen,
- TransactionId *relfrozenxid_out,
- MultiXactId *relminmxid_out);
+ HeapPageFreeze *pagefrz,
+ HeapTupleFreeze *frz, bool *totally_frozen);
extern void heap_freeze_execute_prepared(Relation rel, Buffer buffer,
- TransactionId FreezeLimit,
+ TransactionId snapshotConflictHorizon,
HeapTupleFreeze *tuples, int ntuples);
extern bool heap_freeze_tuple(HeapTupleHeader tuple,
TransactionId relfrozenxid, TransactionId relminmxid,
TransactionId FreezeLimit, TransactionId MultiXactCutoff);
-extern bool heap_tuple_would_freeze(HeapTupleHeader tuple,
- const struct VacuumCutoffs *cutoffs,
- TransactionId *relfrozenxid_out,
- MultiXactId *relminmxid_out);
+extern bool heap_tuple_should_freeze(HeapTupleHeader tuple,
+ const struct VacuumCutoffs *cutoffs,
+ TransactionId *NoFreezePageRelfrozenXid,
+ MultiXactId *NoFreezePageRelminMxid);
extern bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple);
extern void simple_heap_insert(Relation relation, HeapTuple tup);
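The new heap_prepare_freeze_tuple signature above threads one HeapPageFreeze through every per-tuple call, and the page may be set all-frozen only when every tuple reports totally_frozen and the page actually gets frozen. A toy model of that caller loop, with heavily simplified stand-ins (the types, cutoffs, and function names below are invented for the sketch; they are not the real heapam/vacuumlazy.c interfaces):

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define FROZEN_XID ((TransactionId) 2)  /* stand-in for FrozenTransactionId */

typedef struct { TransactionId xmin; } TupleModel;
typedef struct { bool freeze_required; } PageFrzModel;

/*
 * Model of the per-tuple call: updates the shared page-level state, reports
 * whether this tuple would be fully frozen, and returns whether a freeze
 * plan is needed for it.
 */
static bool
prepare_freeze_tuple_model(const TupleModel *tup, TransactionId freeze_limit,
                           TransactionId force_limit,
                           PageFrzModel *pagefrz, bool *totally_frozen)
{
    bool already = (tup->xmin == FROZEN_XID);
    bool needs_plan = !already && tup->xmin < freeze_limit;

    /* sufficiently old XIDs leave no choice: the page must be frozen */
    if (needs_plan && tup->xmin < force_limit)
        pagefrz->freeze_required = true;

    *totally_frozen = already || needs_plan;
    return needs_plan;
}

/*
 * Model of the caller loop: accumulate page state across all tuples.
 * Returns whether the caller could set the page all-frozen in the
 * visibility map -- valid only if it also freezes the page as instructed.
 */
static bool
scan_page_model(const TupleModel *tuples, int ntuples,
                TransactionId freeze_limit, TransactionId force_limit,
                PageFrzModel *pagefrz, int *nplans)
{
    bool all_totally_frozen = true;

    pagefrz->freeze_required = false;
    *nplans = 0;

    for (int i = 0; i < ntuples; i++)
    {
        bool totally_frozen;

        if (prepare_freeze_tuple_model(&tuples[i], freeze_limit, force_limit,
                                       pagefrz, &totally_frozen))
            (*nplans)++;
        all_totally_frozen &= totally_frozen;
    }

    return all_totally_frozen;
}
```

This mirrors the shape of the interface change: the old per-tuple relfrozenxid_out/relminmxid_out output parameters are gone, replaced by one page-level state object that every call in the loop updates in place.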