Trigger more frequent autovacuums with relallfrozen
author Melanie Plageman <melanieplageman@gmail.com>
Mon, 3 Mar 2025 19:42:00 +0000 (14:42 -0500)
committer Melanie Plageman <melanieplageman@gmail.com>
Mon, 3 Mar 2025 19:42:00 +0000 (14:42 -0500)
Calculate the insert threshold for triggering an autovacuum of a
relation based on the number of unfrozen pages.

By only considering the unfrozen portion of the table when calculating
how many tuples to add to the insert threshold, we can trigger more
frequent vacuums of insert-heavy tables. This increases the chances of
vacuuming those pages when they still reside in shared buffers.

This also increases the number of autovacuums triggered by tuples
inserted and not by wraparound risk. We prefer to freeze these pages
during insert-triggered autovacuums, as anti-wraparound vacuums are not
automatically canceled by conflicting lock requests.

We calculate the unfrozen percentage of the table using the recently
added (99f8f3fbbc8f) relallfrozen column of pg_class.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
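
The insert-threshold calculation described in the message above can be sketched as a standalone C function. This is only an illustration under stated assumptions: the function name insert_vacuum_threshold and its parameter list are hypothetical, and in the commit itself the logic is inline in relation_needs_vacanalyze() and uses PostgreSQL's float4, int32, and Min() rather than the plain C types used here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the commit): returns the number of
 * inserted tuples that must be exceeded before an insert-triggered
 * autovacuum is considered, scaled by the unfrozen fraction of the table.
 */
float
insert_vacuum_threshold(int base_thresh, float scale_factor,
                        float reltuples, int32_t relpages,
                        int32_t relallfrozen)
{
    /* Without usable relallfrozen data, treat the whole table as unfrozen. */
    float pcnt_unfrozen = 1;

    /* A negative reltuples means "unknown"; treat it as zero. */
    if (reltuples < 0)
        reltuples = 0;

    if (relpages > 0 && relallfrozen > 0)
    {
        /* Stats may have been set manually; clamp to keep the ratio sane. */
        if (relallfrozen > relpages)
            relallfrozen = relpages;
        pcnt_unfrozen = 1 - ((float) relallfrozen / relpages);
    }

    return (float) base_thresh + scale_factor * reltuples * pcnt_unfrozen;
}
```

For example, with a base threshold of 1000, a scale factor of 0.2, 1,000,000 tuples, and 10,000 pages of which 7,500 are all-frozen, the threshold is about 51,000 inserted tuples, compared with about 201,000 when no pages are frozen.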

doc/src/sgml/catalogs.sgml
doc/src/sgml/config.sgml
src/backend/postmaster/autovacuum.c
src/backend/utils/misc/postgresql.conf.sample

index 9a21a0d6f157c9277732fa25b8ffa21e6cdf0a73..fb05063555153dee7d3eaf276f7d5e64fc84607d 100644
@@ -2072,9 +2072,10 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para>
       <para>
        Number of pages that are marked all-frozen in the table's visibility
-       map.  This is only an estimate and can be used along with
-       <structfield>relallvisible</structfield> for scheduling vacuums and
-       tuning <link linkend="runtime-config-vacuum-freezing">vacuum's freezing
+       map.  This is only an estimate used for triggering autovacuums. It can
+       also be used along with <structfield>relallvisible</structfield> for
+       scheduling manual vacuums and tuning <link
+       linkend="runtime-config-vacuum-freezing">vacuum's freezing
        behavior</link>.
 
        It is updated by
index e55700f35b8973df6687ede9a19a176d33d1f557..d2fa5f7d1a9f15dbf3e0be25bf73f7b2c4973de5 100644
@@ -8773,14 +8773,13 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
        </term>
        <listitem>
         <para>
-         Specifies a fraction of the table size to add to
-         <varname>autovacuum_vacuum_insert_threshold</varname>
-         when deciding whether to trigger a <command>VACUUM</command>.
-         The default is <literal>0.2</literal> (20% of table size).
-         This parameter can only be set in the <filename>postgresql.conf</filename>
-         file or on the server command line;
-         but the setting can be overridden for individual tables by
-         changing table storage parameters.
+         Specifies a fraction of the unfrozen pages in the table to add to
+         <varname>autovacuum_vacuum_insert_threshold</varname> when deciding
+         whether to trigger a <command>VACUUM</command>. The default is
+         <literal>0.2</literal> (20% of unfrozen pages in table). This
+         parameter can only be set in the <filename>postgresql.conf</filename>
+         file or on the server command line; but the setting can be overridden
+         for individual tables by changing table storage parameters.
         </para>
        </listitem>
       </varlistentry>
index ddb303f5201b58b30535d167e073e1e5c839e693..dfb8d068ecfe60f99efeadd4af12425eea516494 100644
@@ -2938,7 +2938,6 @@ relation_needs_vacanalyze(Oid relid,
 {
    bool        force_vacuum;
    bool        av_enabled;
-   float4      reltuples;      /* pg_class.reltuples */
 
    /* constants from reloptions or GUC variables */
    int         vac_base_thresh,
@@ -3052,7 +3051,11 @@ relation_needs_vacanalyze(Oid relid,
     */
    if (PointerIsValid(tabentry) && AutoVacuumingActive())
    {
-       reltuples = classForm->reltuples;
+       float4      pcnt_unfrozen = 1;
+       float4      reltuples = classForm->reltuples;
+       int32       relpages = classForm->relpages;
+       int32       relallfrozen = classForm->relallfrozen;
+
        vactuples = tabentry->dead_tuples;
        instuples = tabentry->ins_since_vacuum;
        anltuples = tabentry->mod_since_analyze;
@@ -3061,11 +3064,29 @@ relation_needs_vacanalyze(Oid relid,
        if (reltuples < 0)
            reltuples = 0;
 
+       /*
+        * If we have data for relallfrozen, calculate the unfrozen percentage
+        * of the table to modify insert scale factor. This helps us decide
+        * whether or not to vacuum an insert-heavy table based on the number
+        * of inserts to the more "active" part of the table.
+        */
+       if (relpages > 0 && relallfrozen > 0)
+       {
+           /*
+            * It could be the stats were updated manually and relallfrozen >
+            * relpages. Clamp relallfrozen to relpages to avoid nonsensical
+            * calculations.
+            */
+           relallfrozen = Min(relallfrozen, relpages);
+           pcnt_unfrozen = 1 - ((float4) relallfrozen / relpages);
+       }
+
        vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
        if (vac_max_thresh >= 0 && vacthresh > (float4) vac_max_thresh)
            vacthresh = (float4) vac_max_thresh;
 
-       vacinsthresh = (float4) vac_ins_base_thresh + vac_ins_scale_factor * reltuples;
+       vacinsthresh = (float4) vac_ins_base_thresh +
+           vac_ins_scale_factor * reltuples * pcnt_unfrozen;
        anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
 
        /*
index 5362ff805195f9a059c412dbbe6027824088f446..2d1de9c37bd17b8dc2c4f8fda597655bdbd6021f 100644
@@ -675,8 +675,8 @@ autovacuum_worker_slots = 16    # autovacuum worker slots to allocate
 #autovacuum_analyze_threshold = 50 # min number of row updates before
                    # analyze
 #autovacuum_vacuum_scale_factor = 0.2  # fraction of table size before vacuum
-#autovacuum_vacuum_insert_scale_factor = 0.2   # fraction of inserts over table
-                       # size before insert vacuum
+#autovacuum_vacuum_insert_scale_factor = 0.2   # fraction of unfrozen pages
+          # before insert vacuum
 #autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
 #autovacuum_vacuum_max_threshold = 100000000    # max number of row updates
                        # before vacuum; -1 disables max