Fix background worker not restarting after crash-and-restart cycle.
author Fujii Masao <fujii@postgresql.org>
Fri, 25 Jul 2025 09:38:36 +0000 (18:38 +0900)
committer Fujii Masao <fujii@postgresql.org>
Fri, 25 Jul 2025 09:38:36 +0000 (18:38 +0900)
Previously, if a background worker crashed (e.g., due to a SIGKILL) and
the server restarted due to restart_after_crash being enabled,
the worker was not restarted as expected. Background workers without
the never-restart flag should automatically restart in this case.

This issue was introduced in commit 28a520c0b77, which failed to reset
the rw_pid field in the RegisteredBgWorker struct for the crashed worker.

This commit fixes the problem by resetting rw_pid for all eligible
background workers during the crash-and-restart cycle.
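
The effect of the stale rw_pid can be illustrated with a minimal,
self-contained sketch. This is not the PostgreSQL source: DemoWorker,
the demo_* functions, the never_restart flag, and the PID values are
hypothetical names made up for illustration, and the sketch assumes
that the launcher treats a nonzero rw_pid as "already running".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for RegisteredBgWorker; only the fields discussed here. */
typedef struct DemoWorker
{
    int  rw_pid;         /* 0 means "no running process" */
    long rw_crashed_at;  /* 0 means "no crash recorded" */
    int  never_restart;  /* hypothetical flag modelling the never-restart setting */
} DemoWorker;

/*
 * Models the reset performed during the crash-and-restart cycle.
 * reset_pid = 0 mimics the behavior before the fix, 1 mimics it after.
 */
static void
demo_reset_after_crash(DemoWorker *workers, size_t n, int reset_pid)
{
    for (size_t i = 0; i < n; i++)
    {
        if (workers[i].never_restart)
            continue;               /* not eligible for restart */
        workers[i].rw_crashed_at = 0;
        if (reset_pid)
            workers[i].rw_pid = 0;  /* forget the dead process */
    }
}

/*
 * Models a launcher that skips any worker whose rw_pid is nonzero
 * (an assumption of this sketch). Returns the number of workers started.
 */
static int
demo_start_workers(DemoWorker *workers, size_t n)
{
    int started = 0;
    int next_pid = 1000;            /* made-up PIDs for the demo */

    assert(workers != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (workers[i].never_restart || workers[i].rw_pid != 0)
            continue;
        workers[i].rw_pid = next_pid++;
        started++;
    }
    return started;
}

/* One worker whose process was killed, leaving a stale rw_pid behind. */
static int
demo_run(int reset_pid)
{
    DemoWorker workers[] = {{4242, 1, 0}};

    demo_reset_after_crash(workers, 1, reset_pid);
    return demo_start_workers(workers, 1);
}
```

In this model, demo_run(0) starts no workers because the stale PID makes
the crashed worker look alive, while demo_run(1) starts one.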

Back-patched to v18, where the bug was introduced.

Bug fix patches were proposed by Andrey Rudometov and ChangAo Chen,
but this commit uses a different approach.

Reported-by: Andrey Rudometov <unlimitedhikari@gmail.com>
Reported-by: ChangAo Chen <cca5507@qq.com>
Author: Andrey Rudometov <unlimitedhikari@gmail.com>
Author: ChangAo Chen <cca5507@qq.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: ChangAo Chen <cca5507@qq.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Discussion: https://postgr.es/m/CAF6JsWiO=i24qYitWe6ns1sXqcL86rYxdyU+pNYk-WueKPSySg@mail.gmail.com
Discussion: https://postgr.es/m/tencent_E00A056B3953EE6440F0F40F80EC30427D09@qq.com
Backpatch-through: 18

src/backend/postmaster/bgworker.c
src/backend/postmaster/postmaster.c

index 116ddf7b835f16dcd0a2fc2de871729c48421f63..1ad65c237c34ed44f03558648dc07d75d8f4e470 100644
@@ -613,6 +613,7 @@ ResetBackgroundWorkerCrashTimes(void)
             * resetting.
             */
            rw->rw_crashed_at = 0;
+           rw->rw_pid = 0;
 
            /*
             * If there was anyone waiting for it, they're history.
index cca9b946e5384af4743a47a538c8f7ed20baf271..e01d9f0cfe81e1067a1d372327ae8b7759e2869d 100644
@@ -2630,6 +2630,13 @@ CleanupBackend(PMChild *bp,
    }
    bp = NULL;
 
+   /*
+    * In a crash case, exit immediately without resetting background worker
+    * state. However, if restart_after_crash is enabled, the background
+    * worker state (e.g., rw_pid) still needs to be reset so the worker can
+    * restart after crash recovery. This reset is handled in
+    * ResetBackgroundWorkerCrashTimes(), not here.
+    */
    if (crashed)
    {
        HandleChildCrash(bp_pid, exitstatus, procname);