When ComputeXidHorizons() was called before MyDatabaseId is set,
e.g. because a dead row in a shared relation is encountered during
InitPostgres(), the horizon for normal tables was computed too
aggressively, ignoring all backends connected to a database.

During subsequent pruning in a data table, that overly aggressive
horizon could still end up being used, possibly leading to the removal
of still-needed tuples. Not good.

This is a bug in dc7420c2c92, which the test added in 94bc27b5768 made
visible, if run with force_parallel_mode set to regress. In that case
the bug is reliably triggered, because "pruning_query" is run in a
parallel worker and the start of that parallel worker is likely to
encounter a dead row in pg_database.

The fix is trivial: Compute a more pessimistic data table horizon if
MyDatabaseId is not yet known.
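
As an illustration only, the changed condition can be isolated into a
small standalone predicate. This is a sketch, not code from the
PostgreSQL tree: the helper names counts_for_data_horizon(),
counts_for_data_horizon_old() and check_startup_case() are hypothetical,
and Oid / InvalidOid are simplified stand-ins for the real definitions.
It shows which backends hold back the data table horizon before and
after the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's Oid and InvalidOid. */
typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)

/*
 * Hypothetical helper: does a backend connected to proc_db hold back the
 * data table horizon computed by a backend whose own database is my_db?
 * Mirrors the condition after the fix.
 */
static bool
counts_for_data_horizon(bool in_recovery, Oid my_db, Oid proc_db)
{
    return in_recovery ||
        my_db == InvalidOid ||      /* own database not yet known */
        proc_db == my_db ||
        proc_db == InvalidOid;      /* no database, e.g. WalSender */
}

/* The condition before the fix, kept for comparison. */
static bool
counts_for_data_horizon_old(bool in_recovery, Oid my_db, Oid proc_db)
{
    return in_recovery ||
        proc_db == my_db ||
        proc_db == InvalidOid;
}

/*
 * A backend that is still starting up (my_db unset) looks at a backend
 * connected to an arbitrary database, here OID 5.
 */
static void
check_startup_case(void)
{
    /* New condition: the other backend is taken into account. */
    assert(counts_for_data_horizon(false, InvalidOid, 5));
    /* Old condition: it was ignored, making the horizon too aggressive. */
    assert(!counts_for_data_horizon_old(false, InvalidOid, 5));
}
```

Once my_db is set, both predicates agree, so the fix only affects
horizons computed during backend startup.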
Author: Andres Freund
Discussion: https://postgr.es/m/20201029040030.p4osrmaywhqaesd4@alap3.anarazel.de
* the shared horizon. But in recovery we cannot compute an accurate
* per-database horizon as all xids are managed via the
* KnownAssignedXids machinery.
+ *
+ * Be careful to compute a pessimistic value when MyDatabaseId is not
+ * set. If this is a backend in the process of starting up, we may not
+ * use a "too aggressive" horizon (otherwise we could end up using it
+ * to prune still needed data away). If the current backend never
+ * connects to a database that is harmless, because
+ * data_oldest_nonremovable will never be utilized.
*/
if (in_recovery ||
- proc->databaseId == MyDatabaseId ||
+ MyDatabaseId == InvalidOid || proc->databaseId == MyDatabaseId ||
proc->databaseId == 0) /* always include WalSender */
{
h->data_oldest_nonremovable =