When COPYing into a partitioned table that does now permit the use of
table_multi_insert(), we could error out with
ERROR: could not read block NN in file "base/...": read only 0 of 8192 bytes
because BulkInsertState->next_free was not reset between partitions. This
problem occurred only when not able to use table_multi_insert(), as a
dedicated BulkInsertState for each partition is used in that case.
The bug was introduced in
00d1e02be24, but it was hard to hit at that point,
as commonly bulk relation extension is not used when not using
table_multi_insert(). It became more likely after
82a4edabd27, which expanded
the use of bulk extension.
To fix the bug, reset the bulk relation extension state in BulkInsertState in
ReleaseBulkInsertStatePin(). That was added (in
b1ecb9b3fcf) to tackle a very
similar issue. Obviously the name is not quite correct, but there might be
external callers, and bulk insert state needs to be reset in precisely in the
situations that ReleaseBulkInsertStatePin() already needed to be called.
Medium term the better fix likely is to disallow reusing BulkInsertState
across relations.
Add a test that, without the fix, reproduces #18130 in most
configurations. The test also catches the problem fixed in
b1ecb9b3fcf when
run with small shared_buffers.
Reported-by: Ivan Kolombet <enderstd@gmail.com>
Analyzed-by: Tom Lane <tgl@sss.pgh.pa.us>
Analyzed-by: Andres Freund <andres@anarazel.de>
Bug: #18130
Discussion: https://postgr.es/m/18130-
7a86a7356a75209d%40postgresql.org
Discussion: https://postgr.es/m/257696.
1695670946%40sss.pgh.pa.us
Backpatch: 16-
if (bistate->current_buf != InvalidBuffer)
ReleaseBuffer(bistate->current_buf);
bistate->current_buf = InvalidBuffer;
+
+ /*
+ * Despite the name, we also reset bulk relation extension
+ * state. Otherwise we can end up erroring out due to looking for free
+ * space in ->next_free of one partition, even though ->next_free was set
+ * when extending another partition. It's obviously also could be bad for
+ * efficiency to look at existing blocks at offsets from another
+ * partition, even if we don't error out.
+ */
+ bistate->next_free = InvalidBlockNumber;
+ bistate->last_free = InvalidBlockNumber;
}
\.
invalid command \.
drop table oversized_column_default;
+--
+-- Create partitioned table that does not allow bulk insertions, to test bugs
+-- related to the reuse of BulkInsertState across partitions (only done when
+-- not using bulk insert). Switching between partitions often makes it more
+-- likely to encounter these bugs, so we just switch on roughly every insert
+-- by having an even/odd number partition and inserting evenly distributed
+-- data.
+--
+CREATE TABLE parted_si (
+ id int not null,
+ data text not null,
+ -- prevent use of bulk insert by having a volatile function
+ rand float8 not null default random()
+)
+PARTITION BY LIST((id % 2));
+CREATE TABLE parted_si_p_even PARTITION OF parted_si FOR VALUES IN (0);
+CREATE TABLE parted_si_p_odd PARTITION OF parted_si FOR VALUES IN (1);
+-- Test that bulk relation extension handles reusing a single BulkInsertState
+-- across partitions. Without the fix applied, this reliably reproduces
+-- #18130 unless shared_buffers is extremely small (preventing any use use of
+-- bulk relation extension). See
+-- https://postgr.es/m/18130-7a86a7356a75209d%40postgresql.org
+-- https://postgr.es/m/257696.1695670946%40sss.pgh.pa.us
+\set filename :abs_srcdir '/data/desc.data'
+COPY parted_si(id, data) FROM :'filename';
+-- An earlier bug (see commit b1ecb9b3fcf) could end up using a buffer from
+-- the wrong partition. This test is *not* guaranteed to trigger that bug, but
+-- does so when shared_buffers is small enough. To test if we encountered the
+-- bug, check that the partition condition isn't violated.
+SELECT tableoid::regclass, id % 2 = 0 is_even, count(*) from parted_si GROUP BY 1, 2 ORDER BY 1;
+ tableoid | is_even | count
+------------------+---------+-------
+ parted_si_p_even | t | 5000
+ parted_si_p_odd | f | 5000
+(2 rows)
+
+DROP TABLE parted_si;
copy oversized_column_default from stdin (default '');
\.
drop table oversized_column_default;
+
+
+--
+-- Create partitioned table that does not allow bulk insertions, to test bugs
+-- related to the reuse of BulkInsertState across partitions (only done when
+-- not using bulk insert). Switching between partitions often makes it more
+-- likely to encounter these bugs, so we just switch on roughly every insert
+-- by having an even/odd number partition and inserting evenly distributed
+-- data.
+--
+CREATE TABLE parted_si (
+ id int not null,
+ data text not null,
+ -- prevent use of bulk insert by having a volatile function
+ rand float8 not null default random()
+)
+PARTITION BY LIST((id % 2));
+
+CREATE TABLE parted_si_p_even PARTITION OF parted_si FOR VALUES IN (0);
+CREATE TABLE parted_si_p_odd PARTITION OF parted_si FOR VALUES IN (1);
+
+-- Test that bulk relation extension handles reusing a single BulkInsertState
+-- across partitions. Without the fix applied, this reliably reproduces
+-- #18130 unless shared_buffers is extremely small (preventing any use use of
+-- bulk relation extension). See
+-- https://postgr.es/m/18130-7a86a7356a75209d%40postgresql.org
+-- https://postgr.es/m/257696.1695670946%40sss.pgh.pa.us
+\set filename :abs_srcdir '/data/desc.data'
+COPY parted_si(id, data) FROM :'filename';
+
+-- An earlier bug (see commit b1ecb9b3fcf) could end up using a buffer from
+-- the wrong partition. This test is *not* guaranteed to trigger that bug, but
+-- does so when shared_buffers is small enough. To test if we encountered the
+-- bug, check that the partition condition isn't violated.
+SELECT tableoid::regclass, id % 2 = 0 is_even, count(*) from parted_si GROUP BY 1, 2 ORDER BY 1;
+
+DROP TABLE parted_si;