Fix consistency issues with replication slot copy
author Alvaro Herrera <alvherre@alvh.no-ip.org>
Tue, 17 Mar 2020 19:13:18 +0000 (16:13 -0300)
committer Alvaro Herrera <alvherre@alvh.no-ip.org>
Tue, 17 Mar 2020 19:13:18 +0000 (16:13 -0300)
commit bcd1c3630095e48bc3b1eb0fc8e8c8a7c851eba1
tree d09a3491375f9a49aae88519b0baf601e9bbb64c
parent 31d846e0265c2c1415d7910d39d5b259b92184ea

Commit 9f06d79ef831's replication slot copying failed to
properly reserve the WAL that the slot expects to see during
DecodingContextFindStartpoint() (which sets the confirmed_flush
LSN), so concurrent activity could remove that WAL and cause the
copy process to error out.  But the copy doesn't actually *need*
that WAL anyway: instead of running decoding to find confirmed_flush,
the value can be copied from the source slot.  Fix this by rearranging
things to avoid DecodingContextFindStartpoint() (leaving the target
slot's confirmed_flush_lsn invalid when it is created), and set it
afterwards by copying the source slot's value.

Also ensure the source slot's confirmed_flush_lsn is valid.
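
The idea above can be sketched roughly as follows.  This is a
simplified, self-contained model and not the code in slotfuncs.c:
SlotModel and copy_slot_model() are hypothetical names invented for
illustration, and XLogRecPtr / InvalidXLogRecPtr are redefined locally
as stand-ins for the PostgreSQL definitions.  It assumes a logical
slot and ignores locking, persistence and WAL reservation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the PostgreSQL definitions (assumption). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, heavily reduced model of a logical replication slot. */
typedef struct SlotModel
{
    XLogRecPtr  restart_lsn;        /* oldest WAL the slot still needs */
    XLogRecPtr  confirmed_flush;    /* position confirmed by the consumer */
} SlotModel;

/*
 * Copy src into dst without running any decoding.
 *
 * Returns false, leaving dst untouched, if the source slot's
 * confirmed_flush is invalid.
 */
static bool
copy_slot_model(const SlotModel *src, SlotModel *dst)
{
    /* The source slot's confirmed_flush must be valid. */
    if (src->confirmed_flush == InvalidXLogRecPtr)
        return false;

    /*
     * The target starts out with an invalid confirmed_flush, as it would
     * when created without searching for a decoding start point.
     */
    dst->restart_lsn = src->restart_lsn;
    dst->confirmed_flush = InvalidXLogRecPtr;

    /* Afterwards, take the value from the source slot. */
    dst->confirmed_flush = src->confirmed_flush;

    return true;
}
```

Since the target's position comes directly from the source slot, the
sketch never has to read WAL, which is why the removal of WAL by
concurrent activity stops being a problem for the copy.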

Reported-by: Arseny Sher
Author: Masahiko Sawada, Arseny Sher
Discussion: https://postgr.es/m/871rr3ohbo.fsf@ars-thinkpad
src/backend/replication/logical/logical.c
src/backend/replication/slotfuncs.c