Using an Assert to check the validity of incoming messages is an
extremely poor decision. In a debug build, it should not be that easy
for a broken or malicious remote client to crash the logrep worker.
The consequences could be even worse in non-debug builds, which will
fail to make such checks at all, leading to who-knows-what misbehavior.
Hence, promote every Assert that could possibly be triggered by wrong
or out-of-order replication messages to a full test-and-ereport.

To avoid bloating the set of messages the translation team has to cope
with, establish a policy that replication protocol violation error
reports don't need to be translated. Hence, all the new messages here
use errmsg_internal(). A couple of old messages are changed likewise
for consistency.

Along the way, fix some non-idiomatic or outright wrong uses of
hash_search().

Most of these mistakes are new with the "streaming replication"
patch (commit 464824323), but a couple go back a long way.

Back-patch as appropriate.

Discussion: https://postgr.es/m/1719083.1623351052@sss.pgh.pa.us
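
The distinction drawn in the message above, between assertions and checks on remote input, can be sketched in standalone C. This is only an illustration under stated assumptions: `CommitMsg` and `commit_msg_is_valid()` are hypothetical names, not PostgreSQL code, and the standard `assert()` macro stands in for PostgreSQL's `Assert`. The caller would be responsible for raising the actual error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical message type for illustration; not a PostgreSQL struct. */
typedef struct
{
    uint64_t    commit_lsn;
} CommitMsg;

/*
 * Validate a value that arrived from the remote side.
 * Returns true if the message matches what the receiver expects.
 */
static bool
commit_msg_is_valid(const CommitMsg *msg, uint64_t expected_lsn)
{
    /* Local programming error, not remote input: an assertion is fine here. */
    assert(msg != NULL);

    /*
     * Remote input: test explicitly, so the check is performed in every
     * build type and a bad message produces an error instead of an abort.
     */
    return msg->commit_lsn == expected_lsn;
}
```

The explicit comparison is compiled into every build, whereas the `assert()` line disappears when `NDEBUG` is defined.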
     ent = (ReorderBufferTupleCidEnt *)
         hash_search(txn->tuplecid_hash,
                     (void *) &key,
-                    HASH_ENTER | HASH_FIND,
+                    HASH_ENTER,
                     &found);
     if (!found)
     {
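
Why `HASH_ENTER | HASH_FIND` is non-idiomatic can be shown with a small standalone sketch. It assumes the action codes form a plain enumeration in the order below (a local copy believed to mirror `HASHACTION` in PostgreSQL's `hsearch.h`), not a set of bit flags. `or_is_noop()` is a hypothetical helper added only for this illustration.

```c
#include <assert.h>

/* Local copy for illustration, assumed to mirror HASHACTION in hsearch.h. */
typedef enum
{
    HASH_FIND,          /* 0 */
    HASH_ENTER,         /* 1 */
    HASH_REMOVE,        /* 2 */
    HASH_ENTER_NULL     /* 3 */
} HASHACTION;

/* Returns nonzero if OR-ing action b into action a leaves a unchanged. */
static int
or_is_noop(HASHACTION a, HASHACTION b)
{
    return (int) (a | b) == (int) a;
}
```

Under that assumption, OR-ing in `HASH_FIND` changes nothing because its value is zero, so the old call behaved like plain `HASH_ENTER` only by accident. Combining two nonzero actions would silently select a different action, for example `HASH_ENTER | HASH_REMOVE` equals `HASH_ENTER_NULL`.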
     logicalrep_read_commit(s, &commit_data);
-    Assert(commit_data.commit_lsn == remote_final_lsn);
+    if (commit_data.commit_lsn != remote_final_lsn)
+        ereport(ERROR,
+                (errcode(ERRCODE_PROTOCOL_VIOLATION),
+                 errmsg_internal("incorrect commit LSN %X/%X in commit message (expected %X/%X)",
+                                 (uint32) (commit_data.commit_lsn >> 32),
+                                 (uint32) commit_data.commit_lsn,
+                                 (uint32) (remote_final_lsn >> 32),
+                                 (uint32) remote_final_lsn)));
     /* The synchronization worker runs in single transaction. */
     if (IsTransactionState() && !am_tablesync_worker())
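
The `%X/%X` arguments in the new error message split a 64-bit LSN into its high and low 32-bit halves. A minimal standalone sketch of that formatting follows; `format_lsn()` is a hypothetical helper, not a PostgreSQL function, and it uses the portable `PRIX32` macro where the hunk uses a plain `%X` with a `uint32` cast.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write lsn into buf as "HIGH/LOW" in uppercase hex, where HIGH is the
 * upper 32 bits and LOW the lower 32 bits.  Returns snprintf's result,
 * i.e. the number of characters the full string requires.
 */
static int
format_lsn(char *buf, size_t buflen, uint64_t lsn)
{
    /* Caller error rather than remote input, so an assertion suffices. */
    assert(buf != NULL && buflen > 0);

    return snprintf(buf, buflen, "%" PRIX32 "/%" PRIX32,
                    (uint32_t) (lsn >> 32),
                    (uint32_t) lsn);
}
```

For instance, formatting the value `0x0000000100000A2B` this way yields the string `1/A2B`.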