pgsql: Fix datalen calculation in tsvectorrecv().

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix datalen calculation in tsvectorrecv().
Date: 2023-10-01 17:19:54
Message-ID: E1qn06k-0070kl-ME@gemulon.postgresql.org
Lists: pgsql-committers

Fix datalen calculation in tsvectorrecv().

After receiving position data for a lexeme, tsvectorrecv()
advanced its "datalen" value by (npos+1)*sizeof(WordEntry)
where the correct calculation is (npos+1)*sizeof(WordEntryPos).
By luck, this did not render the constructed tsvector
invalid, but it did leave wasted space approximately
equal to the space consumed by the position data.
That could have several bad effects:

* Disk space is wasted if the received tsvector is stored into a
table as-is.

* A legal tsvector could get rejected with "maximum total lexeme
length exceeded" if the extra space pushes it over the MAXSTRPOS
limit.

* In edge cases, the finished tsvector could be assigned a length
larger than the allocated size of its palloc chunk, conceivably
leading to SIGSEGV when the tsvector gets copied somewhere else.
The odds of a field failure of this sort seem low, though valgrind
testing could probably have found this.
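
The second effect above can be illustrated with a simplified model of how
"datalen" accumulates across lexemes. This is only a sketch, not the code in
tsvectorrecv(): the helper names are hypothetical, the limit is assumed to be
(1 << 20) - 1, and WordEntry and WordEntryPos are assumed to be 4 and 2 bytes
wide respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of the MAXSTRPOS limit; check ts_type.h for the real one. */
#define SKETCH_MAXSTRPOS ((size_t) ((1 << 20) - 1))

/* Assumed type widths, in bytes. */
#define SKETCH_WORDENTRY_SIZE    ((size_t) 4)
#define SKETCH_WORDENTRYPOS_SIZE ((size_t) 2)

/* Round up to a multiple of 2, in the spirit of SHORTALIGN. */
static size_t
sketch_short_align(size_t n)
{
    return (n + 1) & ~(size_t) 1;
}

/*
 * Hypothetical model: total data length for nlexemes lexemes, each
 * lexlen bytes long and carrying npos positions.  With buggy != 0 the
 * per-lexeme advance uses the old (npos + 1) * sizeof(WordEntry) formula.
 */
static size_t
sketch_total_datalen(size_t nlexemes, size_t lexlen, size_t npos, int buggy)
{
    size_t      datalen = 0;

    for (size_t i = 0; i < nlexemes; i++)
    {
        datalen += lexlen;
        if (npos > 0)
        {
            datalen = sketch_short_align(datalen);
            if (buggy)
                datalen += (npos + 1) * SKETCH_WORDENTRY_SIZE;
            else
                datalen += sizeof(uint16_t) + npos * SKETCH_WORDENTRYPOS_SIZE;
        }
    }
    return datalen;
}
```

Under these assumptions, 20000 lexemes of 10 bytes with 10 positions each need
640000 bytes with the corrected formula, which is under the assumed limit,
while the old formula yields 1080000 bytes, which is over it.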

While we're here, let's express the calculation as
"sizeof(uint16) + npos * sizeof(WordEntryPos)" to avoid the type
pun implicit in the "npos + 1" formulation. That formulation is not
wrong, given that WordEntryPos had better be 2 bytes to avoid padding
problems, but the new form seems clearer.
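
For comparison, the two calculations can be written side by side. This is a
sketch only: the typedefs below are stand-ins that assume WordEntry is a
32-bit bitfield struct and WordEntryPos is a 16-bit integer, and the function
names are hypothetical rather than taken from tsvector.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for WordEntry, assumed here to be a 32-bit bitfield struct. */
typedef struct
{
    uint32_t    haspos:1,
                len:11,
                pos:20;
} WordEntry;

/* Stand-in for WordEntryPos, assumed here to be a 16-bit integer. */
typedef uint16_t WordEntryPos;

/* The "npos + 1" shortcut relies on the count and a position being the same size. */
static_assert(sizeof(WordEntryPos) == sizeof(uint16_t),
              "WordEntryPos must be as wide as the uint16 position count");

/* Old advance: multiplies by the wrong element size. */
static size_t
datalen_advance_old(size_t npos)
{
    return (npos + 1) * sizeof(WordEntry);
}

/* New advance: one uint16 count followed by npos positions. */
static size_t
datalen_advance_new(size_t npos)
{
    return sizeof(uint16_t) + npos * sizeof(WordEntryPos);
}
```

With a 4-byte WordEntry and npos = 3, the old formula advances by 16 bytes and
the new one by 8, so the 8 wasted bytes match the size of the position data
itself.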

Report and patch by Denis Erokhin. Back-patch to all supported
versions.

Discussion: https://postgr.es/m/009801d9f2d9$f29730c0$d7c59240$@datagile.ru

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/5b7b3824648d6324f649bc74713a6b35e53b91ac

Modified Files
--------------
src/backend/utils/adt/tsvector.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
