Build de-escaped JSON strings in larger chunks during lexing
author John Naylor <john.naylor@postgresql.org>
Fri, 1 Jul 2022 10:28:20 +0000 (17:28 +0700)
committer John Naylor <john.naylor@postgresql.org>
Mon, 11 Jul 2022 04:11:36 +0000 (11:11 +0700)
commit 3838fa269c15706df2b85ce2d6af8aacd5611655
tree 7c9c473754716f3a4fa661db6a133b9dac6dc309
parent a6434b951558baad8372dc4b83bf87606dac9cda

During COPY BINARY with large JSONB blobs, it was found that half
the time was spent parsing JSON, with much of that spent in separate
appendStringInfoChar() calls for each input byte.

Add a lookahead loop to json_lex_string() so that runs of multiple bytes
can be appended in one appendBinaryStringInfo() call. Use the same loop
when de-escaping is not done, to avoid code duplication.
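
The batching idea can be sketched in standalone C as follows. This is a
simplified illustration, not the code in src/common/jsonapi.c: the Buf type
and the buf_append() and deescape_chunked() functions are hypothetical
stand-ins (buf_append() plays the role of appendBinaryStringInfo()), and
\uXXXX escapes are deliberately not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's StringInfo; not the real type. */
typedef struct
{
	char	   *data;
	size_t		len;
	size_t		cap;
} Buf;

/* Append n bytes in one call, growing the buffer as needed. */
static void
buf_append(Buf *b, const char *src, size_t n)
{
	if (b->len + n + 1 > b->cap)
	{
		size_t		newcap = b->cap ? b->cap : 64;

		while (newcap < b->len + n + 1)
			newcap *= 2;
		b->data = realloc(b->data, newcap);
		if (b->data == NULL)
			abort();
		b->cap = newcap;
	}
	memcpy(b->data + b->len, src, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/*
 * De-escape a JSON string body. "s" points just past the opening quote.
 * Returns a pointer to the closing quote, or NULL on malformed input.
 * Assumption: only single-character escapes are supported here.
 */
static const char *
deescape_chunked(const char *s, const char *end, Buf *out)
{
	assert(s <= end);

	while (s < end)
	{
		const char *p = s;
		char		c;

		/* Lookahead: find the end of the run of ordinary bytes. */
		while (p < end && *p != '"' && *p != '\\' &&
			   (unsigned char) *p >= 0x20)
			p++;

		/* Copy the whole run with one append instead of one per byte. */
		if (p > s)
			buf_append(out, s, (size_t) (p - s));
		s = p;

		if (s >= end)
			return NULL;		/* unterminated string */
		if (*s == '"')
			return s;			/* closing quote */
		if (*s != '\\')
			return NULL;		/* raw control character */
		if (++s >= end)
			return NULL;		/* dangling backslash */

		switch (*s)
		{
			case '"':	c = '"';	break;
			case '\\':	c = '\\';	break;
			case '/':	c = '/';	break;
			case 'b':	c = '\b';	break;
			case 'f':	c = '\f';	break;
			case 'n':	c = '\n';	break;
			case 'r':	c = '\r';	break;
			case 't':	c = '\t';	break;
			default:	return NULL;	/* unsupported in this sketch */
		}
		buf_append(out, &c, 1);
		s++;
	}
	return NULL;
}
```

In this sketch, the number of append calls scales with the number of
escape sequences rather than with the number of input bytes, which is the
saving the commit message describes for long, mostly unescaped strings.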

Report and proof-of-concept patch by Jelte Fennema, reworked by Andres
Freund and John Naylor.

Discussion: https://www.postgresql.org/message-id/CAGECzQQuXbies_nKgSiYifZUjBk6nOf2%3DTSXqRjj2BhUh8CTeA%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/flat/PR3PR83MB0476F098CBCF68AF7A1CA89FF7B49@PR3PR83MB0476.EURPRD83.prod.outlook.com
src/common/jsonapi.c