author     Michael Paquier  2020-10-23 02:05:46 +0000
committer  Michael Paquier  2020-10-23 02:05:46 +0000
commit     783f0cc64dcc05e3d112a06b1cd181e5a1ca9099
tree       ea1b0834526609807e36d5618a8ccb14147bdb1b /src/tools
parent     7d6d6bce43c60bb7b77237e2cc6ab845646b911f

Improve performance of Unicode {de,re}composition in the backend

This replaces the existing binary search with two perfect hash functions for the composition and the decomposition in the backend code, at the cost of slightly-larger binaries there (35kB in libpgcommon_srv.a). Per the measurements done, this improves the speed of the recomposition and decomposition by up to 30~40 times for the NFC and NFKC conversions, while all other operations get at least 40% faster. This is not as "good" as what libicu has, but it closes the gap a lot as per the feedback from Daniel Verite.

The decomposition table remains the same, getting used for the binary search in the frontend code, where we care more about the size of the libraries like libpq over performance as this gets involved only in code paths related to the SCRAM authentication. In consequence, note that the perfect hash function for the recomposition needs to use a new inverse lookup array back to the existing decomposition table.

The size of all frontend deliverables remains unchanged, even with --enable-debug, including libpq.

Author: John Naylor
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/CAFBsxsHUuMFCt6-pU+oG-F1==CmEp8wR+O+bRouXWu6i8kXuqA@mail.gmail.com

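
For readers unfamiliar with the trade-off described in the message, here is a minimal sketch of the idea, not the committed code. It assumes a hypothetical four-entry table and hand-picked modulo hashes (`cp % 6` and `(base + combining) % 7`) that happen to be collision-free for these keys; the real hash functions are generated by PerfectHash.pm and look different. All names (`toy_decomp`, `toy_lookup_bsearch`, `toy_lookup_hash`, `toy_recompose`) are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-entry decomposition table, sorted by codepoint. */
typedef struct
{
	uint32_t	codepoint;	/* precomposed character */
	uint32_t	base;		/* first character of its decomposition */
	uint32_t	combining;	/* second character of its decomposition */
} toy_decomp;

static const toy_decomp toy_table[4] = {
	{0x00C0, 0x0041, 0x0300},	/* A with grave */
	{0x00C9, 0x0045, 0x0301},	/* E with acute */
	{0x00E9, 0x0065, 0x0301},	/* e with acute */
	{0x00F1, 0x006E, 0x0303},	/* n with tilde */
};

/* Frontend-style lookup: binary search, no extra tables needed. */
static const toy_decomp *
toy_lookup_bsearch(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = 4;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (toy_table[mid].codepoint == cp)
			return &toy_table[mid];
		if (toy_table[mid].codepoint < cp)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/*
 * Backend-style decomposition lookup.  For these four keys, cp % 6 is
 * collision-free (an assumption that holds only for this toy table).
 * Each slot holds an index into toy_table, or -1 if unused.
 */
static const int8_t toy_decomp_slot[6] = {0, 3, -1, 1, -1, 2};

static const toy_decomp *
toy_lookup_hash(uint32_t cp)
{
	int			idx = toy_decomp_slot[cp % 6];

	/* A perfect hash sends non-keys anywhere, so verify the entry. */
	if (idx < 0 || toy_table[idx].codepoint != cp)
		return NULL;
	return &toy_table[idx];
}

/*
 * Recomposition: hash the (base, combining) pair, then go through an
 * inverse lookup array back to the same decomposition table.
 */
static const int8_t toy_recomp_inverse[7] = {0, -1, 2, -1, -1, 1, 3};

/* Returns the composed codepoint, or 0 if the pair does not compose. */
static uint32_t
toy_recompose(uint32_t base, uint32_t combining)
{
	int			idx = toy_recomp_inverse[(base + combining) % 7];

	if (idx < 0 ||
		toy_table[idx].base != base ||
		toy_table[idx].combining != combining)
		return 0;
	return toy_table[idx].codepoint;
}
```

The sketch shows why the backend pays in binary size: both hash paths need an extra slot array on top of the shared table, whereas the binary search needs only the table itself.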
Diffstat (limited to 'src/tools')
-rw-r--r--  src/tools/pgindent/exclude_file_patterns  3
1 files changed, 2 insertions, 1 deletions
diff --git a/src/tools/pgindent/exclude_file_patterns b/src/tools/pgindent/exclude_file_patterns
index 86bdd9d6dcb..f08180b0d08 100644
--- a/src/tools/pgindent/exclude_file_patterns
+++ b/src/tools/pgindent/exclude_file_patterns
@@ -18,9 +18,10 @@ src/backend/utils/fmgrprotos\.h$
# they match pgindent style, they'd look worse not better, so exclude them.
kwlist_d\.h$
#
-# This is generated by the scripts from src/common/unicode/. It uses
+# These are generated by the scripts from src/common/unicode/. They use
# hash functions generated by PerfectHash.pm whose format looks worse with
# pgindent.
+src/include/common/unicode_norm_hashfunc\.h$
src/include/common/unicode_normprops_table\.h$
#
# Exclude ecpg test files to avoid breaking the ecpg regression tests