path: root/src/tools/PerfectHash.pm
Age | Commit message | Author
2023-01-02 | Update copyright for 2023 | Bruce Momjian
Backpatch-through: 11
2022-05-12 | Pre-beta mechanical code beautification. | Tom Lane
Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
2022-01-08 | Update copyright for 2022 | Bruce Momjian
Backpatch-through: 10
2021-08-12 | Speed up generation of Unicode hash functions. | John Naylor
Sets of Unicode keys are picky about the primes used when generating a perfect hash function for them. Callers can spend many seconds iterating through all the possible combinations of candidate multipliers and seeds to find one that works. Unicode updates typically happen only once a year, but it still makes development and testing of Unicode scripts unnecessarily slow. To fix, iterate over the primes in the innermost loop. This does not change any existing functions checked into the tree.
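The loop ordering described above can be sketched in C as follows. This is a simplified illustration, not the generator's code: the success test here is "no two keys share a bucket", standing in for whatever criterion PerfectHash.pm actually applies, and the names `hash_key`, `collision_free`, and `find_params`, the seed range, and the list of candidate primes are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Multiplicative string hash parameterized by a seed and a multiplier. */
static uint32_t
hash_key(const char *key, uint32_t seed, uint32_t mult)
{
    uint32_t h = seed;

    for (const unsigned char *k = (const unsigned char *) key; *k; k++)
        h = h * mult + *k;
    return h;
}

/* True if no two keys fall into the same of nslots buckets. */
static bool
collision_free(const char *const *keys, size_t nkeys, uint32_t nslots,
               uint32_t seed, uint32_t mult)
{
    assert(nslots > 0);
    for (size_t i = 0; i < nkeys; i++)
        for (size_t j = i + 1; j < nkeys; j++)
            if (hash_key(keys[i], seed, mult) % nslots ==
                hash_key(keys[j], seed, mult) % nslots)
                return false;
    return true;
}

/*
 * Search for working parameters.  The candidate primes are tried in the
 * innermost loop, so every multiplier is tested before the seed advances.
 */
static bool
find_params(const char *const *keys, size_t nkeys, uint32_t nslots,
            uint32_t *seed_out, uint32_t *mult_out)
{
    /* Illustrative candidates only; not taken from PerfectHash.pm. */
    static const uint32_t mults[] = {17, 31, 127, 8191};
    const size_t nmults = sizeof(mults) / sizeof(mults[0]);

    for (uint32_t seed = 0; seed < 10; seed++)
        for (size_t m = 0; m < nmults; m++)
            if (collision_free(keys, nkeys, nslots, seed, mults[m]))
            {
                *seed_out = seed;
                *mult_out = mults[m];
                return true;
            }
    return false;
}
```

If a key set is sensitive to the choice of prime, this ordering reaches a working prime after only a few attempts instead of first exhausting every seed value for an unsuitable one.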
2021-01-02 | Update copyright for 2021 | Bruce Momjian
Backpatch-through: 9.5
2020-10-21 | Review format of code generated by PerfectHash.pm | Michael Paquier
80f8eb7 added to the normalization quick check headers some code generated by PerfectHash.pm that is incompatible with the gitattributes settings for this repository, because whitespace followed a set of tabs for the first element of a line in the table. Instead of adding a new exception to gitattributes, rework the generated format so that right padding with spaces is used instead of left padding. This keeps the generated table readable, with its columns aligned, and makes an update of gitattributes unnecessary.
Reported-by: Peter Eisentraut
Author: John Naylor
Discussion: https://postgr.es/m/d601b3b5-a3c7-5457-2f84-3d6513d690fc@2ndquadrant.com
2020-10-08 | Improve set of candidate multipliers for perfect hash function generation | Michael Paquier
The previous set of multipliers was not well suited to large sets of short keys. The new set makes it possible to generate perfect hash functions for larger sets without affecting existing callers of those functions, as experimentation has shown. A future commit will make use of that to improve the performance of Unicode normalization. All multipliers compile to shift-and-add instructions on most platforms. This has been tested as far back as gcc 4.1 and clang 3.8.
Author: John Naylor
Reviewed-by: Mark Dilger, Michael Paquier
Discussion: https://postgr.es/m/CACPNZCt4fbJ0_bGrN5QPt34N4whv=mszM0LMVQdoa2rC9UMRXA@mail.gmail.com
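To illustrate what "compile to shift-and-add instructions" means, the sketch below writes out a few multiplications by hand as one shift plus one add or subtract. The constants are primes of the form 2^k ± 1 picked for illustration; they are not claimed to be the exact set this commit introduced, and the function names are hypothetical. A compiler would normally perform this rewrite on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Each product is one shift plus one add or subtract, modulo 2^32. */
static uint32_t mul17(uint32_t x)   { return (x << 4) + x; }   /* 16x + x   */
static uint32_t mul31(uint32_t x)   { return (x << 5) - x; }   /* 32x - x   */
static uint32_t mul127(uint32_t x)  { return (x << 7) - x; }   /* 128x - x  */
static uint32_t mul8191(uint32_t x) { return (x << 13) - x; }  /* 8192x - x */

/* Confirm the shift forms agree with ordinary multiplication for x. */
static void
check_shift_add(uint32_t x)
{
    assert(mul17(x) == 17u * x);
    assert(mul31(x) == 31u * x);
    assert(mul127(x) == 127u * x);
    assert(mul8191(x) == 8191u * x);
}
```

Because unsigned arithmetic wraps modulo 2^32, the identities hold for every input, including values where the shift overflows.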
2020-01-01 | Update copyrights for 2020 | Bruce Momjian
Backpatch-through: update all files in master, backpatch legal files through 9.4
2019-05-31 | Make our perfect hash functions be valid C++. | Tom Lane
While C is happy to cast "const void *" to "const unsigned char *" silently, C++ insists on an explicit cast. Since we put these functions into header files, cpluspluscheck whines about that. Add the cast to pacify it.
Discussion: https://postgr.es/m/b517ec3918d645eb950505eac8dd434e@gaz-is.ru
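A minimal sketch of the kind of cast involved, using a hypothetical helper `sum_bytes` rather than the generated hash function itself:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds up the bytes of a key of the given length. */
static unsigned int
sum_bytes(const void *key, size_t keylen)
{
    /*
     * C would accept "const unsigned char *k = key;" as written, but a
     * C++ compiler rejects it.  The explicit cast is valid in both.
     */
    const unsigned char *k = (const unsigned char *) key;
    unsigned int sum = 0;

    assert(key != NULL || keylen == 0);
    while (keylen--)
        sum += *k++;
    return sum;
}
```

The cast changes nothing in the C build; it only matters when the header is included from C++ code.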
2019-01-10 | Use perfect hashing, instead of binary search, for keyword lookup. | Tom Lane
We've been speculating for a long time that hash-based keyword lookup ought to be faster than binary search, but up to now we hadn't found a suitable tool for generating the hash function. Joerg Sonnenberger provided the inspiration, and sample code, to show us that rolling our own generator wasn't a ridiculous idea. Hence, do that.
The method used here requires a lookup table of approximately 4 bytes per keyword, but that's less than what we saved in the predecessor commit afb0d0712, so it's not a big problem. The time savings is indeed significant: preliminary testing suggests that the total time for raw parsing (flex + bison phases) drops by ~20%.
Patch by me, but it owes its existence to Joerg Sonnenberger; thanks also to John Naylor for review.
Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de
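The general shape of a perfect-hash keyword lookup can be sketched as below. This is a toy example with a hand-picked hash over three made-up keywords, not the table-driven function that PerfectHash.pm generates; the keyword list and the name `lookup_keyword` are assumptions for illustration. The point is that the hash selects exactly one candidate slot, so a single string comparison settles the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Keywords stored at the index their hash produces.  The toy hash is
 * "first byte modulo 3": 'o' (111) -> 0, 'a' (97) -> 1, 'n' (110) -> 2.
 */
static const char *const keywords[] = {"or", "and", "not"};

/* Returns the keyword's index, or -1 if the word is not a keyword. */
static int
lookup_keyword(const char *word)
{
    unsigned int h;

    assert(word != NULL);
    h = (unsigned char) word[0] % 3;

    /* The hash maps non-keywords to some slot too, so verify the match. */
    if (strcmp(word, keywords[h]) == 0)
        return (int) h;
    return -1;
}
```

Binary search over n keywords needs about log2(n) string comparisons per lookup, whereas this approach needs one hash computation and one comparison regardless of n.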