diff options
| author | Tom Lane | 2019-01-10 00:47:38 +0000 |
|---|---|---|
| committer | Tom Lane | 2019-01-10 00:47:46 +0000 |
| commit | c64d0cd5ce24a344798534f1bc5827a9199b7a6e (patch) | |
| tree | 3968456d54c3f18d07976e5a139ca60589a8fbf0 /src/include/common | |
| parent | 5d59a6c5eaff4a58322683e450e76a11d943d322 (diff) | |
Use perfect hashing, instead of binary search, for keyword lookup.
We've been speculating for a long time that hash-based keyword lookup
ought to be faster than binary search, but up to now we hadn't found
a suitable tool for generating the hash function. Joerg Sonnenberger
provided the inspiration, and sample code, to show us that rolling our
own generator wasn't a ridiculous idea. Hence, do that.
The method used here requires a lookup table of approximately 4 bytes
per keyword, but that's less than what we saved in the predecessor commit
afb0d0712, so it's not a big problem. The time savings is indeed
significant: preliminary testing suggests that the total time for raw
parsing (flex + bison phases) drops by ~20%.
Patch by me, but it owes its existence to Joerg Sonnenberger;
thanks also to John Naylor for review.
Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de
Diffstat (limited to 'src/include/common')
| -rw-r--r-- | src/include/common/kwlookup.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/src/include/common/kwlookup.h b/src/include/common/kwlookup.h index 39efb3503fc..dbff36713d6 100644 --- a/src/include/common/kwlookup.h +++ b/src/include/common/kwlookup.h @@ -14,6 +14,9 @@ #ifndef KWLOOKUP_H #define KWLOOKUP_H +/* Hash function used by ScanKeywordLookup */ +typedef int (*ScanKeywordHashFunc) (const void *key, size_t keylen); + /* * This struct contains the data needed by ScanKeywordLookup to perform a * search within a set of keywords. The contents are typically generated by @@ -23,6 +26,7 @@ typedef struct ScanKeywordList { const char *kw_string; /* all keywords in order, separated by \0 */ const uint16 *kw_offsets; /* offsets to the start of each keyword */ + ScanKeywordHashFunc hash; /* perfect hash function for keywords */ int num_keywords; /* number of keywords */ int max_kw_len; /* length of longest keyword */ } ScanKeywordList; |
