summaryrefslogtreecommitdiff
path: root/src/include/common
diff options
context:
space:
mode:
authorTom Lane2019-01-10 00:47:38 +0000
committerTom Lane2019-01-10 00:47:46 +0000
commitc64d0cd5ce24a344798534f1bc5827a9199b7a6e (patch)
tree3968456d54c3f18d07976e5a139ca60589a8fbf0 /src/include/common
parent5d59a6c5eaff4a58322683e450e76a11d943d322 (diff)
Use perfect hashing, instead of binary search, for keyword lookup.
We've been speculating for a long time that hash-based keyword lookup ought to be faster than binary search, but up to now we hadn't found a suitable tool for generating the hash function. Joerg Sonnenberger provided the inspiration, and sample code, to show us that rolling our own generator wasn't a ridiculous idea. Hence, do that. The method used here requires a lookup table of approximately 4 bytes per keyword, but that's less than what we saved in the predecessor commit afb0d0712, so it's not a big problem. The time savings is indeed significant: preliminary testing suggests that the total time for raw parsing (flex + bison phases) drops by ~20%. Patch by me, but it owes its existence to Joerg Sonnenberger; thanks also to John Naylor for review. Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de
Diffstat (limited to 'src/include/common')
-rw-r--r--src/include/common/kwlookup.h4
1 files changed, 4 insertions, 0 deletions
diff --git a/src/include/common/kwlookup.h b/src/include/common/kwlookup.h
index 39efb3503fc..dbff36713d6 100644
--- a/src/include/common/kwlookup.h
+++ b/src/include/common/kwlookup.h
@@ -14,6 +14,9 @@
#ifndef KWLOOKUP_H
#define KWLOOKUP_H
+/* Hash function used by ScanKeywordLookup */
+typedef int (*ScanKeywordHashFunc) (const void *key, size_t keylen);
+
/*
* This struct contains the data needed by ScanKeywordLookup to perform a
* search within a set of keywords. The contents are typically generated by
@@ -23,6 +26,7 @@ typedef struct ScanKeywordList
{
const char *kw_string; /* all keywords in order, separated by \0 */
const uint16 *kw_offsets; /* offsets to the start of each keyword */
+ ScanKeywordHashFunc hash; /* perfect hash function for keywords */
int num_keywords; /* number of keywords */
int max_kw_len; /* length of longest keyword */
} ScanKeywordList;