author Tom Lane 2022-07-28 18:34:32 +0000
committer Tom Lane 2022-07-28 18:34:32 +0000
commit e09d7a1262c659578065eaf7edafe606d2c8ebf2
tree a20bedfb3536af659716e32d756a9103954ab732 /src/backend/access
parent 70a437aa45b6dcacc2ad894f95ef5bb46b26035f
Improve speed of hash index build.
In the initial data sort, if the bucket numbers are the same then next sort on the hash value. Because index pages are kept in hash value order, this gains a little speed by allowing the eventual tuple insertions to be done sequentially, avoiding repeated data movement within PageAddItem. This seems to be good for overall speedup of 5%-9%, depending on the incoming data.

Simon Riggs, reviewed by Amit Kapila

Discussion: https://postgr.es/m/CANbhV-FG-1ZNMBuwhUF7AxxJz3u5137dYL-o6hchK1V_dMw86g@mail.gmail.com
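The ordering described in the commit message can be illustrated with a small standalone sketch. This is not the PostgreSQL code: `SketchItem`, `sketch_hashkey2bucket`, `sketch_compare`, `sketch_sort`, and the file-level mask variables are hypothetical names invented for this example, and plain `qsort` stands in for the tuplesort machinery. The bucket mapping is assumed to follow the mask-and-fold scheme that the patched comment attributes to `_hash_hashkey2bucket`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a spooled index entry: only its hash value. */
typedef struct SketchItem
{
    uint32_t hash;
} SketchItem;

/*
 * Hypothetical sort context. In the patch the masks live in struct HSpool;
 * file-level variables are used here only because qsort() has no context
 * argument.
 */
static uint32_t sketch_max_bucket;
static uint32_t sketch_high_mask;
static uint32_t sketch_low_mask;

/*
 * Map a hash value to a bucket number (assumed mask-and-fold scheme):
 * mask with high_mask, and if that bucket does not exist yet, fold it
 * back into the lower half with low_mask.
 */
static uint32_t
sketch_hashkey2bucket(uint32_t hashkey, uint32_t maxbucket,
                      uint32_t highmask, uint32_t lowmask)
{
    uint32_t bucket = hashkey & highmask;

    if (bucket > maxbucket)
        bucket &= lowmask;
    return bucket;
}

/* Primary key: bucket number. Tie-breaker: the hash value itself. */
static int
sketch_compare(const void *a, const void *b)
{
    uint32_t h1 = ((const SketchItem *) a)->hash;
    uint32_t h2 = ((const SketchItem *) b)->hash;
    uint32_t b1 = sketch_hashkey2bucket(h1, sketch_max_bucket,
                                        sketch_high_mask, sketch_low_mask);
    uint32_t b2 = sketch_hashkey2bucket(h2, sketch_max_bucket,
                                        sketch_high_mask, sketch_low_mask);

    if (b1 != b2)
        return (b1 < b2) ? -1 : 1;
    if (h1 != h2)
        return (h1 < h2) ? -1 : 1;
    return 0;
}

/* Sort items so each bucket's entries arrive in ascending hash order. */
static void
sketch_sort(SketchItem *items, size_t n, uint32_t max_bucket,
            uint32_t high_mask, uint32_t low_mask)
{
    sketch_max_bucket = max_bucket;
    sketch_high_mask = high_mask;
    sketch_low_mask = low_mask;
    qsort(items, n, sizeof(SketchItem), sketch_compare);
}
```

With three buckets (`max_bucket = 2`, `high_mask = 3`, `low_mask = 1`), the hash values 7, 6, 3, 4, 5, 0 sort to 0, 4, 3, 5, 7, 6: bucket 0 holds 0 and 4, bucket 1 holds 3, 5 and 7, and bucket 2 holds 6. Within each bucket the values ascend, so entries can be appended to a page in order rather than inserted between existing ones.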
Diffstat (limited to 'src/backend/access')
-rw-r--r-- src/backend/access/hash/hashsort.c 7
1 files changed, 4 insertions, 3 deletions
diff --git a/src/backend/access/hash/hashsort.c b/src/backend/access/hash/hashsort.c
index aa61e39f26a..19563148d05 100644
--- a/src/backend/access/hash/hashsort.c
+++ b/src/backend/access/hash/hashsort.c
@@ -42,9 +42,10 @@ struct HSpool
 	Relation	index;

 	/*
-	 * We sort the hash keys based on the buckets they belong to. Below masks
-	 * are used in _hash_hashkey2bucket to determine the bucket of given hash
-	 * key.
+	 * We sort the hash keys based on the buckets they belong to, then by the
+	 * hash values themselves, to optimize insertions onto hash pages.  The
+	 * masks below are used in _hash_hashkey2bucket to determine the bucket of
+	 * a given hash key.
 	 */
 	uint32		high_mask;
 	uint32		low_mask;