Revise hash join and hash aggregation code to use the same datatype-

author Tom Lane <tgl@sss.pgh.pa.us>

Sun, 22 Jun 2003 22:04:55 +0000 (22:04 +0000)

committer Tom Lane <tgl@sss.pgh.pa.us>

Sun, 22 Jun 2003 22:04:55 +0000 (22:04 +0000)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml

index a8f7190856c75fd46e208929f0d45c43a8940f46..835739d81bb27a5340511d1380ac7f9cc891bddb 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1,6 +1,6 @@
  <!--
   Documentation of the system catalogs, directed toward PostgreSQL developers
- $Header: /cvsroot/pgsql/doc/src/sgml/catalogs.sgml,v 2.71 2003/05/28 16:03:55 tgl Exp $
+ $Header: /cvsroot/pgsql/doc/src/sgml/catalogs.sgml,v 2.72 2003/06/22 22:04:54 tgl Exp $
   -->
  
  <chapter id="catalogs">
@@ -2525,7 +2525,7 @@
        <entry><structfield>oprcanhash</structfield></entry>
        <entry><type>bool</type></entry>
        <entry></entry>
-      <entry>This operator supports hash joins.</entry>
+      <entry>This operator supports hash joins</entry>
       </row>
  
       <row>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml

index f6298a0eccaa841d624454df1f2e948832d1f611..b64aa0111385c60c0a0390ecd272147b4c93ec66 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -1,5 +1,5 @@
  <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.68 2003/05/29 20:40:36 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.69 2003/06/22 22:04:54 tgl Exp $
  -->
  
   <sect1 id="xfunc">
@@ -1442,11 +1442,10 @@ concat_text(PG_FUNCTION_ARGS)
        <listitem>
         <para>
          Always zero the bytes of your structures using
-        <function>memset</function> or <function>bzero</function>.
-        Several routines (such as the hash access method, hash joins,
-        and the sort algorithm) compute functions of the raw bits
-        contained in your structure.  Even if you initialize all
-        fields of your structure, there may be several bytes of
+   <function>memset</function>.  Without this, it's difficult to
+   support hash indexes or hash joins, as you must pick out only
+   the significant bits of your data structure to compute a hash.
+        Even if you initialize all fields of your structure, there may be
          alignment padding (holes in the structure) that may contain
          garbage values.
         </para>
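The padding advice in the paragraph above can be illustrated with a small standalone sketch. It is not PostgreSQL code: `PaddedKey`, `bytes_hash` (a plain FNV-1a loop standing in for a raw-bytes hash such as `hash_any`) and `make_and_hash` are hypothetical names introduced only for this example. The sketch assumes the usual compiler behaviour in which storing to a member leaves the already-zeroed padding bytes untouched.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type: most ABIs insert padding bytes between 'tag' and 'value'. */
typedef struct
{
    char        tag;
    double      value;
} PaddedKey;

/* FNV-1a over raw bytes; a stand-in for a byte-oriented hash, not hash_any itself. */
static uint32_t
bytes_hash(const unsigned char *p, size_t len)
{
    uint32_t    h = 2166136261u;
    size_t      i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Build a key on top of memory pre-filled with 'garbage' (simulating stale
 * stack or heap contents), zero the whole structure, then set the fields
 * and hash every byte of it.
 */
static uint32_t
make_and_hash(unsigned char garbage, char tag, double value)
{
    PaddedKey   k;

    memset(&k, garbage, sizeof(k));     /* simulate leftover contents */
    memset(&k, 0, sizeof(k));           /* zero all bytes, padding included */
    k.tag = tag;
    k.value = value;
    return bytes_hash((const unsigned char *) &k, sizeof(k));
}
```

Because every byte is zeroed before the fields are assigned, two keys with equal fields should hash identically no matter what the memory held beforehand. Dropping the second `memset` would let the leftover padding bytes leak into the hash.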
diff --git a/doc/src/sgml/xoper.sgml b/doc/src/sgml/xoper.sgml

index 22d214623baefa829bcf73742203b732a2b953b1..a2705eb663601d35398000d1a303bd02a1e65120 100644
--- a/doc/src/sgml/xoper.sgml
+++ b/doc/src/sgml/xoper.sgml
@@ -1,5 +1,5 @@
  <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/xoper.sgml,v 1.23 2003/04/10 01:22:45 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/xoper.sgml,v 1.24 2003/06/22 22:04:54 tgl Exp $
  -->
  
   <sect1 id="xoper">
@@ -315,46 +315,34 @@ table1.column1 OP table2.column2
       same hash code.  If two values get put in different hash buckets, the
       join will never compare them at all, implicitly assuming that the
       result of the join operator must be false.  So it never makes sense
-     to specify <literal>HASHES</literal> for operators that do not represent equality.
+     to specify <literal>HASHES</literal> for operators that do not represent
+     equality.
      </para>
  
      <para>
-     In fact, logical equality is not good enough either; the operator
-     had better represent pure bitwise equality, because the hash
-     function will be computed on the memory representation of the
-     values regardless of what the bits mean.  For example, the
-     polygon operator <literal>~=</literal>, which checks whether two
-     polygons are the same, is not bitwise equality, because two
-     polygons can be considered the same even if their vertices are
-     specified in a different order.  What this means is that a join
-     using <literal>~=</literal> between polygon fields would yield
-     different results if implemented as a hash join than if
-     implemented another way, because a large fraction of the pairs
-     that should match will hash to different values and will never be
-     compared by the hash join.  But if the optimizer chooses to use a
-     different kind of join, all the pairs that the operator
-     <literal>~=</literal> says are the same will be found.  We don't
-     want that kind of inconsistency, so we don't mark the polygon
-     operator <literal>~=</literal> as hashable.
+     To be marked <literal>HASHES</literal>, the join operator must appear
+     in a hash index operator class.  This is not enforced when you create
+     the operator, since of course the referencing operator class couldn't
+     exist yet.  But attempts to use the operator in hash joins will fail
+     at runtime if no such operator class exists.  The system needs the
+     operator class to find the datatype-specific hash function for the
+     operator's input datatype.  Of course, you must also supply a suitable
+     hash function before you can create the operator class.
      </para>
  
      <para>
-     There are also machine-dependent ways in which a hash join might fail
-     to do the right thing.  For example, if your data type
-     is a structure in which there may be uninteresting pad bits, it's unsafe
-     to mark the equality operator <literal>HASHES</>.  (Unless you write
-     your other operators and functions to ensure that the unused bits are always zero, which is the recommended strategy.)
-     Another example is that the floating-point data types are unsafe for hash
-     joins.  On machines that meet the <acronym>IEEE</> floating-point standard, negative
-     zero and positive zero are different values (different bit patterns) but
-     they are defined to compare equal.  So, if the equality operator on floating-point data types were marked
-     <literal>HASHES</>, a negative zero and a positive zero would probably not be matched up
-     by a hash join, but they would be matched up by any other join process.
-    </para>
-
-    <para>
-     The bottom line is that you should probably only use <literal>HASHES</literal> for
-     equality operators that are (or could be) implemented by <function>memcmp()</function>.
+     Care should be exercised when preparing a hash function, because there
+     are machine-dependent ways in which it might fail to do the right thing.
+     For example, if your data type is a structure in which there may be
+     uninteresting pad bits, you can't simply pass the whole structure to
+     <function>hash_any</>.  (Unless you write your other operators and
+     functions to ensure that the unused bits are always zero, which is the
+     recommended strategy.)
+     Another example is that on machines that meet the <acronym>IEEE</>
+     floating-point standard, negative zero and positive zero are different
+     values (different bit patterns) but they are defined to compare equal.
+     If a float value might contain negative zero then extra steps are needed
+     to ensure it generates the same hash value as positive zero.
      </para>
  
      <note>
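The negative-zero case described in the new documentation text can be sketched outside the backend as follows. This is a minimal illustration, not the patched `hashfloat8`: `bytes_hash` is a hypothetical FNV-1a stand-in for `hash_any`, and `hash_double` is a name chosen only for this example. It mirrors the approach the commit takes in hashfunc.c, which returns a fixed hash value whenever the key compares equal to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over raw bytes; a stand-in for a byte-oriented hash, not hash_any itself. */
static uint32_t
bytes_hash(const unsigned char *p, size_t len)
{
    uint32_t    h = 2166136261u;
    size_t      i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Hash a double so that values comparing equal hash equal.  On IEEE-float
 * machines 0.0 and -0.0 have different bit patterns but compare equal, so
 * both are mapped to one fixed hash value before the raw bytes are hashed.
 */
static uint32_t
hash_double(double key)
{
    if (key == 0.0)             /* true for both 0.0 and -0.0 */
        return 0;
    return bytes_hash((const unsigned char *) &key, sizeof(key));
}
```

Without the early return, the two zeros would be hashed from different bit patterns, and an equality join could miss pairs that the `=` operator considers equal.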
diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c

index e6595de0727a2c0946f1bd25ac9ac1a0682ed3b8..a82b8b32d551c27f432250546f64a5e6a0a8c86b 100644
--- a/src/backend/access/hash/hashfunc.c
+++ b/src/backend/access/hash/hashfunc.c
@@ -8,7 +8,7 @@
   *
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/access/hash/hashfunc.c,v 1.35 2002/09/04 20:31:09 momjian Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/access/hash/hashfunc.c,v 1.36 2003/06/22 22:04:54 tgl Exp $
   *
   * NOTES
   *   These functions are stored in pg_amproc.  For each operator class
@@ -22,6 +22,7 @@
  #include "access/hash.h"
  
  
+/* Note: this is used for both "char" and boolean datatypes */
  Datum
  hashchar(PG_FUNCTION_ARGS)
  {
@@ -58,6 +59,14 @@ hashfloat4(PG_FUNCTION_ARGS)
  {
     float4      key = PG_GETARG_FLOAT4(0);
  
+   /*
+    * On IEEE-float machines, minus zero and zero have different bit patterns
+    * but should compare as equal.  We must ensure that they have the same
+    * hash value, which is most easily done this way:
+    */
+   if (key == (float4) 0)
+       PG_RETURN_UINT32(0);
+
     return hash_any((unsigned char *) &key, sizeof(key));
  }
  
@@ -66,6 +75,14 @@ hashfloat8(PG_FUNCTION_ARGS)
  {
     float8      key = PG_GETARG_FLOAT8(0);
  
+   /*
+    * On IEEE-float machines, minus zero and zero have different bit patterns
+    * but should compare as equal.  We must ensure that they have the same
+    * hash value, which is most easily done this way:
+    */
+   if (key == (float8) 0)
+       PG_RETURN_UINT32(0);
+
     return hash_any((unsigned char *) &key, sizeof(key));
  }
  
@@ -77,11 +94,6 @@ hashoidvector(PG_FUNCTION_ARGS)
     return hash_any((unsigned char *) key, INDEX_MAX_KEYS * sizeof(Oid));
  }
  
-/*
- * Note: hashint2vector currently can't be used as a user hash table
- * hash function, because it has no pg_proc entry. We only need it
- * for catcache indexing.
- */
  Datum
  hashint2vector(PG_FUNCTION_ARGS)
  {
@@ -102,6 +114,26 @@ hashname(PG_FUNCTION_ARGS)
     return hash_any((unsigned char *) key, keylen);
  }
  
+Datum
+hashtext(PG_FUNCTION_ARGS)
+{
+   text       *key = PG_GETARG_TEXT_P(0);
+   Datum       result;
+
+   /*
+    * Note: this is currently identical in behavior to hashvarlena,
+    * but it seems likely that we may need to do something different
+    * in non-C locales.  (See also hashbpchar, if so.)
+    */
+   result = hash_any((unsigned char *) VARDATA(key),
+                     VARSIZE(key) - VARHDRSZ);
+
+   /* Avoid leaking memory for toasted inputs */
+   PG_FREE_IF_COPY(key, 0);
+
+   return result;
+}
+
  /*
   * hashvarlena() can be used for any varlena datatype in which there are
   * no non-significant bits, ie, distinct bitpatterns never compare as equal.
diff --git a/src/backend/executor/execGrouping.c b/src/backend/executor/execGrouping.c

index 0d4d5ed20f38896769bc4a4ff5c6cdb73ddcc8ab..b41c75e926c9fff08c5e11a8c369b0e93c4cb8e5 100644
--- a/src/backend/executor/execGrouping.c
+++ b/src/backend/executor/execGrouping.c
@@ -8,7 +8,7 @@
   *
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/executor/execGrouping.c,v 1.2 2003/01/12 04:03:34 tgl Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/executor/execGrouping.c,v 1.3 2003/06/22 22:04:54 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -19,6 +19,8 @@
  #include "executor/executor.h"
  #include "parser/parse_oper.h"
  #include "utils/memutils.h"
+#include "utils/lsyscache.h"
+#include "utils/syscache.h"
  
  
  /*****************************************************************************
@@ -213,76 +215,46 @@ execTuplesMatchPrepare(TupleDesc tupdesc,
     return eqfunctions;
  }
  
-
-/*****************************************************************************
- *     Utility routines for hashing
- *****************************************************************************/
-
  /*
- * ComputeHashFunc
+ * execTuplesHashPrepare
+ *     Look up the equality and hashing functions needed for a TupleHashTable.
   *
- *     the hash function for hash joins (also used for hash aggregation)
- *
- *     XXX this probably ought to be replaced with datatype-specific
- *     hash functions, such as those already implemented for hash indexes.
+ * This is similar to execTuplesMatchPrepare, but we also need to find the
+ * hash functions associated with the equality operators.  *eqfunctions and
+ * *hashfunctions receive the palloc'd result arrays.
   */
-uint32
-ComputeHashFunc(Datum key, int typLen, bool byVal)
+void
+execTuplesHashPrepare(TupleDesc tupdesc,
+                     int numCols,
+                     AttrNumber *matchColIdx,
+                     FmgrInfo **eqfunctions,
+                     FmgrInfo **hashfunctions)
  {
-   unsigned char *k;
+   int         i;
  
-   if (byVal)
-   {
-       /*
-        * If it's a by-value data type, just hash the whole Datum value.
-        * This assumes that datatypes narrower than Datum are
-        * consistently padded (either zero-extended or sign-extended, but
-        * not random bits) to fill Datum; see the XXXGetDatum macros in
-        * postgres.h. NOTE: it would not work to do hash_any(&key, len)
-        * since this would get the wrong bytes on a big-endian machine.
-        */
-       k = (unsigned char *) &key;
-       typLen = sizeof(Datum);
-   }
-   else
+   *eqfunctions = (FmgrInfo *) palloc(numCols * sizeof(FmgrInfo));
+   *hashfunctions = (FmgrInfo *) palloc(numCols * sizeof(FmgrInfo));
+
+   for (i = 0; i < numCols; i++)
     {
-       if (typLen > 0)
-       {
-           /* fixed-width pass-by-reference type */
-           k = (unsigned char *) DatumGetPointer(key);
-       }
-       else if (typLen == -1)
-       {
-           /*
-            * It's a varlena type, so 'key' points to a "struct varlena".
-            * NOTE: VARSIZE returns the "real" data length plus the
-            * sizeof the "vl_len" attribute of varlena (the length
-            * information). 'key' points to the beginning of the varlena
-            * struct, so we have to use "VARDATA" to find the beginning
-            * of the "real" data.  Also, we have to be careful to detoast
-            * the datum if it's toasted.  (We don't worry about freeing
-            * the detoasted copy; that happens for free when the
-            * per-tuple memory context is reset in ExecHashGetBucket.)
-            */
-           struct varlena *vkey = PG_DETOAST_DATUM(key);
-
-           typLen = VARSIZE(vkey) - VARHDRSZ;
-           k = (unsigned char *) VARDATA(vkey);
-       }
-       else if (typLen == -2)
-       {
-           /* It's a null-terminated C string */
-           typLen = strlen(DatumGetCString(key)) + 1;
-           k = (unsigned char *) DatumGetPointer(key);
-       }
-       else
-       {
-           elog(ERROR, "ComputeHashFunc: Invalid typLen %d", typLen);
-           k = NULL;           /* keep compiler quiet */
-       }
+       AttrNumber  att = matchColIdx[i];
+       Oid         typid = tupdesc->attrs[att - 1]->atttypid;
+       Operator    optup;
+       Oid         eq_opr;
+       Oid         eq_function;
+       Oid         hash_function;
+
+       optup = equality_oper(typid, false);
+       eq_opr = oprid(optup);
+       eq_function = oprfuncid(optup);
+       ReleaseSysCache(optup);
+       hash_function = get_op_hash_function(eq_opr);
+       if (!OidIsValid(hash_function))
+           elog(ERROR, "Could not find hash function for hash operator %u",
+                eq_opr);
+       fmgr_info(eq_function, &(*eqfunctions)[i]);
+       fmgr_info(hash_function, &(*hashfunctions)[i]);
     }
-
-   return DatumGetUInt32(hash_any(k, typLen));
  }
  
  
@@ -299,19 +271,21 @@ ComputeHashFunc(Datum key, int typLen, bool byVal)
   *
   * numCols, keyColIdx: identify the tuple fields to use as lookup key
   * eqfunctions: equality comparison functions to use
+ * hashfunctions: datatype-specific hashing functions to use
   * nbuckets: number of buckets to make
   * entrysize: size of each entry (at least sizeof(TupleHashEntryData))
   * tablecxt: memory context in which to store table and table entries
   * tempcxt: short-lived context for evaluation hash and comparison functions
   *
- * The eqfunctions array may be made with execTuplesMatchPrepare().
+ * The function arrays may be made with execTuplesHashPrepare().
   *
- * Note that keyColIdx and eqfunctions must be allocated in storage that
- * will live as long as the hashtable does.
+ * Note that keyColIdx, eqfunctions, and hashfunctions must be allocated in
+ * storage that will live as long as the hashtable does.
   */
  TupleHashTable
  BuildTupleHashTable(int numCols, AttrNumber *keyColIdx,
                     FmgrInfo *eqfunctions,
+                   FmgrInfo *hashfunctions,
                     int nbuckets, Size entrysize,
                     MemoryContext tablecxt, MemoryContext tempcxt)
  {
@@ -328,6 +302,7 @@ BuildTupleHashTable(int numCols, AttrNumber *keyColIdx,
     hashtable->numCols = numCols;
     hashtable->keyColIdx = keyColIdx;
     hashtable->eqfunctions = eqfunctions;
+   hashtable->hashfunctions = hashfunctions;
     hashtable->tablecxt = tablecxt;
     hashtable->tempcxt = tempcxt;
     hashtable->entrysize = entrysize;
@@ -375,11 +350,15 @@ LookupTupleHashEntry(TupleHashTable hashtable, TupleTableSlot *slot,
         hashkey = (hashkey << 1) | ((hashkey & 0x80000000) ? 1 : 0);
  
         attr = heap_getattr(tuple, att, tupdesc, &isNull);
-       if (isNull)
-           continue;           /* treat nulls as having hash key 0 */
-       hashkey ^= ComputeHashFunc(attr,
-                                  (int) tupdesc->attrs[att - 1]->attlen,
-                                  tupdesc->attrs[att - 1]->attbyval);
+
+       if (!isNull)            /* treat nulls as having hash key 0 */
+       {
+           uint32      hkey;
+
+           hkey = DatumGetUInt32(FunctionCall1(&hashtable->hashfunctions[i],
+                                               attr));
+           hashkey ^= hkey;
+       }
     }
     bucketno = hashkey % (uint32) hashtable->nbuckets;
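The loop in this hunk combines one hash code per key column: it rotates the accumulated key left by one bit, then XORs in the column's datatype-specific hash, with nulls contributing zero. A standalone sketch of that combining step is below. `ColumnHashFn`, `hash_int32` and `combine_column_hashes` are hypothetical names for illustration. The real code calls through `FmgrInfo` with `FunctionCall1`, and `hash_int32` is an arbitrary multiplicative mixer, not PostgreSQL's `hashint4`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-column hash callback; stands in for an FmgrInfo lookup. */
typedef uint32_t (*ColumnHashFn) (const void *value);

/* Illustrative integer hash (multiplicative mixing), not PostgreSQL's hashint4. */
static uint32_t
hash_int32(const void *value)
{
    return (uint32_t) (*(const int32_t *) value) * 2654435761u;
}

/*
 * Combine per-column hash codes: rotate the running key left one bit at
 * each step, then XOR in the column's hash.  Null columns are skipped,
 * which treats them as having hash key 0.
 */
static uint32_t
combine_column_hashes(int ncols,
                      const void *const *values,
                      const bool *isnull,
                      const ColumnHashFn *hashfns)
{
    uint32_t    hashkey = 0;
    int         i;

    for (i = 0; i < ncols; i++)
    {
        hashkey = (hashkey << 1) | ((hashkey & 0x80000000u) ? 1 : 0);

        if (!isnull[i])
            hashkey ^= hashfns[i] (values[i]);
    }
    return hashkey;
}
```

The rotation makes the result depend on column order, so the key pair (7, 9) does not generally collide with (9, 7) the way a plain XOR of the two column hashes would.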
  
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c

index f2499cb4e5ec4152837801d2918ed937afa3d8d4..d0dd6b31c997b9c8fb7f462a7e20894dd723d60b 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -45,7 +45,7 @@
   * Portions Copyright (c) 1994, Regents of the University of California
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/executor/nodeAgg.c,v 1.106 2003/06/06 15:04:01 tgl Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/executor/nodeAgg.c,v 1.107 2003/06/22 22:04:54 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -582,6 +582,7 @@ build_hash_table(AggState *aggstate)
     aggstate->hashtable = BuildTupleHashTable(node->numCols,
                                               node->grpColIdx,
                                               aggstate->eqfunctions,
+                                             aggstate->hashfunctions,
                                               node->numGroups,
                                               entrysize,
                                               aggstate->aggcontext,
@@ -1035,6 +1036,7 @@ ExecInitAgg(Agg *node, EState *estate)
     aggstate->aggs = NIL;
     aggstate->numaggs = 0;
     aggstate->eqfunctions = NULL;
+   aggstate->hashfunctions = NULL;
     aggstate->peragg = NULL;
     aggstate->agg_done = false;
     aggstate->pergroup = NULL;
@@ -1123,14 +1125,23 @@ ExecInitAgg(Agg *node, EState *estate)
     }
  
     /*
-    * If we are grouping, precompute fmgr lookup data for inner loop
+    * If we are grouping, precompute fmgr lookup data for inner loop.
+    * We need both equality and hashing functions to do it by hashing,
+    * but only equality if not hashing.
      */
     if (node->numCols > 0)
     {
-       aggstate->eqfunctions =
-           execTuplesMatchPrepare(ExecGetScanType(&aggstate->ss),
-                                  node->numCols,
-                                  node->grpColIdx);
+       if (node->aggstrategy == AGG_HASHED)
+           execTuplesHashPrepare(ExecGetScanType(&aggstate->ss),
+                                 node->numCols,
+                                 node->grpColIdx,
+                                 &aggstate->eqfunctions,
+                                 &aggstate->hashfunctions);
+       else
+           aggstate->eqfunctions =
+               execTuplesMatchPrepare(ExecGetScanType(&aggstate->ss),
+                                      node->numCols,
+                                      node->grpColIdx);
     }
  
     /*
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c

index b338c8961e2a20b43cd35a9dfc47b0befd86ac70..f00cc28684d0ddd713f3df538491149bb33a7247 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -8,7 +8,7 @@
   *
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/executor/nodeHash.c,v 1.75 2003/03/27 16:51:27 momjian Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/executor/nodeHash.c,v 1.76 2003/06/22 22:04:54 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -192,7 +192,7 @@ ExecEndHash(HashState *node)
   * ----------------------------------------------------------------
   */
  HashJoinTable
-ExecHashTableCreate(Hash *node)
+ExecHashTableCreate(Hash *node, List *hashOperators)
  {
     HashJoinTable hashtable;
     Plan       *outerNode;
@@ -201,7 +201,7 @@ ExecHashTableCreate(Hash *node)
     int         nbatch;
     int         nkeys;
     int         i;
-   List       *hk;
+   List       *ho;
     MemoryContext oldcxt;
  
     /*
@@ -237,17 +237,20 @@ ExecHashTableCreate(Hash *node)
     hashtable->outerBatchSize = NULL;
  
     /*
-    * Get info about the datatypes of the hash keys.
+    * Get info about the hash functions to be used for each hash key.
      */
-   nkeys = length(node->hashkeys);
-   hashtable->typLens = (int16 *) palloc(nkeys * sizeof(int16));
-   hashtable->typByVals = (bool *) palloc(nkeys * sizeof(bool));
+   nkeys = length(hashOperators);
+   hashtable->hashfunctions = (FmgrInfo *) palloc(nkeys * sizeof(FmgrInfo));
     i = 0;
-   foreach(hk, node->hashkeys)
+   foreach(ho, hashOperators)
     {
-       get_typlenbyval(exprType(lfirst(hk)),
-                       &hashtable->typLens[i],
-                       &hashtable->typByVals[i]);
+       Oid     hashfn;
+
+       hashfn = get_op_hash_function(lfirsto(ho));
+       if (!OidIsValid(hashfn))
+           elog(ERROR, "Could not find hash function for hash operator %u",
+                lfirsto(ho));
+       fmgr_info(hashfn, &hashtable->hashfunctions[i]);
         i++;
     }
  
@@ -520,7 +523,7 @@ ExecHashGetBucket(HashJoinTable hashtable,
  
     /*
      * We reset the eval context each time to reclaim any memory leaked in
-    * the hashkey expressions or ComputeHashFunc itself.
+    * the hashkey expressions.
      */
     ResetExprContext(econtext);
  
@@ -545,9 +548,11 @@ ExecHashGetBucket(HashJoinTable hashtable,
          */
         if (!isNull)            /* treat nulls as having hash key 0 */
         {
-           hashkey ^= ComputeHashFunc(keyval,
-                                      (int) hashtable->typLens[i],
-                                      hashtable->typByVals[i]);
+           uint32      hkey;
+
+           hkey = DatumGetUInt32(FunctionCall1(&hashtable->hashfunctions[i],
+                                               keyval));
+           hashkey ^= hkey;
         }
  
         i++;
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c

index 17585b2f0fc4e4269faabe59c74d6368b1a10a80..9a0071f018069b925930ef4892b6dc7a8117e6ef 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -8,7 +8,7 @@
   *
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/executor/nodeHashjoin.c,v 1.51 2003/05/30 20:23:10 tgl Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/executor/nodeHashjoin.c,v 1.52 2003/06/22 22:04:54 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -117,7 +117,8 @@ ExecHashJoin(HashJoinState *node)
          * create the hash table
          */
         Assert(hashtable == NULL);
-       hashtable = ExecHashTableCreate((Hash *) hashNode->ps.plan);
+       hashtable = ExecHashTableCreate((Hash *) hashNode->ps.plan,
+                                       node->hj_HashOperators);
         node->hj_HashTable = hashtable;
  
         /*
@@ -305,6 +306,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate)
     Plan       *outerNode;
     Hash       *hashNode;
     List       *hclauses;
+   List       *hoperators;
     List       *hcl;
  
     /*
@@ -406,8 +408,9 @@ ExecInitHashJoin(HashJoin *node, EState *estate)
  
     /*
      * The planner already made a list of the inner hashkeys for us,
-    * but we also need a list of the outer hashkeys.  Each list of
-    * exprs must then be prepared for execution.
+    * but we also need a list of the outer hashkeys, as well as a list
+    * of the hash operator OIDs.  Both lists of exprs must then be prepared
+    * for execution.
      */
     hjstate->hj_InnerHashKeys = (List *)
         ExecInitExpr((Expr *) hashNode->hashkeys,
@@ -416,13 +419,19 @@ ExecInitHashJoin(HashJoin *node, EState *estate)
         hjstate->hj_InnerHashKeys;
  
     hclauses = NIL;
+   hoperators = NIL;
     foreach(hcl, node->hashclauses)
     {
-       hclauses = lappend(hclauses, get_leftop(lfirst(hcl)));
+       OpExpr     *hclause = (OpExpr *) lfirst(hcl);
+
+       Assert(IsA(hclause, OpExpr));
+       hclauses = lappend(hclauses, get_leftop((Expr *) hclause));
+       hoperators = lappendo(hoperators, hclause->opno);
     }
     hjstate->hj_OuterHashKeys = (List *)
         ExecInitExpr((Expr *) hclauses,
                      (PlanState *) hjstate);
+   hjstate->hj_HashOperators = hoperators;
  
     hjstate->js.ps.ps_OuterTupleSlot = NULL;
     hjstate->js.ps.ps_TupFromTlist = false;
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c

index ff5d03faf8c6693f3a6a34576905891b51031c50..82502c985e8e659c46ef01521f80647549958b7f 100644
--- a/src/backend/executor/nodeSubplan.c
+++ b/src/backend/executor/nodeSubplan.c
@@ -7,7 +7,7 @@
   * Portions Copyright (c) 1994, Regents of the University of California
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/executor/nodeSubplan.c,v 1.46 2003/06/06 15:04:01 tgl Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/executor/nodeSubplan.c,v 1.47 2003/06/22 22:04:54 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -519,6 +519,7 @@ buildSubPlanHash(SubPlanState *node)
     node->hashtable = BuildTupleHashTable(ncols,
                                           node->keyColIdx,
                                           node->eqfunctions,
+                                         node->hashfunctions,
                                           nbuckets,
                                           sizeof(TupleHashEntryData),
                                           node->tablecxt,
@@ -537,6 +538,7 @@ buildSubPlanHash(SubPlanState *node)
         node->hashnulls = BuildTupleHashTable(ncols,
                                               node->keyColIdx,
                                               node->eqfunctions,
+                                             node->hashfunctions,
                                               nbuckets,
                                               sizeof(TupleHashEntryData),
                                               node->tablecxt,
@@ -700,6 +702,7 @@ ExecInitSubPlan(SubPlanState *node, EState *estate)
     node->innerecontext = NULL;
     node->keyColIdx = NULL;
     node->eqfunctions = NULL;
+   node->hashfunctions = NULL;
  
     /*
      * create an EState for the subplan
@@ -797,11 +800,12 @@ ExecInitSubPlan(SubPlanState *node, EState *estate)
          * ExecTypeFromTL).
          *
          * We also extract the combining operators themselves to initialize
-        * the equality functions for the hash tables.
+        * the equality and hashing functions for the hash tables.
          */
         lefttlist = righttlist = NIL;
         leftptlist = rightptlist = NIL;
         node->eqfunctions = (FmgrInfo *) palloc(ncols * sizeof(FmgrInfo));
+       node->hashfunctions = (FmgrInfo *) palloc(ncols * sizeof(FmgrInfo));
         i = 1;
         foreach(lexpr, node->exprs)
         {
@@ -811,6 +815,7 @@ ExecInitSubPlan(SubPlanState *node, EState *estate)
             Expr       *expr;
             TargetEntry *tle;
             GenericExprState *tlestate;
+           Oid         hashfn;
  
             Assert(IsA(fstate, FuncExprState));
             Assert(IsA(opexpr, OpExpr));
@@ -850,6 +855,13 @@ ExecInitSubPlan(SubPlanState *node, EState *estate)
             fmgr_info(opexpr->opfuncid, &node->eqfunctions[i-1]);
             node->eqfunctions[i-1].fn_expr = (Node *) opexpr;
  
+           /* Lookup the associated hash function */
+           hashfn = get_op_hash_function(opexpr->opno);
+           if (!OidIsValid(hashfn))
+               elog(ERROR, "Could not find hash function for hash operator %u",
+                    opexpr->opno);
+           fmgr_info(hashfn, &node->hashfunctions[i-1]);
+
             i++;
         }
  
diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c

index 5085c6025c2aef05112531a0db10ad45a0500e39..6e99c15e36952a293e3730e66546066b179a237a 100644
--- a/src/backend/utils/adt/varchar.c
+++ b/src/backend/utils/adt/varchar.c
@@ -8,7 +8,7 @@
   *
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/utils/adt/varchar.c,v 1.97 2003/05/26 00:11:27 tgl Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/utils/adt/varchar.c,v 1.98 2003/06/22 22:04:54 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -704,6 +704,8 @@ bpcharcmp(PG_FUNCTION_ARGS)
  /*
   * bpchar needs a specialized hash function because we want to ignore
   * trailing blanks in comparisons.
+ *
+ * XXX is there any need for locale-specific behavior here?
   */
  Datum
  hashbpchar(PG_FUNCTION_ARGS)
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c

index 65d716f6ba5bc2c92e2d96c733efcd632188d20e..8c0df3dfc2ca8a835ca930615bbbe18e31885c8f 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -8,7 +8,7 @@
   *
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/utils/cache/catcache.c,v 1.103 2003/05/27 17:49:46 momjian Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/utils/cache/catcache.c,v 1.104 2003/06/22 22:04:54 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -81,22 +81,6 @@
  /* Cache management header --- pointer is NULL until created */
  static CatCacheHeader *CacheHdr = NULL;
  
-/*
- *     EQPROC is used in CatalogCacheInitializeCache to find the equality
- *     functions for system types that are used as cache key fields.
- *     See also GetCCHashFunc, which should support the same set of types.
- *
- *     XXX this should be replaced by catalog lookups,
- *     but that seems to pose considerable risk of circularity...
- */
-static const Oid eqproc[] = {
-   F_BOOLEQ, InvalidOid, F_CHAREQ, F_NAMEEQ, InvalidOid,
-   F_INT2EQ, F_INT2VECTOREQ, F_INT4EQ, F_OIDEQ, F_TEXTEQ,
-   F_OIDEQ, InvalidOid, InvalidOid, InvalidOid, F_OIDVECTOREQ
-};
-
-#define EQPROC(SYSTEMTYPEOID)  eqproc[(SYSTEMTYPEOID)-BOOLOID]
-
  
  static uint32 CatalogCacheComputeHashValue(CatCache *cache, int nkeys,
                              ScanKey cur_skey);
@@ -119,24 +103,46 @@ static HeapTuple build_dummy_tuple(CatCache *cache, int nkeys, ScanKey skeys);
   *                 internal support functions
   */
  
-static PGFunction
-GetCCHashFunc(Oid keytype)
+/*
+ * Look up the hash and equality functions for system types that are used
+ * as cache key fields.
+ *
+ * XXX this should be replaced by catalog lookups,
+ * but that seems to pose considerable risk of circularity...
+ */
+static void
+GetCCHashEqFuncs(Oid keytype, PGFunction *hashfunc, RegProcedure *eqfunc)
  {
     switch (keytype)
     {
         case BOOLOID:
+           *hashfunc = hashchar;
+           *eqfunc = F_BOOLEQ;
+           break;
         case CHAROID:
-           return hashchar;
+           *hashfunc = hashchar;
+           *eqfunc = F_CHAREQ;
+           break;
         case NAMEOID:
-           return hashname;
+           *hashfunc = hashname;
+           *eqfunc = F_NAMEEQ;
+           break;
         case INT2OID:
-           return hashint2;
+           *hashfunc = hashint2;
+           *eqfunc = F_INT2EQ;
+           break;
         case INT2VECTOROID:
-           return hashint2vector;
+           *hashfunc = hashint2vector;
+           *eqfunc = F_INT2VECTOREQ;
+           break;
         case INT4OID:
-           return hashint4;
+           *hashfunc = hashint4;
+           *eqfunc = F_INT4EQ;
+           break;
         case TEXTOID:
-           return hashvarlena;
+           *hashfunc = hashtext;
+           *eqfunc = F_TEXTEQ;
+           break;
         case OIDOID:
         case REGPROCOID:
         case REGPROCEDUREOID:
@@ -144,13 +150,17 @@ GetCCHashFunc(Oid keytype)
         case REGOPERATOROID:
         case REGCLASSOID:
         case REGTYPEOID:
-           return hashoid;
+           *hashfunc = hashoid;
+           *eqfunc = F_OIDEQ;
+           break;
         case OIDVECTOROID:
-           return hashoidvector;
+           *hashfunc = hashoidvector;
+           *eqfunc = F_OIDVECTOREQ;
+           break;
         default:
-           elog(FATAL, "GetCCHashFunc: type %u unsupported as catcache key",
+           elog(FATAL, "GetCCHashEqFuncs: type %u unsupported as catcache key",
                  keytype);
-           return (PGFunction) NULL;
+           break;
     }
  }
  
@@ -941,16 +951,16 @@ CatalogCacheInitializeCache(CatCache *cache)
             keytype = OIDOID;
         }
  
-       cache->cc_hashfunc[i] = GetCCHashFunc(keytype);
+       GetCCHashEqFuncs(keytype,
+                        &cache->cc_hashfunc[i],
+                        &cache->cc_skey[i].sk_procedure);
  
         cache->cc_isname[i] = (keytype == NAMEOID);
  
         /*
-        * If GetCCHashFunc liked the type, safe to index into eqproc[]
+        * Do equality-function lookup (we assume this won't need a catalog
+        * lookup for any supported type)
          */
-       cache->cc_skey[i].sk_procedure = EQPROC(keytype);
-
-       /* Do function lookup */
         fmgr_info_cxt(cache->cc_skey[i].sk_procedure,
                       &cache->cc_skey[i].sk_func,
                       CacheMemoryContext);
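The catcache.c hunks above replace a parallel `eqproc[]` array, indexed by type OID offset, with a single switch that hands back the hash function and the equality function together. A minimal stand-alone sketch of that pattern follows. It is not the PostgreSQL code: `TypeId`, `HashFn`, `EqFn`, the `TYPE_*` constants, and the toy hash functions are all hypothetical stand-ins, and the sketch returns 0 for an unsupported type where the real function reports a FATAL error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for Oid, PGFunction and RegProcedure. */
typedef unsigned int TypeId;
typedef uint32_t (*HashFn)(const void *key);
typedef int (*EqFn)(const void *a, const void *b);

enum { TYPE_INT4 = 1, TYPE_NAME = 2 };

/* Toy multiplicative hash of a 4-byte integer; illustrative only. */
static uint32_t hash_int4(const void *key)
{
    uint32_t v;
    memcpy(&v, key, sizeof v);
    return v * 2654435761u;
}

static int eq_int4(const void *a, const void *b)
{
    return memcmp(a, b, sizeof(int32_t)) == 0;
}

/* FNV-1a over a NUL-terminated string; illustrative only. */
static uint32_t hash_name(const void *key)
{
    const unsigned char *s = key;
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= *s++;
        h *= 16777619u;
    }
    return h;
}

static int eq_name(const void *a, const void *b)
{
    return strcmp(a, b) == 0;
}

/*
 * One switch yields both functions, so the set of hashable types and the
 * set of types with a known equality function cannot drift apart.
 * Returns 1 on success, 0 (with both outputs set to NULL) otherwise.
 */
int get_hash_eq_funcs(TypeId type, HashFn *hashfunc, EqFn *eqfunc)
{
    assert(hashfunc != NULL && eqfunc != NULL);
    switch (type) {
        case TYPE_INT4:
            *hashfunc = hash_int4;
            *eqfunc = eq_int4;
            return 1;
        case TYPE_NAME:
            *hashfunc = hash_name;
            *eqfunc = eq_name;
            return 1;
        default:
            *hashfunc = NULL;
            *eqfunc = NULL;
            return 0;
    }
}
```

Keeping both assignments in the same case arm is the point of the design: the removed comment had to warn that `GetCCHashFunc` "should support the same set of types" as `EQPROC`, and the merged function makes that true by construction.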
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c

index fcd9dc2f59bbd19528a88da6979f03b57d4d81f1..095c5e6a8aad5732476f335ed858a4d4df50408f 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -7,7 +7,7 @@
   * Portions Copyright (c) 1994, Regents of the University of California
   *
   * IDENTIFICATION
- *   $Header: /cvsroot/pgsql/src/backend/utils/cache/lsyscache.c,v 1.95 2003/05/26 00:11:27 tgl Exp $
+ *   $Header: /cvsroot/pgsql/src/backend/utils/cache/lsyscache.c,v 1.96 2003/06/22 22:04:54 tgl Exp $
   *
   * NOTES
   *   Eventually, the index information should go through here, too.
@@ -16,8 +16,10 @@
  #include "postgres.h"
  #include "miscadmin.h"
  
+#include "access/hash.h"
  #include "access/tupmacs.h"
  #include "catalog/pg_amop.h"
+#include "catalog/pg_amproc.h"
  #include "catalog/pg_namespace.h"
  #include "catalog/pg_opclass.h"
  #include "catalog/pg_operator.h"
@@ -28,6 +30,7 @@
  #include "nodes/makefuncs.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
+#include "utils/catcache.h"
  #include "utils/datum.h"
  #include "utils/lsyscache.h"
  #include "utils/syscache.h"
@@ -106,6 +109,72 @@ get_opclass_member(Oid opclass, int16 strategy)
     return result;
  }
  
+/*
+ * get_op_hash_function
+ *     Get the OID of the datatype-specific hash function associated with
+ *     a hashable equality operator.
+ *
+ * Returns InvalidOid if no hash function can be found.  (This indicates
+ * that the operator should not have been marked oprcanhash.)
+ */
+Oid
+get_op_hash_function(Oid opno)
+{
+   CatCList   *catlist;
+   int         i;
+   HeapTuple   tuple;
+   Oid         opclass = InvalidOid;
+
+   /*
+    * Search pg_amop to see if the target operator is registered as the "="
+    * operator of any hash opclass.  If the operator is registered in
+    * multiple opclasses, assume we can use the associated hash function
+    * from any one.
+    */
+   catlist = SearchSysCacheList(AMOPOPID, 1,
+                                ObjectIdGetDatum(opno),
+                                0, 0, 0);
+
+   for (i = 0; i < catlist->n_members; i++)
+   {
+       Form_pg_amop aform;
+
+       tuple = &catlist->members[i]->tuple;
+       aform = (Form_pg_amop) GETSTRUCT(tuple);
+
+       if (aform->amopstrategy == HTEqualStrategyNumber &&
+           opclass_is_hash(aform->amopclaid))
+       {
+           opclass = aform->amopclaid;
+           break;
+       }
+   }
+
+   ReleaseSysCacheList(catlist);
+
+   if (OidIsValid(opclass))
+   {
+       /* Found a suitable opclass, get its hash support function */
+       tuple = SearchSysCache(AMPROCNUM,
+                              ObjectIdGetDatum(opclass),
+                              Int16GetDatum(HASHPROC),
+                              0, 0);
+       if (HeapTupleIsValid(tuple))
+       {
+           Form_pg_amproc aform = (Form_pg_amproc) GETSTRUCT(tuple);
+           RegProcedure result;
+
+           result = aform->amproc;
+           ReleaseSysCache(tuple);
+           Assert(RegProcedureIsValid(result));
+           return result;
+       }
+   }
+
+   /* Didn't find a match... */
+   return InvalidOid;
+}
+
  
  /*             ---------- ATTRIBUTE CACHES ----------                   */
  
@@ -284,6 +353,31 @@ opclass_is_btree(Oid opclass)
     return result;
  }
  
+/*
+ * opclass_is_hash
+ *
+ *     Returns TRUE iff the specified opclass is associated with the
+ *     hash index access method.
+ */
+bool
+opclass_is_hash(Oid opclass)
+{
+   HeapTuple   tp;
+   Form_pg_opclass cla_tup;
+   bool        result;
+
+   tp = SearchSysCache(CLAOID,
+                       ObjectIdGetDatum(opclass),
+                       0, 0, 0);
+   if (!HeapTupleIsValid(tp))
+       elog(ERROR, "cache lookup failed for opclass %u", opclass);
+   cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
+
+   result = (cla_tup->opcamid == HASH_AM_OID);
+   ReleaseSysCache(tp);
+   return result;
+}
+
  /*             ---------- OPERATOR CACHE ----------                     */
  
  /*
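The new `get_op_hash_function` in the lsyscache.c hunks above does a two-step lookup: find a hash opclass in which the operator is the equality member, then fetch that opclass's hash support function. The sketch below mimics that flow over small in-memory tables instead of the syscache. The struct layouts, the `_sketch` function names, and the btree row with opclass 9001 are hypothetical; the hash opclass, operator, and function OIDs are copied from the pg_opclass.h, pg_amop.h, and pg_amproc.h hunks of this commit.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;

#define InvalidOid    0u
#define HASH_AM       405u  /* HASH_AM_OID, as defined in pg_am.h here */
#define BTREE_AM      403u  /* BTREE_AM_OID */
#define EQ_STRATEGY   1     /* strategy number used by the hash amop rows */
#define HASH_PROCNUM  1     /* support-procedure number in the amproc rows */

/* Simplified, hypothetical row layouts for the three catalogs involved. */
struct opclass_row { Oid oid; Oid am; };
struct amop_row    { Oid opclass; int strategy; Oid op; };
struct amproc_row  { Oid opclass; int procnum; Oid proc; };

static const struct opclass_row opclasses[] = {
    {2222, HASH_AM},   /* bool_ops */
    {2225, HASH_AM},   /* xid_ops */
    {9001, BTREE_AM},  /* made-up btree opclass, for contrast */
};

static const struct amop_row amops[] = {
    {9001, 3, 91},     /* made-up btree entry for operator 91 */
    {2222, 1, 91},     /* bool "=" */
    {2225, 1, 352},    /* xid "=" */
};

static const struct amproc_row amprocs[] = {
    {2222, 1, 454},    /* hashchar */
    {2225, 1, 450},
};

#define LEN(a) (sizeof(a) / sizeof((a)[0]))

/* True iff the opclass belongs to the hash access method. */
static int opclass_is_hash_sketch(Oid opclass)
{
    for (size_t i = 0; i < LEN(opclasses); i++)
        if (opclasses[i].oid == opclass)
            return opclasses[i].am == HASH_AM;
    return 0;
}

/* Returns the hash function OID for an operator, or InvalidOid. */
Oid op_hash_function_sketch(Oid opno)
{
    Oid opclass = InvalidOid;

    /* Step 1: find any hash opclass where opno is the equality operator. */
    for (size_t i = 0; i < LEN(amops); i++) {
        if (amops[i].op == opno &&
            amops[i].strategy == EQ_STRATEGY &&
            opclass_is_hash_sketch(amops[i].opclass)) {
            opclass = amops[i].opclass;
            break;
        }
    }
    if (opclass == InvalidOid)
        return InvalidOid;

    /* Step 2: fetch that opclass's hash support function. */
    for (size_t i = 0; i < LEN(amprocs); i++)
        if (amprocs[i].opclass == opclass &&
            amprocs[i].procnum == HASH_PROCNUM)
            return amprocs[i].proc;

    return InvalidOid;
}
```

With these tables, operator 91 resolves to function 454 and operator 352 to 450, while operator 387 (the `tid` equality operator, which this commit marks as not hashable) has no hash opclass entry and yields `InvalidOid`.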
diff --git a/src/include/access/hash.h b/src/include/access/hash.h

index 1fadd8c9e37187f62e66ccc479e05025dd03b186..5834f9218896a6ee5b6ce2bfd948e839b974b2c5 100644
--- a/src/include/access/hash.h
+++ b/src/include/access/hash.h
@@ -7,7 +7,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: hash.h,v 1.47 2002/06/20 20:29:42 momjian Exp $
+ * $Id: hash.h,v 1.48 2003/06/22 22:04:54 tgl Exp $
   *
   * NOTES
   *     modeled after Margo Seltzer's hash implementation for unix.
@@ -251,9 +251,11 @@ extern Datum hashbulkdelete(PG_FUNCTION_ARGS);
  /*
   * Datatype-specific hash functions in hashfunc.c.
   *
+ * These support both hash indexes and hash joins.
+ *
   * NOTE: some of these are also used by catcache operations, without
   * any direct connection to hash indexes.  Also, the common hash_any
- * routine is also used by dynahash tables and hash joins.
+ * routine is also used by dynahash tables.
   */
  extern Datum hashchar(PG_FUNCTION_ARGS);
  extern Datum hashint2(PG_FUNCTION_ARGS);
@@ -265,6 +267,7 @@ extern Datum hashfloat8(PG_FUNCTION_ARGS);
  extern Datum hashoidvector(PG_FUNCTION_ARGS);
  extern Datum hashint2vector(PG_FUNCTION_ARGS);
  extern Datum hashname(PG_FUNCTION_ARGS);
+extern Datum hashtext(PG_FUNCTION_ARGS);
  extern Datum hashvarlena(PG_FUNCTION_ARGS);
  extern Datum hash_any(register const unsigned char *k, register int keylen);
  
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h

index 209bd5ff2424f1d3f6240fb9b62272c956d4df4d..f6c3855bb410e6fd8c2690c40ac2f0b984507cf1 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -37,7 +37,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: catversion.h,v 1.198 2003/06/06 15:04:02 tgl Exp $
+ * $Id: catversion.h,v 1.199 2003/06/22 22:04:55 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -53,6 +53,6 @@
   */
  
  /*                         yyyymmddN */
-#define CATALOG_VERSION_NO 200306051
+#define CATALOG_VERSION_NO 200306221
  
  #endif
diff --git a/src/include/catalog/pg_am.h b/src/include/catalog/pg_am.h

index 0807fb1a398cfad682e710253d3a74ef01a3727c..5186d9ef5eb26bc7cd8c96192d86febb35a1a243 100644
--- a/src/include/catalog/pg_am.h
+++ b/src/include/catalog/pg_am.h
@@ -8,7 +8,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: pg_am.h,v 1.25 2003/03/10 22:28:19 tgl Exp $
+ * $Id: pg_am.h,v 1.26 2003/06/22 22:04:55 tgl Exp $
   *
   * NOTES
   *     the genbki.sh script reads this file and generates .bki
@@ -107,6 +107,7 @@ DESCR("b-tree index access method");
  #define BTREE_AM_OID 403
  DATA(insert OID = 405 (  hash  PGUID   1 1 0 f f f t hashgettuple hashinsert hashbeginscan hashrescan hashendscan hashmarkpos hashrestrpos hashbuild hashbulkdelete - hashcostestimate ));
  DESCR("hash index access method");
+#define HASH_AM_OID 405
  DATA(insert OID = 783 (  gist  PGUID 100 7 0 f t f f gistgettuple gistinsert gistbeginscan gistrescan gistendscan gistmarkpos gistrestrpos gistbuild gistbulkdelete - gistcostestimate ));
  DESCR("GiST index access method");
  #define GIST_AM_OID 783
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h

index dbff38b3c392bc3fda7db669da54a32773979341..e2cb7560e175e678610421420d70cc12a95e57c5 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -16,7 +16,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: pg_amop.h,v 1.49 2003/05/26 00:11:27 tgl Exp $
+ * $Id: pg_amop.h,v 1.50 2003/06/22 22:04:55 tgl Exp $
   *
   * NOTES
   *  the genbki.sh script reads this file and generates .bki
@@ -465,5 +465,27 @@ DATA(insert (  2001 1 f 1550 ));
  DATA(insert (  2004 1 f   98 ));
  /* timestamp_ops */
  DATA(insert (  2040 1 f 2060 ));
+/* bool_ops */
+DATA(insert (  2222 1 f   91 ));
+/* bytea_ops */
+DATA(insert (  2223 1 f 1955 ));
+/* int2vector_ops */
+DATA(insert (  2224 1 f  386 ));
+/* xid_ops */
+DATA(insert (  2225 1 f  352 ));
+/* cid_ops */
+DATA(insert (  2226 1 f  385 ));
+/* abstime_ops */
+DATA(insert (  2227 1 f  560 ));
+/* reltime_ops */
+DATA(insert (  2228 1 f  566 ));
+/* text_pattern_ops */
+DATA(insert (  2229 1 f 2316 ));
+/* varchar_pattern_ops */
+DATA(insert (  2230 1 f 2316 ));
+/* bpchar_pattern_ops */
+DATA(insert (  2231 1 f 2328 ));
+/* name_pattern_ops */
+DATA(insert (  2232 1 f 2334 ));
  
  #endif   /* PG_AMOP_H */
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h

index 0048d000cdf90f18bd97c75234c9f5f50b4d50be..c2c37c136132ddc4d0f3b0d8556070a9cc3bd7df 100644
--- a/src/include/catalog/pg_amproc.h
+++ b/src/include/catalog/pg_amproc.h
@@ -14,7 +14,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: pg_amproc.h,v 1.37 2003/05/26 00:11:27 tgl Exp $
+ * $Id: pg_amproc.h,v 1.38 2003/06/22 22:04:55 tgl Exp $
   *
   * NOTES
   *   the genbki.sh script reads this file and generates .bki
@@ -127,11 +127,22 @@ DATA(insert ( 1985 1  399 ));
  DATA(insert (  1987 1  455 ));
  DATA(insert (  1990 1  453 ));
  DATA(insert (  1992 1  457 ));
-DATA(insert (  1995 1  456 ));
+DATA(insert (  1995 1  400 ));
  DATA(insert (  1997 1  452 ));
  DATA(insert (  1999 1  452 ));
  DATA(insert (  2001 1 1696 ));
-DATA(insert (  2004 1  456 ));
+DATA(insert (  2004 1  400 ));
  DATA(insert (  2040 1  452 ));
+DATA(insert (  2222 1  454 ));
+DATA(insert (  2223 1  456 ));
+DATA(insert (  2224 1  398 ));
+DATA(insert (  2225 1  450 ));
+DATA(insert (  2226 1  450 ));
+DATA(insert (  2227 1  450 ));
+DATA(insert (  2228 1  450 ));
+DATA(insert (  2229 1  456 ));
+DATA(insert (  2230 1  456 ));
+DATA(insert (  2231 1  456 ));
+DATA(insert (  2232 1  455 ));
  
  #endif   /* PG_AMPROC_H */
diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h

index f9de8fa28c89bbce600b6bee153d919c95f95e35..1c8844b2fece39c1b08f5ae3a945c3d01a5844a3 100644
--- a/src/include/catalog/pg_opclass.h
+++ b/src/include/catalog/pg_opclass.h
@@ -26,7 +26,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: pg_opclass.h,v 1.50 2003/05/28 16:04:00 tgl Exp $
+ * $Id: pg_opclass.h,v 1.51 2003/06/22 22:04:55 tgl Exp $
   *
   * NOTES
   *   the genbki.sh script reads this file and generates .bki
@@ -155,5 +155,16 @@ DATA(insert OID = 2097 (   403     bpchar_pattern_ops  PGNSP PGUID 1042 f 0 ));
  #define BPCHAR_PATTERN_BTREE_OPS_OID 2097
  DATA(insert OID = 2098 (   403     name_pattern_ops    PGNSP PGUID   19 f 0 ));
  #define NAME_PATTERN_BTREE_OPS_OID 2098
+DATA(insert OID = 2222 (   405     bool_ops        PGNSP PGUID   16 t 0 ));
+DATA(insert OID = 2223 (   405     bytea_ops       PGNSP PGUID   17 t 0 ));
+DATA(insert OID = 2224 (   405     int2vector_ops  PGNSP PGUID   22 t 0 ));
+DATA(insert OID = 2225 (   405     xid_ops         PGNSP PGUID   28 t 0 ));
+DATA(insert OID = 2226 (   405     cid_ops         PGNSP PGUID   29 t 0 ));
+DATA(insert OID = 2227 (   405     abstime_ops     PGNSP PGUID  702 t 0 ));
+DATA(insert OID = 2228 (   405     reltime_ops     PGNSP PGUID  703 t 0 ));
+DATA(insert OID = 2229 (   405     text_pattern_ops    PGNSP PGUID   25 f 0 ));
+DATA(insert OID = 2230 (   405     varchar_pattern_ops PGNSP PGUID 1043 f 0 ));
+DATA(insert OID = 2231 (   405     bpchar_pattern_ops  PGNSP PGUID 1042 f 0 ));
+DATA(insert OID = 2232 (   405     name_pattern_ops    PGNSP PGUID   19 f 0 ));
  
  #endif   /* PG_OPCLASS_H */
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h

index ae4fb6e04bb978bbad0b2f2ca5cd9ac6b100f268..88d3d998a05529cb14111f7afc8fb585b9328430 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -8,7 +8,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: pg_operator.h,v 1.114 2003/05/26 00:11:27 tgl Exp $
+ * $Id: pg_operator.h,v 1.115 2003/06/22 22:04:55 tgl Exp $
   *
   * NOTES
   *   the genbki.sh script reads this file and generates .bki
@@ -122,10 +122,10 @@ DATA(insert OID = 374 (  "||"    PGNSP PGUID b f 2283 2277 2277   0 0 0 0 0 0 ar
  DATA(insert OID = 375 (  "||"     PGNSP PGUID b f 2277 2277 2277   0 0 0 0 0 0 array_cat      -       -     ));
  
  DATA(insert OID = 352 (  "="      PGNSP PGUID b t  28  28  16 352   0   0   0   0   0 xideq eqsel eqjoinsel ));
-DATA(insert OID = 353 (  "="      PGNSP PGUID b t  28  23  16   0   0   0   0   0   0 xideqint4 eqsel eqjoinsel ));
+DATA(insert OID = 353 (  "="      PGNSP PGUID b f  28  23  16   0   0   0   0   0   0 xideqint4 eqsel eqjoinsel ));
  DATA(insert OID = 385 (  "="      PGNSP PGUID b t  29  29  16 385   0   0   0   0   0 cideq eqsel eqjoinsel ));
  DATA(insert OID = 386 (  "="      PGNSP PGUID b t  22  22  16 386   0   0   0   0   0 int2vectoreq eqsel eqjoinsel ));
-DATA(insert OID = 387 (  "="      PGNSP PGUID b t  27  27  16 387   0   0   0   0   0 tideq eqsel eqjoinsel ));
+DATA(insert OID = 387 (  "="      PGNSP PGUID b f  27  27  16 387   0   0   0   0   0 tideq eqsel eqjoinsel ));
  #define TIDEqualOperator   387
  DATA(insert OID = 388 (  "!"      PGNSP PGUID r f  20   0  20   0   0   0   0   0   0 int8fac - - ));
  DATA(insert OID = 389 (  "!!"     PGNSP PGUID l f   0  20  20   0   0   0   0   0   0 int8fac - - ));
@@ -287,7 +287,7 @@ DATA(insert OID = 616 (  "<->"     PGNSP PGUID b f 601 628 701   0   0  0  0   0   0
  DATA(insert OID = 617 (  "<->"    PGNSP PGUID b f 601 603 701   0   0  0  0   0   0 dist_sb - - ));
  DATA(insert OID = 618 (  "<->"    PGNSP PGUID b f 600 602 701   0   0  0  0   0   0 dist_ppath - - ));
  
-DATA(insert OID = 620 (  "="      PGNSP PGUID b f  700  700    16 620 621  622 622 622 623 float4eq eqsel eqjoinsel ));
+DATA(insert OID = 620 (  "="      PGNSP PGUID b t  700  700    16 620 621  622 622 622 623 float4eq eqsel eqjoinsel ));
  DATA(insert OID = 621 (  "<>"     PGNSP PGUID b f  700  700    16 621 620  0 0   0   0 float4ne neqsel neqjoinsel ));
  DATA(insert OID = 622 (  "<"      PGNSP PGUID b f  700  700    16 623 625  0 0   0   0 float4lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 623 (  ">"      PGNSP PGUID b f  700  700    16 622 624  0 0   0   0 float4gt scalargtsel scalargtjoinsel ));
@@ -325,7 +325,7 @@ DATA(insert OID = 665 (  "<="      PGNSP PGUID b f  25  25  16 667 666  0 0 0 0 text_l
  DATA(insert OID = 666 (  ">"      PGNSP PGUID b f  25  25  16 664 665  0 0 0 0 text_gt scalargtsel scalargtjoinsel ));
  DATA(insert OID = 667 (  ">="     PGNSP PGUID b f  25  25  16 665 664  0 0 0 0 text_ge scalargtsel scalargtjoinsel ));
  
-DATA(insert OID = 670 (  "="      PGNSP PGUID b f  701  701    16 670 671 672 672 672 674 float8eq eqsel eqjoinsel ));
+DATA(insert OID = 670 (  "="      PGNSP PGUID b t  701  701    16 670 671 672 672 672 674 float8eq eqsel eqjoinsel ));
  DATA(insert OID = 671 (  "<>"     PGNSP PGUID b f  701  701    16 671 670  0 0   0   0 float8ne neqsel neqjoinsel ));
  DATA(insert OID = 672 (  "<"      PGNSP PGUID b f  701  701    16 674 675  0 0   0   0 float8lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 673 (  "<="     PGNSP PGUID b f  701  701    16 675 674  0 0   0   0 float8le scalarltsel scalarltjoinsel ));
@@ -403,7 +403,7 @@ DATA(insert OID = 843 (  "*"       PGNSP PGUID b f  790  700    790 845   0   0   0   0
  DATA(insert OID = 844 (  "/"      PGNSP PGUID b f  790  700    790   0   0   0   0   0   0 cash_div_flt4 - - ));
  DATA(insert OID = 845 (  "*"      PGNSP PGUID b f  700  790    790 843   0   0   0   0   0 flt4_mul_cash - - ));
  
-DATA(insert OID = 900 (  "="      PGNSP PGUID b t  790  790    16 900 901  902 902 902 903 cash_eq eqsel eqjoinsel ));
+DATA(insert OID = 900 (  "="      PGNSP PGUID b f  790  790    16 900 901  902 902 902 903 cash_eq eqsel eqjoinsel ));
  DATA(insert OID = 901 (  "<>"     PGNSP PGUID b f  790  790    16 901 900  0 0   0   0 cash_ne neqsel neqjoinsel ));
  DATA(insert OID = 902 (  "<"      PGNSP PGUID b f  790  790    16 903 905  0 0   0   0 cash_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 903 (  ">"      PGNSP PGUID b f  790  790    16 902 904  0 0   0   0 cash_gt scalargtsel scalargtjoinsel ));
@@ -431,7 +431,7 @@ DATA(insert OID =  969 (  "@@"     PGNSP PGUID l f  0  601  600    0  0 0 0 0 0 lse
  DATA(insert OID =  970 (  "@@"    PGNSP PGUID l f  0  602  600    0  0 0 0 0 0 path_center - - ));
  DATA(insert OID =  971 (  "@@"    PGNSP PGUID l f  0  604  600    0  0 0 0 0 0 poly_center - - ));
  
-DATA(insert OID = 1054 ( "="      PGNSP PGUID b f 1042 1042     16 1054 1057 1058 1058 1058 1060 bpchareq eqsel eqjoinsel ));
+DATA(insert OID = 1054 ( "="      PGNSP PGUID b t 1042 1042     16 1054 1057 1058 1058 1058 1060 bpchareq eqsel eqjoinsel ));
  DATA(insert OID = 1055 ( "~"      PGNSP PGUID b f 1042 25   16    0 1056  0 0 0 0 bpcharregexeq regexeqsel regexeqjoinsel ));
  #define OID_BPCHAR_REGEXEQ_OP      1055
  DATA(insert OID = 1056 ( "!~"     PGNSP PGUID b f 1042 25   16    0 1055  0 0 0 0 bpcharregexne regexnesel regexnejoinsel ));
@@ -455,7 +455,7 @@ DATA(insert OID = 1100 ( "+"       PGNSP PGUID b f  1082      23 1082 0 0 0 0 0 0 date_
  DATA(insert OID = 1101 ( "-"      PGNSP PGUID b f  1082      23 1082 0 0 0 0 0 0 date_mii - - ));
  
  /* time operators */
-DATA(insert OID = 1108 ( "="      PGNSP PGUID b f  1083    1083  16 1108 1109 1110 1110 1110 1112 time_eq eqsel eqjoinsel ));
+DATA(insert OID = 1108 ( "="      PGNSP PGUID b t  1083    1083  16 1108 1109 1110 1110 1110 1112 time_eq eqsel eqjoinsel ));
  DATA(insert OID = 1109 ( "<>"     PGNSP PGUID b f  1083    1083  16 1109 1108  0 0   0   0 time_ne neqsel neqjoinsel ));
  DATA(insert OID = 1110 ( "<"      PGNSP PGUID b f  1083    1083  16 1112 1113  0 0   0   0 time_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 1111 ( "<="     PGNSP PGUID b f  1083    1083  16 1113 1112  0 0   0   0 time_le scalarltsel scalarltjoinsel ));
@@ -465,7 +465,7 @@ DATA(insert OID = 1269 (  "-"      PGNSP PGUID b f  1186 1083 1083 0 0 0 0 0 0 inte
  
  /* timetz operators */
  DATA(insert OID = 1295 (  "-"     PGNSP PGUID b f  1186 1266 1266 0 0 0 0 0 0 interval_mi_timetz - - ));
-DATA(insert OID = 1550 ( "="      PGNSP PGUID b f  1266 1266   16 1550 1551 1552 1552 1552 1554 timetz_eq eqsel eqjoinsel ));
+DATA(insert OID = 1550 ( "="      PGNSP PGUID b t  1266 1266   16 1550 1551 1552 1552 1552 1554 timetz_eq eqsel eqjoinsel ));
  DATA(insert OID = 1551 ( "<>"     PGNSP PGUID b f  1266 1266   16 1551 1550    0 0   0   0 timetz_ne neqsel neqjoinsel ));
  DATA(insert OID = 1552 ( "<"      PGNSP PGUID b f  1266 1266   16 1554 1555    0 0   0   0 timetz_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 1553 ( "<="     PGNSP PGUID b f  1266 1266   16 1555 1554    0 0   0   0 timetz_le scalarltsel scalarltjoinsel ));
@@ -522,7 +522,7 @@ DATA(insert OID = 1234 (  "~*"      PGNSP PGUID b f  1042  25  16 0 1235    0 0   0   0
  DATA(insert OID = 1235 ( "!~*"     PGNSP PGUID b f  1042  25  16 0 1234    0 0   0   0 bpcharicregexne icregexnesel icregexnejoinsel ));
  
  /* timestamptz operators */
-DATA(insert OID = 1320 (  "="     PGNSP PGUID b f 1184 1184     16 1320 1321 1322 1322 1322 1324 timestamptz_eq eqsel eqjoinsel ));
+DATA(insert OID = 1320 (  "="     PGNSP PGUID b t 1184 1184     16 1320 1321 1322 1322 1322 1324 timestamptz_eq eqsel eqjoinsel ));
  DATA(insert OID = 1321 (  "<>"    PGNSP PGUID b f 1184 1184     16 1321 1320 0 0 0 0 timestamptz_ne neqsel neqjoinsel ));
  DATA(insert OID = 1322 (  "<"     PGNSP PGUID b f 1184 1184     16 1324 1325 0 0 0 0 timestamptz_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 1323 (  "<="    PGNSP PGUID b f 1184 1184     16 1325 1324 0 0 0 0 timestamptz_le scalarltsel scalarltjoinsel ));
@@ -533,7 +533,7 @@ DATA(insert OID = 1328 (  "-"      PGNSP PGUID b f 1184 1184 1186    0  0 0 0 0 0 tim
  DATA(insert OID = 1329 (  "-"     PGNSP PGUID b f 1184 1186 1184    0  0 0 0 0 0 timestamptz_mi_span - - ));
  
  /* interval operators */
-DATA(insert OID = 1330 (  "="     PGNSP PGUID b f 1186 1186     16 1330 1331 1332 1332 1332 1334 interval_eq eqsel eqjoinsel ));
+DATA(insert OID = 1330 (  "="     PGNSP PGUID b t 1186 1186     16 1330 1331 1332 1332 1332 1334 interval_eq eqsel eqjoinsel ));
  DATA(insert OID = 1331 (  "<>"    PGNSP PGUID b f 1186 1186     16 1331 1330 0 0 0 0 interval_ne neqsel neqjoinsel ));
  DATA(insert OID = 1332 (  "<"     PGNSP PGUID b f 1186 1186     16 1334 1335 0 0 0 0 interval_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 1333 (  "<="    PGNSP PGUID b f 1186 1186     16 1335 1334 0 0 0 0 interval_le scalarltsel scalarltjoinsel ));
@@ -630,7 +630,7 @@ DATA(insert OID = 1616 (  "="     PGNSP PGUID b f  628  628 16 1616  0 0 0 0 0 line
  DATA(insert OID = 1617 (  "#"    PGNSP PGUID b f  628  628  600 1617  0 0 0 0 0 line_interpt - - ));
  
  /* MAC type */
-DATA(insert OID = 1220 (  "="     PGNSP PGUID b f 829 829   16 1220 1221 1222 1222 1222 1224 macaddr_eq eqsel eqjoinsel ));
+DATA(insert OID = 1220 (  "="     PGNSP PGUID b t 829 829   16 1220 1221 1222 1222 1222 1224 macaddr_eq eqsel eqjoinsel ));
  DATA(insert OID = 1221 (  "<>"    PGNSP PGUID b f 829 829   16 1221 1220    0    0   0   0 macaddr_ne neqsel neqjoinsel ));
  DATA(insert OID = 1222 (  "<"     PGNSP PGUID b f 829 829   16 1224 1225    0    0   0   0 macaddr_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 1223 (  "<="    PGNSP PGUID b f 829 829   16 1225 1224    0    0   0   0 macaddr_le scalarltsel scalarltjoinsel ));
@@ -638,7 +638,7 @@ DATA(insert OID = 1224 (  ">"      PGNSP PGUID b f 829 829   16 1222 1223    0    0
  DATA(insert OID = 1225 (  ">="    PGNSP PGUID b f 829 829   16 1223 1222    0    0   0   0 macaddr_ge scalargtsel scalargtjoinsel ));
  
  /* INET type */
-DATA(insert OID = 1201 (  "="     PGNSP PGUID b f 869 869   16 1201 1202 1203 1203 1203 1205 network_eq eqsel eqjoinsel ));
+DATA(insert OID = 1201 (  "="     PGNSP PGUID b t 869 869   16 1201 1202 1203 1203 1203 1205 network_eq eqsel eqjoinsel ));
  DATA(insert OID = 1202 (  "<>"    PGNSP PGUID b f 869 869   16 1202 1201    0    0   0   0 network_ne neqsel neqjoinsel ));
  DATA(insert OID = 1203 (  "<"     PGNSP PGUID b f 869 869   16 1205 1206    0    0   0   0 network_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 1204 (  "<="    PGNSP PGUID b f 869 869   16 1206 1205    0    0   0   0 network_le scalarltsel scalarltjoinsel ));
@@ -654,7 +654,7 @@ DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f 869 869   16 932     0    0    0   0
  #define OID_INET_SUPEQ_OP              934
  
  /* CIDR type */
-DATA(insert OID = 820 (  "="      PGNSP PGUID b f 650 650   16 820 821 822 822 822 824 network_eq eqsel eqjoinsel ));
+DATA(insert OID = 820 (  "="      PGNSP PGUID b t 650 650   16 820 821 822 822 822 824 network_eq eqsel eqjoinsel ));
  DATA(insert OID = 821 (  "<>"     PGNSP PGUID b f 650 650   16 821 820   0   0   0   0 network_ne neqsel neqjoinsel ));
  DATA(insert OID = 822 (  "<"      PGNSP PGUID b f 650 650   16 824 825   0   0   0   0 network_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 823 (  "<="     PGNSP PGUID b f 650 650   16 825 824   0   0   0   0 network_le scalarltsel scalarltjoinsel ));
@@ -778,7 +778,7 @@ DATA(insert OID = 2017 (  "!~~"    PGNSP PGUID b f 17 17    16 0    2016 0    0   0   0
  DATA(insert OID = 2018 (  "||"    PGNSP PGUID b f 17 17    17 0    0    0    0   0   0 byteacat - - ));
  
  /* timestamp operators */
-DATA(insert OID = 2060 (  "="     PGNSP PGUID b f 1114 1114     16 2060 2061 2062 2062 2062 2064 timestamp_eq eqsel eqjoinsel ));
+DATA(insert OID = 2060 (  "="     PGNSP PGUID b t 1114 1114     16 2060 2061 2062 2062 2062 2064 timestamp_eq eqsel eqjoinsel ));
  DATA(insert OID = 2061 (  "<>"    PGNSP PGUID b f 1114 1114     16 2061 2060 0 0 0 0 timestamp_ne neqsel neqjoinsel ));
  DATA(insert OID = 2062 (  "<"     PGNSP PGUID b f 1114 1114     16 2064 2065 0 0 0 0 timestamp_lt scalarltsel scalarltjoinsel ));
  DATA(insert OID = 2063 (  "<="    PGNSP PGUID b f 1114 1114     16 2065 2064 0 0 0 0 timestamp_le scalarltsel scalarltjoinsel ));
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h

index d8ce41c6a8e661b9c1b754e9ff1e87658002fdfc..6cf727cc64b6ea6b88dcf6f83ca580b41c68a885 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -7,7 +7,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: pg_proc.h,v 1.303 2003/06/11 09:23:55 petere Exp $
+ * $Id: pg_proc.h,v 1.304 2003/06/22 22:04:55 tgl Exp $
   *
   * NOTES
   *   The script catalog/genbki.sh reads this file and generates .bki
@@ -836,10 +836,14 @@ DATA(insert OID = 454 (  hashchar        PGNSP PGUID 12 f f t f i 1 23 "18"  hashch
  DESCR("hash");
  DATA(insert OID = 455 (  hashname         PGNSP PGUID 12 f f t f i 1 23 "19"  hashname - _null_ ));
  DESCR("hash");
+DATA(insert OID = 400 (  hashtext         PGNSP PGUID 12 f f t f i 1 23 "25" hashtext - _null_ ));
+DESCR("hash");
  DATA(insert OID = 456 (  hashvarlena      PGNSP PGUID 12 f f t f i 1 23 "2281" hashvarlena - _null_ ));
  DESCR("hash any varlena type");
  DATA(insert OID = 457 (  hashoidvector    PGNSP PGUID 12 f f t f i 1 23 "30"  hashoidvector - _null_ ));
  DESCR("hash");
+DATA(insert OID = 398 (  hashint2vector       PGNSP PGUID 12 f f t f i 1 23 "22"  hashint2vector - _null_ ));
+DESCR("hash");
  DATA(insert OID = 399 (  hashmacaddr      PGNSP PGUID 12 f f t f i 1 23 "829"  hashmacaddr - _null_ ));
  DESCR("hash");
  DATA(insert OID = 458 (  text_larger      PGNSP PGUID 12 f f t f i 2 25 "25 25"    text_larger - _null_ ));
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h

index 707b8c9fa62157d71a04521f55691066b5edb3b4..04e630451f2f2b011fc022a9aeb3cdfa3def35f9 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -7,7 +7,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: executor.h,v 1.94 2003/05/06 20:26:28 tgl Exp $
+ * $Id: executor.h,v 1.95 2003/06/22 22:04:55 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -57,9 +57,14 @@ extern bool execTuplesUnequal(HeapTuple tuple1,
  extern FmgrInfo *execTuplesMatchPrepare(TupleDesc tupdesc,
                        int numCols,
                        AttrNumber *matchColIdx);
-extern uint32 ComputeHashFunc(Datum key, int typLen, bool byVal);
+extern void execTuplesHashPrepare(TupleDesc tupdesc,
+                                 int numCols,
+                                 AttrNumber *matchColIdx,
+                                 FmgrInfo **eqfunctions,
+                                 FmgrInfo **hashfunctions);
  extern TupleHashTable BuildTupleHashTable(int numCols, AttrNumber *keyColIdx,
                                           FmgrInfo *eqfunctions,
+                                         FmgrInfo *hashfunctions,
                                           int nbuckets, Size entrysize,
                                           MemoryContext tablecxt,
                                           MemoryContext tempcxt);
diff --git a/src/include/executor/hashjoin.h b/src/include/executor/hashjoin.h

index a2d5f633fcd198559586672ea166a4805ee36943..da7e0bf98c47874498fa90fda8afdbd726a12798 100644
--- a/src/include/executor/hashjoin.h
+++ b/src/include/executor/hashjoin.h
@@ -7,7 +7,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: hashjoin.h,v 1.27 2002/11/30 00:08:20 tgl Exp $
+ * $Id: hashjoin.h,v 1.28 2003/06/22 22:04:55 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -69,13 +69,12 @@ typedef struct HashTableData
                                  * file */
  
     /*
-    * Info about the datatypes being hashed.  We assume that the inner and
-    * outer sides of each hashclause are the same type, or at least
-    * binary-compatible types.  Each of these fields points to an array
-    * of the same length as the number of hash keys.
+    * Info about the datatype-specific hash functions for the datatypes
+    * being hashed.  We assume that the inner and outer sides of each
+    * hashclause are the same type, or at least share the same hash function.
+    * This is an array of the same length as the number of hash keys.
      */
-   int16      *typLens;
-   bool       *typByVals;
+   FmgrInfo   *hashfunctions;  /* lookup data for hash functions */
  
     /*
      * During 1st scan of inner relation, we get tuples from executor. If
diff --git a/src/include/executor/nodeHash.h b/src/include/executor/nodeHash.h

index da1113b32daf13005270432d1c6ad24805b903f9..d6d7ea627ead9bf49e23075a5987debbc0e0549b 100644
--- a/src/include/executor/nodeHash.h
+++ b/src/include/executor/nodeHash.h
@@ -7,7 +7,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: nodeHash.h,v 1.29 2003/01/10 23:54:24 tgl Exp $
+ * $Id: nodeHash.h,v 1.30 2003/06/22 22:04:55 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -22,7 +22,7 @@ extern TupleTableSlot *ExecHash(HashState *node);
  extern void ExecEndHash(HashState *node);
  extern void ExecReScanHash(HashState *node, ExprContext *exprCtxt);
  
-extern HashJoinTable ExecHashTableCreate(Hash *node);
+extern HashJoinTable ExecHashTableCreate(Hash *node, List *hashOperators);
  extern void ExecHashTableDestroy(HashJoinTable hashtable);
  extern void ExecHashTableInsert(HashJoinTable hashtable,
                     ExprContext *econtext,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h

index 68a3bb9b1cd60908bc0cebd5763b4159fd33fcc0..47879296c0eb8e1b16c4f5c17bcadf51c77e71ba 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -7,7 +7,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: execnodes.h,v 1.98 2003/05/28 16:04:00 tgl Exp $
+ * $Id: execnodes.h,v 1.99 2003/06/22 22:04:55 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -353,6 +353,7 @@ typedef struct TupleHashTableData
     int         numCols;        /* number of columns in lookup key */
     AttrNumber *keyColIdx;      /* attr numbers of key columns */
     FmgrInfo   *eqfunctions;    /* lookup data for comparison functions */
+   FmgrInfo   *hashfunctions;  /* lookup data for hash functions */
     MemoryContext tablecxt;     /* memory context containing table */
     MemoryContext tempcxt;      /* context for function evaluations */
     Size        entrysize;      /* actual size to make each hash entry */
@@ -521,6 +522,7 @@ typedef struct SubPlanState
     ExprContext *innerecontext; /* working context for comparisons */
     AttrNumber *keyColIdx;      /* control data for hash tables */
     FmgrInfo   *eqfunctions;    /* comparison functions for hash tables */
+   FmgrInfo   *hashfunctions;  /* lookup data for hash functions */
  } SubPlanState;
  
  /* ----------------
@@ -900,6 +902,7 @@ typedef struct MergeJoinState
   *                              unless OuterTupleSlot is nonempty!)
   *     hj_OuterHashKeys        the outer hash keys in the hashjoin condition
   *     hj_InnerHashKeys        the inner hash keys in the hashjoin condition
+ *     hj_HashOperators        the join operators in the hashjoin condition
   *     hj_OuterTupleSlot       tuple slot for outer tuples
   *     hj_HashTupleSlot        tuple slot for hashed tuples
   *     hj_NullInnerTupleSlot   prepared null tuple for left outer joins
@@ -917,6 +920,7 @@ typedef struct HashJoinState
     HashJoinTuple hj_CurTuple;
     List       *hj_OuterHashKeys;   /* list of ExprState nodes */
     List       *hj_InnerHashKeys;   /* list of ExprState nodes */
+   List       *hj_HashOperators;   /* list of operator OIDs */
     TupleTableSlot *hj_OuterTupleSlot;
     TupleTableSlot *hj_HashTupleSlot;
     TupleTableSlot *hj_NullInnerTupleSlot;
@@ -992,6 +996,7 @@ typedef struct AggState
     List       *aggs;           /* all Aggref nodes in targetlist & quals */
     int         numaggs;        /* length of list (could be zero!) */
     FmgrInfo   *eqfunctions;    /* per-grouping-field equality fns */
+   FmgrInfo   *hashfunctions;  /* per-grouping-field hash fns */
     AggStatePerAgg peragg;      /* per-Aggref information */
     MemoryContext aggcontext;   /* memory context for long-lived data */
     ExprContext *tmpcontext;    /* econtext for input expressions */
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h

index 66b497a98b14fd3c975f61cd2145f3a9d08a2b69..878f5445c2a768506afccdc4ac53c8a2c41970a3 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -6,7 +6,7 @@
   * Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
- * $Id: lsyscache.h,v 1.70 2003/05/26 00:11:28 tgl Exp $
+ * $Id: lsyscache.h,v 1.71 2003/06/22 22:04:55 tgl Exp $
   *
   *-------------------------------------------------------------------------
   */
@@ -18,6 +18,7 @@
  extern bool op_in_opclass(Oid opno, Oid opclass);
  extern bool op_requires_recheck(Oid opno, Oid opclass);
  extern Oid get_opclass_member(Oid opclass, int16 strategy);
+extern Oid get_op_hash_function(Oid opno);
  extern char *get_attname(Oid relid, AttrNumber attnum);
  extern AttrNumber get_attnum(Oid relid, const char *attname);
  extern Oid get_atttype(Oid relid, AttrNumber attnum);
@@ -25,6 +26,7 @@ extern int32 get_atttypmod(Oid relid, AttrNumber attnum);
  extern void get_atttypetypmod(Oid relid, AttrNumber attnum,
                   Oid *typid, int32 *typmod);
  extern bool opclass_is_btree(Oid opclass);
+extern bool opclass_is_hash(Oid opclass);
  extern RegProcedure get_opcode(Oid opno);
  extern char *get_opname(Oid opno);
  extern bool op_mergejoinable(Oid opno, Oid *leftOp, Oid *rightOp);
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out

index e0504706f3c1c2a5f50d465cb13199b3247a99d1..985e06595e74c1731dcd525a9908642ee8297b1e 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -436,9 +436,6 @@ WHERE p1.oprlsortop != p1.oprrsortop AND
  -- Hashing only works on simple equality operators "type = sametype",
  -- since the hash itself depends on the bitwise representation of the type.
  -- Check that allegedly hashable operators look like they might be "=".
--- NOTE: as of 7.3, this search finds xideqint4.  Since we do not mark
--- xid and int4 as binary-equivalent in pg_cast, there's no easy way to
--- recognize that case as OK; just leave that tuple in the expected output.
  SELECT p1.oid, p1.oprname
  FROM pg_operator AS p1
  WHERE p1.oprcanhash AND NOT
@@ -447,8 +444,7 @@ WHERE p1.oprcanhash AND NOT
       p1.oprcom = p1.oid);
   oid | oprname 
  -----+---------
- 353 | =
-(1 row)
+(0 rows)
  
  -- In 6.5 we accepted hashable array equality operators when the array element
  -- type is hashable.  However, what we actually need to make hashjoin work on
@@ -474,6 +470,17 @@ WHERE p1.oprcanhash AND p1.oprcode = p2.oid AND p2.proname = 'array_eq';
  -----+---------
  (0 rows)
  
+-- Hashable operators should appear as members of hash index opclasses.
+SELECT p1.oid, p1.oprname
+FROM pg_operator AS p1
+WHERE p1.oprcanhash AND NOT EXISTS
+  (SELECT 1 FROM pg_opclass op JOIN pg_amop p ON op.oid = amopclaid
+   WHERE opcamid = (SELECT oid FROM pg_am WHERE amname = 'hash') AND
+         amopopr = p1.oid);
+ oid | oprname 
+-----+---------
+(0 rows)
+
  -- Check that each operator defined in pg_operator matches its oprcode entry
  -- in pg_proc.  Easiest to do this separately for each oprkind.
  SELECT p1.oid, p1.oprname, p2.oid, p2.proname
@@ -793,3 +800,24 @@ WHERE p3.opcamid = (SELECT oid FROM pg_am WHERE amname = 'btree')
  -----------+-----------+-----+---------+---------
  (0 rows)
  
+-- For hash we can also do a little better: the support routines must be
+-- of the form hash(something) returns int4.  Ideally we'd check that the
+-- opcintype is binary-coercible to the function's input, but there are
+-- enough cases where that fails that I'll just leave out the check for now.
+SELECT p1.amopclaid, p1.amprocnum,
+   p2.oid, p2.proname,
+   p3.opcname
+FROM pg_amproc AS p1, pg_proc AS p2, pg_opclass AS p3
+WHERE p3.opcamid = (SELECT oid FROM pg_am WHERE amname = 'hash')
+    AND p1.amopclaid = p3.oid AND p1.amproc = p2.oid AND
+    (opckeytype != 0
+     OR amprocnum != 1
+     OR proretset
+     OR prorettype != 23
+     OR pronargs != 1
+--   OR NOT physically_coercible(opcintype, proargtypes[0])
+);
+ amopclaid | amprocnum | oid | proname | opcname 
+-----------+-----------+-----+---------+---------
+(0 rows)
+
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql

index 4b7bd7b4dd5c7b100cbe7d2aa7eaf5049e4ec226..9cba48b4296d515f8710b9f34bb613c11edf691b 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -364,10 +364,6 @@ WHERE p1.oprlsortop != p1.oprrsortop AND
  -- since the hash itself depends on the bitwise representation of the type.
  -- Check that allegedly hashable operators look like they might be "=".
  
--- NOTE: as of 7.3, this search finds xideqint4.  Since we do not mark
--- xid and int4 as binary-equivalent in pg_cast, there's no easy way to
--- recognize that case as OK; just leave that tuple in the expected output.
-
  SELECT p1.oid, p1.oprname
  FROM pg_operator AS p1
  WHERE p1.oprcanhash AND NOT
@@ -398,6 +394,16 @@ SELECT p1.oid, p1.oprname
  FROM pg_operator AS p1, pg_proc AS p2
  WHERE p1.oprcanhash AND p1.oprcode = p2.oid AND p2.proname = 'array_eq';
  
+-- Hashable operators should appear as members of hash index opclasses.
+
+SELECT p1.oid, p1.oprname
+FROM pg_operator AS p1
+WHERE p1.oprcanhash AND NOT EXISTS
+  (SELECT 1 FROM pg_opclass op JOIN pg_amop p ON op.oid = amopclaid
+   WHERE opcamid = (SELECT oid FROM pg_am WHERE amname = 'hash') AND
+         amopopr = p1.oid);
+
+
  -- Check that each operator defined in pg_operator matches its oprcode entry
  -- in pg_proc.  Easiest to do this separately for each oprkind.
  
@@ -665,3 +671,22 @@ WHERE p3.opcamid = (SELECT oid FROM pg_am WHERE amname = 'btree')
       OR pronargs != 2
       OR NOT binary_coercible(opcintype, proargtypes[0])
       OR proargtypes[0] != proargtypes[1]);
+
+-- For hash we can also do a little better: the support routines must be
+-- of the form hash(something) returns int4.  Ideally we'd check that the
+-- opcintype is binary-coercible to the function's input, but there are
+-- enough cases where that fails that I'll just leave out the check for now.
+
+SELECT p1.amopclaid, p1.amprocnum,
+   p2.oid, p2.proname,
+   p3.opcname
+FROM pg_amproc AS p1, pg_proc AS p2, pg_opclass AS p3
+WHERE p3.opcamid = (SELECT oid FROM pg_am WHERE amname = 'hash')
+    AND p1.amopclaid = p3.oid AND p1.amproc = p2.oid AND
+    (opckeytype != 0
+     OR amprocnum != 1
+     OR proretset
+     OR prorettype != 23
+     OR pronargs != 1
+--   OR NOT physically_coercible(opcintype, proargtypes[0])
+);