summaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/btree.sgml96
-rw-r--r--doc/src/sgml/ref/alter_opfamily.sgml7
-rw-r--r--doc/src/sgml/ref/create_opclass.sgml14
-rw-r--r--doc/src/sgml/xindex.sgml18
4 files changed, 121 insertions, 14 deletions
diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index ac6c4423e60..fcf771c857f 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -207,7 +207,7 @@
<para>
As shown in <xref linkend="xindex-btree-support-table"/>, btree defines
- one required and two optional support functions. The three
+ one required and three optional support functions. The four
user-defined methods are:
</para>
<variablelist>
@@ -456,6 +456,100 @@ returns bool
</para>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><function>equalimage</function></term>
+ <listitem>
+ <para>
+ Optionally, a btree operator family may provide
+ <function>equalimage</function> (<quote>equality implies image
+ equality</quote>) support functions, registered under support
+ function number 4. These functions allow the core code to
+ determine when it is safe to apply the btree deduplication
+ optimization. Currently, <function>equalimage</function>
+ functions are only called when building or rebuilding an index.
+ </para>
+ <para>
+ An <function>equalimage</function> function must have the
+ signature
+<synopsis>
+equalimage(<replaceable>opcintype</replaceable> <type>oid</type>) returns bool
+</synopsis>
+ The return value is static information about an operator class
+ and collation. Returning <literal>true</literal> indicates that
+ the <function>order</function> function for the operator class is
+ guaranteed to only return <literal>0</literal> (<quote>arguments
+ are equal</quote>) when its <replaceable>A</replaceable> and
+ <replaceable>B</replaceable> arguments are also interchangeable
+ without any loss of semantic information. Not registering an
+ <function>equalimage</function> function or returning
+ <literal>false</literal> indicates that this condition cannot be
+ assumed to hold.
+ </para>
+ <para>
+ The <replaceable>opcintype</replaceable> argument is the
+ <literal><structname>pg_type</structname>.oid</literal> of the
+ data type that the operator class indexes. This is a convenience
+ that allows reuse of the same underlying
+ <function>equalimage</function> function across operator classes.
+ If <replaceable>opcintype</replaceable> is a collatable data
+ type, the appropriate collation OID will be passed to the
+ <function>equalimage</function> function, using the standard
+ <function>PG_GET_COLLATION()</function> mechanism.
+ </para>
+ <para>
+ As far as the operator class is concerned, returning
+ <literal>true</literal> indicates that deduplication is safe (or
+ safe for the collation whose OID was passed to its
+ <function>equalimage</function> function). However, the core
+ code will only deem deduplication safe for an index when
+ <emphasis>every</emphasis> indexed column uses an operator class
+ that registers an <function>equalimage</function> function, and
+ each function actually returns <literal>true</literal> when
+ called.
+ </para>
+ <para>
+ Image equality is <emphasis>almost</emphasis> the same condition
+ as simple bitwise equality. There is one subtle difference: When
+ indexing a varlena data type, the on-disk representation of two
+ image equal datums may not be bitwise equal due to inconsistent
+ application of <acronym>TOAST</acronym> compression on input.
+ Formally, when an operator class's
+ <function>equalimage</function> function returns
+ <literal>true</literal>, it is safe to assume that the
+ <literal>datum_image_eq()</literal> C function will always agree
+ with the operator class's <function>order</function> function
+ (provided that the same collation OID is passed to both the
+ <function>equalimage</function> and <function>order</function>
+ functions).
+ </para>
+ <para>
+ The core code is fundamentally unable to deduce anything about
+ the <quote>equality implies image equality</quote> status of an
+ operator class within a multiple-data-type family based on
+ details from other operator classes in the same family. Also, it
+ is not sensible for an operator family to register a cross-type
+ <function>equalimage</function> function, and attempting to do so
+ will result in an error. This is because <quote>equality implies
+ image equality</quote> status does not just depend on
+ sorting/equality semantics, which are more or less defined at the
+ operator family level. In general, the semantics that one
+ particular data type implements must be considered separately.
+ </para>
+ <para>
+ The convention followed by the operator classes included with the
+ core <productname>PostgreSQL</productname> distribution is to
+ register a stock, generic <function>equalimage</function>
+ function. Most operator classes register
+ <function>btequalimage()</function>, which indicates that
+ deduplication is safe unconditionally. Operator classes for
+ collatable data types such as <type>text</type> register
+ <function>btvarstrequalimage()</function>, which indicates that
+ deduplication is safe with deterministic collations. Best
+ practice for third-party extensions is to register their own
+ custom function to retain control.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</sect1>
diff --git a/doc/src/sgml/ref/alter_opfamily.sgml b/doc/src/sgml/ref/alter_opfamily.sgml
index 848156c9d7d..4ac1cca95a3 100644
--- a/doc/src/sgml/ref/alter_opfamily.sgml
+++ b/doc/src/sgml/ref/alter_opfamily.sgml
@@ -153,9 +153,10 @@ ALTER OPERATOR FAMILY <replaceable>name</replaceable> USING <replaceable class="
and hash functions it is not necessary to specify <replaceable
class="parameter">op_type</replaceable> since the function's input
data type(s) are always the correct ones to use. For B-tree sort
- support functions and all functions in GiST, SP-GiST and GIN operator
- classes, it is necessary to specify the operand data type(s) the function
- is to be used with.
+ support functions, B-Tree equal image functions, and all
+ functions in GiST, SP-GiST and GIN operator classes, it is
+ necessary to specify the operand data type(s) the function is to
+ be used with.
</para>
<para>
diff --git a/doc/src/sgml/ref/create_opclass.sgml b/doc/src/sgml/ref/create_opclass.sgml
index dd5252fd976..f42fb6494c6 100644
--- a/doc/src/sgml/ref/create_opclass.sgml
+++ b/doc/src/sgml/ref/create_opclass.sgml
@@ -171,12 +171,14 @@ CREATE OPERATOR CLASS <replaceable class="parameter">name</replaceable> [ DEFAUL
function is intended to support, if different from
the input data type(s) of the function (for B-tree comparison functions
and hash functions)
- or the class's data type (for B-tree sort support functions and all
- functions in GiST, SP-GiST, GIN and BRIN operator classes). These defaults
- are correct, and so <replaceable
- class="parameter">op_type</replaceable> need not be specified in
- <literal>FUNCTION</literal> clauses, except for the case of a B-tree sort
- support function that is meant to support cross-data-type comparisons.
+ or the class's data type (for B-tree sort support functions,
+ B-tree equal image functions, and all functions in GiST,
+ SP-GiST, GIN and BRIN operator classes). These defaults are
+ correct, and so <replaceable
+ class="parameter">op_type</replaceable> need not be specified
+ in <literal>FUNCTION</literal> clauses, except for the case of a
+ B-tree sort support function that is meant to support
+ cross-data-type comparisons.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml
index ffb5164aaa0..2e06ad01bf5 100644
--- a/doc/src/sgml/xindex.sgml
+++ b/doc/src/sgml/xindex.sgml
@@ -402,7 +402,7 @@
<para>
B-trees require a comparison support function,
- and allow two additional support functions to be
+ and allow three additional support functions to be
supplied at the operator class author's option, as shown in <xref
linkend="xindex-btree-support-table"/>.
The requirements for these support functions are explained further in
@@ -441,6 +441,13 @@
</entry>
<entry>3</entry>
</row>
+ <row>
+ <entry>
+ Determine if it is safe for indexes that use the operator
+ class to apply the btree deduplication optimization (optional)
+ </entry>
+ <entry>4</entry>
+ </row>
</tbody>
</tgroup>
</table>
@@ -980,7 +987,8 @@ DEFAULT FOR TYPE int8 USING btree FAMILY integer_ops AS
OPERATOR 5 > ,
FUNCTION 1 btint8cmp(int8, int8) ,
FUNCTION 2 btint8sortsupport(internal) ,
- FUNCTION 3 in_range(int8, int8, int8, boolean, boolean) ;
+ FUNCTION 3 in_range(int8, int8, int8, boolean, boolean) ,
+ FUNCTION 4 btequalimage(oid) ;
CREATE OPERATOR CLASS int4_ops
DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS
@@ -992,7 +1000,8 @@ DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS
OPERATOR 5 > ,
FUNCTION 1 btint4cmp(int4, int4) ,
FUNCTION 2 btint4sortsupport(internal) ,
- FUNCTION 3 in_range(int4, int4, int4, boolean, boolean) ;
+ FUNCTION 3 in_range(int4, int4, int4, boolean, boolean) ,
+ FUNCTION 4 btequalimage(oid) ;
CREATE OPERATOR CLASS int2_ops
DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS
@@ -1004,7 +1013,8 @@ DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS
OPERATOR 5 > ,
FUNCTION 1 btint2cmp(int2, int2) ,
FUNCTION 2 btint2sortsupport(internal) ,
- FUNCTION 3 in_range(int2, int2, int2, boolean, boolean) ;
+ FUNCTION 3 in_range(int2, int2, int2, boolean, boolean) ,
+ FUNCTION 4 btequalimage(oid) ;
ALTER OPERATOR FAMILY integer_ops USING btree ADD
-- cross-type comparisons int8 vs int2