Diffstat (limited to 'doc/src')
| -rw-r--r-- | doc/src/sgml/btree.sgml | 96 |
| -rw-r--r-- | doc/src/sgml/ref/alter_opfamily.sgml | 7 |
| -rw-r--r-- | doc/src/sgml/ref/create_opclass.sgml | 14 |
| -rw-r--r-- | doc/src/sgml/xindex.sgml | 18 |
4 files changed, 121 insertions, 14 deletions
diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml index ac6c4423e60..fcf771c857f 100644 --- a/doc/src/sgml/btree.sgml +++ b/doc/src/sgml/btree.sgml @@ -207,7 +207,7 @@ <para> As shown in <xref linkend="xindex-btree-support-table"/>, btree defines - one required and two optional support functions. The three + one required and three optional support functions. The four user-defined methods are: </para> <variablelist> @@ -456,6 +456,100 @@ returns bool </para> </listitem> </varlistentry> + <varlistentry> + <term><function>equalimage</function></term> + <listitem> + <para> + Optionally, a btree operator family may provide + <function>equalimage</function> (<quote>equality implies image + equality</quote>) support functions, registered under support + function number 4. These functions allow the core code to + determine when it is safe to apply the btree deduplication + optimization. Currently, <function>equalimage</function> + functions are only called when building or rebuilding an index. + </para> + <para> + An <function>equalimage</function> function must have the + signature +<synopsis> +equalimage(<replaceable>opcintype</replaceable> <type>oid</type>) returns bool +</synopsis> + The return value is static information about an operator class + and collation. Returning <literal>true</literal> indicates that + the <function>order</function> function for the operator class is + guaranteed to only return <literal>0</literal> (<quote>arguments + are equal</quote>) when its <replaceable>A</replaceable> and + <replaceable>B</replaceable> arguments are also interchangeable + without any loss of semantic information. Not registering an + <function>equalimage</function> function or returning + <literal>false</literal> indicates that this condition cannot be + assumed to hold. + </para> + <para> + The <replaceable>opcintype</replaceable> argument is the + <literal><structname>pg_type</structname>.oid</literal> of the + data type that the operator class indexes. 
This is a convenience + that allows reuse of the same underlying + <function>equalimage</function> function across operator classes. + If <replaceable>opcintype</replaceable> is a collatable data + type, the appropriate collation OID will be passed to the + <function>equalimage</function> function, using the standard + <function>PG_GET_COLLATION()</function> mechanism. + </para> + <para> + As far as the operator class is concerned, returning + <literal>true</literal> indicates that deduplication is safe (or + safe for the collation whose OID was passed to its + <function>equalimage</function> function). However, the core + code will only deem deduplication safe for an index when + <emphasis>every</emphasis> indexed column uses an operator class + that registers an <function>equalimage</function> function, and + each function actually returns <literal>true</literal> when + called. + </para> + <para> + Image equality is <emphasis>almost</emphasis> the same condition + as simple bitwise equality. There is one subtle difference: When + indexing a varlena data type, the on-disk representation of two + image equal datums may not be bitwise equal due to inconsistent + application of <acronym>TOAST</acronym> compression on input. + Formally, when an operator class's + <function>equalimage</function> function returns + <literal>true</literal>, it is safe to assume that the + <literal>datum_image_eq()</literal> C function will always agree + with the operator class's <function>order</function> function + (provided that the same collation OID is passed to both the + <function>equalimage</function> and <function>order</function> + functions). + </para> + <para> + The core code is fundamentally unable to deduce anything about + the <quote>equality implies image equality</quote> status of an + operator class within a multiple-data-type family based on + details from other operator classes in the same family. 
Also, it + is not sensible for an operator family to register a cross-type + <function>equalimage</function> function, and attempting to do so + will result in an error. This is because <quote>equality implies + image equality</quote> status does not just depend on + sorting/equality semantics, which are more or less defined at the + operator family level. In general, the semantics that one + particular data type implements must be considered separately. + </para> + <para> + The convention followed by the operator classes included with the + core <productname>PostgreSQL</productname> distribution is to + register a stock, generic <function>equalimage</function> + function. Most operator classes register + <function>btequalimage()</function>, which indicates that + deduplication is safe unconditionally. Operator classes for + collatable data types such as <type>text</type> register + <function>btvarstrequalimage()</function>, which indicates that + deduplication is safe with deterministic collations. Best + practice for third-party extensions is to register their own + custom function to retain control. + </para> + </listitem> + </varlistentry> </variablelist> </sect1> diff --git a/doc/src/sgml/ref/alter_opfamily.sgml b/doc/src/sgml/ref/alter_opfamily.sgml index 848156c9d7d..4ac1cca95a3 100644 --- a/doc/src/sgml/ref/alter_opfamily.sgml +++ b/doc/src/sgml/ref/alter_opfamily.sgml @@ -153,9 +153,10 @@ ALTER OPERATOR FAMILY <replaceable>name</replaceable> USING <replaceable class=" and hash functions it is not necessary to specify <replaceable class="parameter">op_type</replaceable> since the function's input data type(s) are always the correct ones to use. For B-tree sort - support functions and all functions in GiST, SP-GiST and GIN operator - classes, it is necessary to specify the operand data type(s) the function - is to be used with. 
+ support functions, B-Tree equal image functions, and all + functions in GiST, SP-GiST and GIN operator classes, it is + necessary to specify the operand data type(s) the function is to + be used with. </para> <para> diff --git a/doc/src/sgml/ref/create_opclass.sgml b/doc/src/sgml/ref/create_opclass.sgml index dd5252fd976..f42fb6494c6 100644 --- a/doc/src/sgml/ref/create_opclass.sgml +++ b/doc/src/sgml/ref/create_opclass.sgml @@ -171,12 +171,14 @@ CREATE OPERATOR CLASS <replaceable class="parameter">name</replaceable> [ DEFAUL function is intended to support, if different from the input data type(s) of the function (for B-tree comparison functions and hash functions) - or the class's data type (for B-tree sort support functions and all - functions in GiST, SP-GiST, GIN and BRIN operator classes). These defaults - are correct, and so <replaceable - class="parameter">op_type</replaceable> need not be specified in - <literal>FUNCTION</literal> clauses, except for the case of a B-tree sort - support function that is meant to support cross-data-type comparisons. + or the class's data type (for B-tree sort support functions, + B-tree equal image functions, and all functions in GiST, + SP-GiST, GIN and BRIN operator classes). These defaults are + correct, and so <replaceable + class="parameter">op_type</replaceable> need not be specified + in <literal>FUNCTION</literal> clauses, except for the case of a + B-tree sort support function that is meant to support + cross-data-type comparisons. </para> </listitem> </varlistentry> diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml index ffb5164aaa0..2e06ad01bf5 100644 --- a/doc/src/sgml/xindex.sgml +++ b/doc/src/sgml/xindex.sgml @@ -402,7 +402,7 @@ <para> B-trees require a comparison support function, - and allow two additional support functions to be + and allow three additional support functions to be supplied at the operator class author's option, as shown in <xref linkend="xindex-btree-support-table"/>. 
The requirements for these support functions are explained further in @@ -441,6 +441,13 @@ </entry> <entry>3</entry> </row> + <row> + <entry> + Determine if it is safe for indexes that use the operator + class to apply the btree deduplication optimization (optional) + </entry> + <entry>4</entry> + </row> </tbody> </tgroup> </table> @@ -980,7 +987,8 @@ DEFAULT FOR TYPE int8 USING btree FAMILY integer_ops AS OPERATOR 5 > , FUNCTION 1 btint8cmp(int8, int8) , FUNCTION 2 btint8sortsupport(internal) , - FUNCTION 3 in_range(int8, int8, int8, boolean, boolean) ; + FUNCTION 3 in_range(int8, int8, int8, boolean, boolean) , + FUNCTION 4 btequalimage(oid) ; CREATE OPERATOR CLASS int4_ops DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS @@ -992,7 +1000,8 @@ DEFAULT FOR TYPE int4 USING btree FAMILY integer_ops AS OPERATOR 5 > , FUNCTION 1 btint4cmp(int4, int4) , FUNCTION 2 btint4sortsupport(internal) , - FUNCTION 3 in_range(int4, int4, int4, boolean, boolean) ; + FUNCTION 3 in_range(int4, int4, int4, boolean, boolean) , + FUNCTION 4 btequalimage(oid) ; CREATE OPERATOR CLASS int2_ops DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS @@ -1004,7 +1013,8 @@ DEFAULT FOR TYPE int2 USING btree FAMILY integer_ops AS OPERATOR 5 > , FUNCTION 1 btint2cmp(int2, int2) , FUNCTION 2 btint2sortsupport(internal) , - FUNCTION 3 in_range(int2, int2, int2, boolean, boolean) ; + FUNCTION 3 in_range(int2, int2, int2, boolean, boolean) , + FUNCTION 4 btequalimage(oid) ; ALTER OPERATOR FAMILY integer_ops USING btree ADD -- cross-type comparisons int8 vs int2 |
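The btree.sgml text above recommends that third-party extensions register their own equalimage function under support function number 4. A minimal sketch of what that registration might look like follows. The names `my_type`, `my_type_ops`, and `my_type_equalimage` are hypothetical and do not appear in this commit, and the sketch assumes the extension ships a C function that returns true only when the type's comparison function returning 0 implies image equality.

```sql
-- Hypothetical extension script fragment; none of these object names
-- come from the commit. MODULE_PATHNAME is the usual extension-script
-- placeholder for the shared library path.
CREATE FUNCTION my_type_equalimage(oid) RETURNS bool
    AS 'MODULE_PATHNAME', 'my_type_equalimage'
    LANGUAGE C STRICT IMMUTABLE;

-- Register it as btree support function 4. Per the alter_opfamily.sgml
-- change above, the operand data type must be given explicitly for
-- B-tree equal image functions.
ALTER OPERATOR FAMILY my_type_ops USING btree ADD
    FUNCTION 4 (my_type) my_type_equalimage(oid);
```

Since the core code deems deduplication safe only when every indexed column's operator class registers such a function and each one returns true, an index that combines `my_type` with a column whose operator class lacks support function 4 would still not be deduplicated.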
