summaryrefslogtreecommitdiff
path: root/doc/src/sgml
diff options
context:
space:
mode:
authorPeter Eisentraut2003-05-15 15:50:21 +0000
committerPeter Eisentraut2003-05-15 15:50:21 +0000
commit2c0556068fc308ed9cce06c85de7e42305d34b86 (patch)
tree2c9f5561bdc7b660e2fbc5eb2dd67a24f7a654f8 /doc/src/sgml
parent2a2f6cfa3983e6834299857c80bc07d32d1e019a (diff)
Indexing support for pattern matching operations via separate operator
class when lc_collate is not C.
Diffstat (limited to 'doc/src/sgml')
-rw-r--r--doc/src/sgml/charset.sgml16
-rw-r--r--doc/src/sgml/indices.sgml45
-rw-r--r--doc/src/sgml/release.sgml3
-rw-r--r--doc/src/sgml/runtime.sgml29
4 files changed, 57 insertions, 36 deletions
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 8c2af90770d..213f3a8f62b 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -1,4 +1,4 @@
-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.35 2003/04/15 13:26:54 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.36 2003/05/15 15:50:18 petere Exp $ -->
<chapter id="charset">
<title>Localization</>
@@ -213,23 +213,13 @@ initdb --locale=sv_SE
The <function>to_char</> family of functions
</para>
</listitem>
-
- <listitem>
- <para>
- The <literal>LIKE</> and <literal>~</> operators for pattern
- matching
- </para>
- </listitem>
</itemizedlist>
</para>
<para>
The only severe drawback of using the locale support in
- <productname>PostgreSQL</> is its speed. So use locales only if you
- actually need it. It should be noted in particular that selecting
- a non-C locale disables index optimizations for <literal>LIKE</> and
- <literal>~</> operators, which can make a huge difference in the
- speed of searches that use those operators.
+ <productname>PostgreSQL</> is its speed. So use locales only if
+ you actually need them.
</para>
</sect2>
diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index d900b941fa9..fcd7108a14c 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -1,4 +1,4 @@
-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.40 2003/03/25 16:15:36 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.41 2003/05/15 15:50:18 petere Exp $ -->
<chapter id="indexes">
<title id="indexes-title">Indexes</title>
@@ -133,6 +133,19 @@ CREATE INDEX test1_id_index ON test1 (id);
</para>
<para>
+ The optimizer can also use a B-tree index for queries involving the
+ pattern matching operators <literal>LIKE</>,
+ <literal>ILIKE</literal>, <literal>~</literal>, and
+ <literal>~*</literal>, <emphasis>if</emphasis> the pattern is
+ anchored to the beginning of the string, e.g., <literal>col LIKE
+ 'foo%'</literal> or <literal>col ~ '^foo'</literal>, but not
+ <literal>col LIKE '%bar'</literal>. However, if your server does
+ not use the C locale you will need to create the index with a
+ special operator class. See <xref linkend="indexes-opclass">
+ below.
+ </para>
+
+ <para>
<indexterm>
<primary>indexes</primary>
<secondary>R-tree</secondary>
@@ -405,6 +418,36 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
<literal>bigbox_ops</literal>.
</para>
</listitem>
+
+ <listitem>
+ <para>
+ The operator classes <literal>text_pattern_ops</literal>,
+ <literal>varchar_pattern_ops</literal>,
+ <literal>bpchar_pattern_ops</literal>, and
+ <literal>name_pattern_ops</literal> support B-tree indexes on
+ the types <type>text</type>, <type>varchar</type>,
+ <type>char</type>, and <type>name</type>, respectively. The
+ difference to the ordinary operator classes is that the values
+ are compared strictly character by character rather than
+ according to the locale-specific collation rules. This makes
+ these operator classes suitable for use by queries involving
+ pattern matching expressions (<literal>LIKE</literal> or POSIX
+ regular expressions) if the server does not use the standard
+ <quote>C</quote> locale. As an example, to index a
+ <type>varchar</type> column like this:
+<programlisting>
+CREATE INDEX test_index ON test_table (col varchar_pattern_ops);
+</programlisting>
+ If you do use the C locale, you should instead create an index
+ with the default operator class. Also note that you should
+ create an index with the default operator class if you want
+ queries involving ordinary comparisons to use an index. Such
+ queries cannot use the
+ <literal><replaceable>xxx</replaceable>_pattern_ops</literal>
+ operator classes. It is possible, however, to create multiple
+ indexes on the same column with different operator classes.
+ </para>
+ </listitem>
</itemizedlist>
</para>
diff --git a/doc/src/sgml/release.sgml b/doc/src/sgml/release.sgml
index 9332ac499b4..9d9b758e89c 100644
--- a/doc/src/sgml/release.sgml
+++ b/doc/src/sgml/release.sgml
@@ -1,5 +1,5 @@
<!--
-$Header: /cvsroot/pgsql/doc/src/sgml/release.sgml,v 1.187 2003/05/14 03:25:59 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/release.sgml,v 1.188 2003/05/15 15:50:18 petere Exp $
-->
<appendix id="release">
@@ -24,6 +24,7 @@ CDATA means the content is "SGML-free", so you can write without
worries about funny characters.
-->
<literallayout><![CDATA[
+Pattern matching operations can use indexes regardless of locale
New frontend/backend protocol supports many long-requested features
SET AUTOCOMMIT TO OFF is no longer supported
Reimplementation of NUMERIC datatype for more speed
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index b79f8cff62e..f20a8931e3d 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -1,5 +1,5 @@
<!--
-$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.179 2003/05/14 03:26:00 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.180 2003/05/15 15:50:18 petere Exp $
-->
<Chapter Id="runtime">
@@ -133,26 +133,13 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
</para>
<para>
- <command>initdb</command> also initializes the default locale<indexterm><primary>locale</></> for
- the database cluster. Normally, it will just take the locale
- settings in the environment and apply them to the initialized
- database. It is possible to specify a different locale for the
- database; more information about that can be found in <xref
- linkend="locale">. One surprise you might encounter while running
- <command>initdb</command> is a notice similar to this:
-<screen>
-The database cluster will be initialized with locale de_DE.
-This locale setting will prevent the use of indexes for pattern matching
-operations. If that is a concern, rerun initdb with the collation order
-set to "C". For more information see the documentation.
-</screen>
- This is intended to warn you that the currently selected locale
- will cause indexes to be sorted in an order that prevents them from
- being used for <literal>LIKE</> and regular-expression searches. If you need
- good performance in such searches, you should set your current
- locale to <literal>C</> and re-run <command>initdb</command>, e.g.,
- by running <literal>initdb --lc-collate=C</literal>. The sort
- order used within a particular database cluster is set by
+ <command>initdb</command> also initializes the default
+ locale<indexterm><primary>locale</></> for the database cluster.
+ Normally, it will just take the locale settings in the environment
+ and apply them to the initialized database. It is possible to
+ specify a different locale for the database; more information about
+ that can be found in <xref linkend="locale">. The sort order used
+ within a particular database cluster is set by
<command>initdb</command> and cannot be changed later, short of
dumping all data, rerunning <command>initdb</command>, and
reloading the data. So it's important to make this choice correctly