author | Peter Eisentraut | 2017-08-21 15:22:00 +0000 |
---|---|---|
committer | Peter Eisentraut | 2017-08-21 23:21:07 +0000 |
commit | 2bfd1b1ee562c4e4fd065c7f7d1beaa9b9852070 (patch) | |
tree | 5f22baf585a1b4aa406f48d46d85348e0ebb038b /doc/src | |
parent | 51e225da306e14616b690308a59fd89e22335035 (diff) |
Don't install ICU collation keyword variants
Users can still create them themselves. Instead, document Unicode TR 35
collation options for ICU, so users can create all this themselves.
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/charset.sgml | 98 |
1 files changed, 84 insertions, 14 deletions
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index f2a4acc1150..44e43503a61 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -665,13 +665,6 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
  </varlistentry>

  <varlistentry>
- <term><literal>de-u-co-phonebk-x-icu</literal></term>
- <listitem>
- <para>German collation, phone book variant</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
  <term><literal>de-AT-x-icu</literal></term>
  <listitem>
  <para>German collation for Austria, default variant</para>
@@ -684,13 +677,6 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
  </varlistentry>

  <varlistentry>
- <term><literal>de-AT-u-co-phonebk-x-icu</literal></term>
- <listitem>
- <para>German collation for Austria, phone book variant</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
  <term><literal>und-x-icu</literal> (for <quote>undefined</quote>)</term>
  <listitem>
  <para>
@@ -709,6 +695,90 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
  will draw an error along the lines of <quote>collation "de-x-icu" for
  encoding "WIN874" does not exist</>.
  </para>
+
+ <para>
+ ICU allows collations to be customized beyond the basic language+country
+ set that is preloaded by <command>initdb</command>. Users are encouraged
+ to define their own collation objects that make use of these facilities to
+ suit the sorting behavior to their requirements. Here are some examples:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>CREATE COLLATION "de-u-co-phonebk-x-icu" (provider = icu, locale = 'de-u-co-phonebk')</literal></term>
+ <listitem>
+ <para>German collation with phone book collation type</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION "und-u-co-emoji-x-icu" (provider = icu, locale = 'und-u-co-emoji')</literal></term>
+ <listitem>
+ <para>
+ Root collation with Emoji collation type, per Unicode Technical Standard #51
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION digitslast (provider = icu, locale = 'en-u-kr-latn-digit')</literal></term>
+ <listitem>
+ <para>
+ Sort digits after Latin letters. (The default is digits before letters.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper')</literal></term>
+ <listitem>
+ <para>
+ Sort upper-case letters before lower-case letters. (The default is
+ lower-case letters first.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION special (provider = icu, locale = 'en-u-kf-upper-kr-latn-digit')</literal></term>
+ <listitem>
+ <para>
+ Combines both of the above options.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE COLLATION numeric (provider = icu, locale = 'en-u-kn-true')</literal></term>
+ <listitem>
+ <para>
+ Numeric ordering, sorts sequences of digits by their numeric value,
+ for example: <literal>A-21</literal> < <literal>A-123</literal>
+ (also known as natural sort).
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ See <ulink url="http://unicode.org/reports/tr35/tr35-collation.html">Unicode
+ Technical Standard #35</ulink>
+ and <ulink url="https://tools.ietf.org/html/bcp47">BCP 47</ulink> for
+ details. The list of possible collation types (<literal>co</literal>
+ subtag) can be found in
+ the <ulink url="http://www.unicode.org/repos/cldr/trunk/common/bcp47/collation.xml">CLDR
+ repository</ulink>.
+ The <ulink url="https://ssl.icu-project.org/icu-bin/locexp">ICU Locale
+ Explorer</ulink> can be used to check the details of a particular locale
+ definition.
+ </para>
+
+ <para>
+ Note that while this system allows creating collations that <quote>ignore
+ case</quote> or <quote>ignore accents</quote> or similar (using
+ the <literal>ks</literal> key), PostgreSQL does not at the moment allow
+ such collations to act in a truly case- or accent-insensitive manner. Any
+ strings that compare equal according to the collation but are not
+ byte-wise equal will be sorted according to their byte values.
+ </para>
  </sect4>
  </sect3>
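
For readers who want to try one of the user-defined collations documented in this patch, the following is a minimal sketch, not part of the commit. It assumes a PostgreSQL server built with ICU support. The collation name natural_en and the temporary table parts are hypothetical and chosen only for illustration; the locale string 'en-u-kn-true' is the one used in the patch.

```sql
-- Hypothetical collation name; the locale string is taken from the patch.
CREATE COLLATION natural_en (provider = icu, locale = 'en-u-kn-true');

-- Hypothetical throwaway table used only for this illustration.
CREATE TEMP TABLE parts (label text);
INSERT INTO parts VALUES ('A-123'), ('A-21'), ('A-3');

-- Byte-wise ordering with the built-in "C" collation, for comparison.
SELECT label FROM parts ORDER BY label COLLATE "C";

-- Numeric ordering: per the documentation text in the patch,
-- A-21 is expected to sort before A-123 under this collation.
SELECT label FROM parts ORDER BY label COLLATE natural_en;
```

Applying the collation per query with COLLATE keeps the column's default collation unchanged; it could instead be attached to the column definition if every sort on that column should use it.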