</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>colliculocale</structfield> <type>text</type>
+ </para>
+ <para>
+ ICU locale ID for this collation object
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>collversion</structfield> <type>text</type>
</para>
</sect2>
+ <sect2>
+ <title>Selecting Locales</title>
+
+ <para>
+ Locales can be selected in different scopes depending on requirements.
+ The above overview showed how locales are specified using
+ <command>initdb</command> to set the defaults for the entire cluster. The
+ following list shows where locales can be selected. Each item provides
+ the defaults for the subsequent items, and each lower item allows
+ overriding the defaults at a finer granularity.
+ </para>
+
+ <orderedlist>
+ <listitem>
+ <para>
+ As explained above, the environment of the operating system provides the
+ defaults for the locales of a newly initialized database cluster. In
+ many cases, this is enough: If the operating system is configured for
+ the desired language/territory, then
+ <productname>PostgreSQL</productname> will by default also behave
+ according to that locale.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ As shown above, command-line options for <command>initdb</command>
+ specify the locale settings for a newly initialized database cluster.
+ Use this if the operating system does not have the locale configuration
+ you want for your database system.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ A locale can be selected separately for each database. The SQL command
+ <command>CREATE DATABASE</command> and its command-line equivalent
+ <command>createdb</command> have options for that. Use this for example
+ if a database cluster houses databases for multiple tenants with
+ different requirements.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Locale settings can be made for individual table columns. This uses an
+ SQL object called <firstterm>collation</firstterm> and is explained in
+ <xref linkend="collation"/>. Use this for example to sort data in
+ different languages or customize the sort order of a particular table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Finally, locales can be selected for an individual query. Again, this
+ uses SQL collation objects. This could be used to change the sort order
+ based on run-time choices or for ad-hoc experimentation.
+ </para>
+ </listitem>
+ </orderedlist>
+ </sect2>
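
The precedence described in the list above can be read as a "most specific setting wins" lookup. The following is a minimal sketch of that idea only; it is not how PostgreSQL resolves locales internally. The `LocaleScopes` struct and the `effective_locale` function are hypothetical names introduced for illustration, and the fallback to `"C"` when nothing is set is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration: one optional locale name per scope, in the
 * order of the list above.  NULL means "not set at this scope".
 */
typedef struct LocaleScopes
{
	const char *os_environment;	/* 1. operating system environment */
	const char *initdb_option;	/* 2. initdb command-line option */
	const char *database;		/* 3. CREATE DATABASE / createdb */
	const char *column;			/* 4. collation attached to a column */
	const char *query;			/* 5. collation chosen in a query */
} LocaleScopes;

/* Return the locale from the most specific scope that sets one. */
static const char *
effective_locale(const LocaleScopes *s)
{
	const char *candidates[5];
	size_t		i;

	assert(s != NULL);

	/* Most specific scope first. */
	candidates[0] = s->query;
	candidates[1] = s->column;
	candidates[2] = s->database;
	candidates[3] = s->initdb_option;
	candidates[4] = s->os_environment;

	for (i = 0; i < 5; i++)
	{
		if (candidates[i] != NULL)
			return candidates[i];
	}

	/* Assumption of this sketch: fall back to "C" if nothing is set. */
	return "C";
}
```

With only the operating system environment and a database-level setting filled in, the database-level value is returned; setting the query-level field overrides both.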
+
+ <sect2>
+ <title>Locale Providers</title>
+
+ <para>
+ <productname>PostgreSQL</productname> supports multiple <firstterm>locale
+ providers</firstterm>. The provider determines which library supplies the
+ locale data. One standard provider name is <literal>libc</literal>, which uses
+ the locales provided by the operating system C library. These are the
+ locales that most tools provided by the operating system use. Another
+ provider is <literal>icu</literal>, which uses the external
+ ICU<indexterm><primary>ICU</primary></indexterm> library. ICU locales can
+ only be used if support for ICU was configured when PostgreSQL was built.
+ </para>
+
+ <para>
+ The commands and tools that select the locale settings, as described
+ above, each have an option to select the locale provider. The examples
+ shown earlier all use the <literal>libc</literal> provider, which is the
+ default. Here is an example to initialize a database cluster using the
+ ICU provider:
+<programlisting>
+initdb --locale-provider=icu --icu-locale=en
+</programlisting>
+ See the descriptions of the respective commands and programs for
+ details. Note that you can mix locale providers at different
+ granularities, for example use <literal>libc</literal> by default for the
+ cluster but have one database that uses the <literal>icu</literal>
+ provider, and then have collation objects using either provider within
+ those databases.
+ </para>
+
+ <para>
+ Which locale provider to use depends on individual requirements. For most
+ basic uses, either provider will give adequate results. With the libc
+ provider, the results depend on the locale support that the operating
+ system offers, which varies in quality between operating systems. For
+ advanced uses, ICU offers more locale variants and customization options.
+ </para>
+ </sect2>
+
<sect2>
<title>Problems</title>
[ LOCALE [=] <replaceable class="parameter">locale</replaceable> ]
[ LC_COLLATE [=] <replaceable class="parameter">lc_collate</replaceable> ]
[ LC_CTYPE [=] <replaceable class="parameter">lc_ctype</replaceable> ]
+ [ ICU_LOCALE [=] <replaceable class="parameter">icu_locale</replaceable> ]
+ [ LOCALE_PROVIDER [=] <replaceable class="parameter">locale_provider</replaceable> ]
[ COLLATION_VERSION = <replaceable>collation_version</replaceable> ]
[ TABLESPACE [=] <replaceable class="parameter">tablespace_name</replaceable> ]
[ ALLOW_CONNECTIONS [=] <replaceable class="parameter">allowconn</replaceable> ]
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><replaceable class="parameter">icu_locale</replaceable></term>
+ <listitem>
+ <para>
+ Specifies the ICU locale ID if the ICU locale provider is used.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable>locale_provider</replaceable></term>
+
+ <listitem>
+ <para>
+ Specifies the provider to use for the default collation in this
+ database. Possible values are:
+ <literal>icu</literal>,<indexterm><primary>ICU</primary></indexterm>
+ <literal>libc</literal>. <literal>libc</literal> is the default. The
+ available choices depend on the operating system and build options.
+ </para>
+ </listitem>
+ </varlistentry>
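
How `LOCALE_PROVIDER`, `ICU_LOCALE`, `LOCALE`, and the template database interact can be hard to see from the parameter descriptions alone. The sketch below models the order of the checks in the `createdb()` changes of this patch as a small standalone function. It is an illustration under stated assumptions, not the backend code: `SketchDbProvider`, `DbLocaleOptions`, `ResolveStatus`, and `resolve_db_locale` are hypothetical names, and errors are returned as status codes rather than raised with `ereport`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the provider codes used in the patch. */
typedef enum
{
	DBPROV_UNSET = 0,			/* LOCALE_PROVIDER not given */
	DBPROV_LIBC = 'c',
	DBPROV_ICU = 'i'
} SketchDbProvider;

typedef enum
{
	RESOLVE_OK,
	ERR_ICU_LOCALE_WITHOUT_ICU, /* ICU_LOCALE given, provider is not ICU */
	ERR_ICU_LOCALE_REQUIRED		/* provider is ICU, no locale available */
} ResolveStatus;

typedef struct DbLocaleOptions
{
	SketchDbProvider provider;	/* LOCALE_PROVIDER, or DBPROV_UNSET */
	const char *icu_locale;		/* ICU_LOCALE, or NULL */
	const char *locale;			/* LOCALE, or NULL */
} DbLocaleOptions;

static ResolveStatus
resolve_db_locale(const DbLocaleOptions *opts,
				  SketchDbProvider template_provider,
				  const char *template_icu_locale,
				  SketchDbProvider *provider_out,
				  const char **icu_locale_out)
{
	SketchDbProvider provider = opts->provider;
	const char *icu_locale = opts->icu_locale;

	assert(template_provider != DBPROV_UNSET);

	/* ICU_LOCALE is only accepted together with LOCALE_PROVIDER = icu. */
	if (icu_locale != NULL && provider != DBPROV_ICU)
		return ERR_ICU_LOCALE_WITHOUT_ICU;

	/* With the ICU provider, LOCALE doubles as the ICU locale. */
	if (provider == DBPROV_ICU && icu_locale == NULL)
		icu_locale = opts->locale;

	/* Anything still unset is inherited from the template database. */
	if (icu_locale == NULL)
		icu_locale = template_icu_locale;
	if (provider == DBPROV_UNSET)
		provider = template_provider;

	/* E.g. a libc template combined with LOCALE_PROVIDER = icu. */
	if (provider == DBPROV_ICU && icu_locale == NULL)
		return ERR_ICU_LOCALE_REQUIRED;

	*provider_out = provider;
	/* Simplification of this sketch: report no ICU locale for libc. */
	*icu_locale_out = (provider == DBPROV_ICU) ? icu_locale : NULL;
	return RESOLVE_OK;
}
```

In this model, `LOCALE_PROVIDER = icu` with `LOCALE = 'en'` resolves to the ICU locale `en`, while `ICU_LOCALE` without an explicit ICU provider is rejected.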
+
<varlistentry>
<term><replaceable>collation_version</replaceable></term>
indexes that would be affected.
</para>
+ <para>
+ There is currently no option to use a database locale with nondeterministic
+ comparisons (see <link linkend="sql-createcollation"><command>CREATE
+ COLLATION</command></link> for an explanation). If this is needed, then
+ per-column collations would need to be used.
+ </para>
+
<para>
The <literal>CONNECTION LIMIT</literal> option is only enforced approximately;
if two new sessions start at about the same time when just one
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--icu-locale=<replaceable class="parameter">locale</replaceable></option></term>
+ <listitem>
+ <para>
+ Specifies the ICU locale ID to be used in this database, if the
+ ICU locale provider is selected.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>--locale-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+ <listitem>
+ <para>
+ Specifies the locale provider for the database's default collation.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-O <replaceable class="parameter">owner</replaceable></option></term>
<term><option>--owner=<replaceable class="parameter">owner</replaceable></option></term>
</para>
<para>
- <command>initdb</command> initializes the database cluster's default
- locale and character set encoding. The character set encoding,
- collation order (<literal>LC_COLLATE</literal>) and character set classes
- (<literal>LC_CTYPE</literal>, e.g., upper, lower, digit) can be set separately
- for a database when it is created. <command>initdb</command> determines
- those settings for the template databases, which will
- serve as the default for all other databases.
+ <command>initdb</command> initializes the database cluster's default locale
+ and character set encoding. These can also be set separately for each
+ database when it is created. <command>initdb</command> determines those
+ settings for the template databases, which will serve as the default for
+ all other databases. By default, <command>initdb</command> uses the
+ locale provider <literal>libc</literal>, takes the locale settings from
+ the environment, and determines the encoding from the locale settings.
+ This is almost always sufficient, unless there are special requirements.
</para>
<para>
- To alter the default collation order or character set classes, use the
- <option>--lc-collate</option> and <option>--lc-ctype</option> options.
- Collation orders other than <literal>C</literal> or <literal>POSIX</literal> also have
- a performance penalty. For these reasons it is important to choose the
- right locale when running <command>initdb</command>.
+ To choose a different locale for the cluster, use the option
+ <option>--locale</option>. There are also individual options
+ <option>--lc-*</option> (see below) to set values for the individual locale
+ categories. Note that inconsistent settings for different locale
+ categories can give nonsensical results, so this should be used with care.
</para>
<para>
- The remaining locale categories can be changed later when the server
- is started. You can also use <option>--locale</option> to set the
- default for all locale categories, including collation order and
- character set classes. All server locale values (<literal>lc_*</literal>) can
- be displayed via <command>SHOW ALL</command>.
- More details can be found in <xref linkend="locale"/>.
+ Alternatively, the ICU library can be used to provide locale services.
+ (Again, this only sets the default for subsequently created databases.) To
+ select this option, specify <literal>--locale-provider=icu</literal>.
+ To choose the specific ICU locale ID to apply, use the option
+ <option>--icu-locale</option>. Note that
+ for implementation reasons and to support legacy code,
+ <command>initdb</command> will still select and initialize libc locale
+ settings when the ICU locale provider is used.
+ </para>
+
+ <para>
+ When <command>initdb</command> runs, it will print out the locale settings
+ it has chosen. If you have complex requirements or specified multiple
+ options, it is advisable to check that the result matches what was
+ intended.
+ </para>
+
+ <para>
+ More details about locale settings can be found in <xref
+ linkend="locale"/>.
</para>
<para>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--icu-locale=<replaceable>locale</replaceable></option></term>
+ <listitem>
+ <para>
+ Specifies the ICU locale ID, if the ICU locale provider is used.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="app-initdb-data-checksums" xreflabel="data checksums">
<term><option>-k</option></term>
<term><option>--data-checksums</option></term>
</listitem>
</varlistentry>
+ <varlistentry>
+ <term><option>--locale-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+ <listitem>
+ <para>
+ This option sets the locale provider for databases created in the
+ new cluster. It can be overridden in the <command>CREATE
+ DATABASE</command> command when new databases are subsequently
+ created. The default is <literal>libc</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-N</option></term>
<term><option>--no-sync</option></term>
bool collisdeterministic,
int32 collencoding,
const char *collcollate, const char *collctype,
+ const char *colliculocale,
const char *collversion,
bool if_not_exists,
bool quiet)
AssertArg(collname);
AssertArg(collnamespace);
AssertArg(collowner);
- AssertArg(collcollate);
- AssertArg(collctype);
+ AssertArg((collcollate && collctype) || colliculocale);
/*
* Make sure there is no existing collation of same name & encoding.
values[Anum_pg_collation_collprovider - 1] = CharGetDatum(collprovider);
values[Anum_pg_collation_collisdeterministic - 1] = BoolGetDatum(collisdeterministic);
values[Anum_pg_collation_collencoding - 1] = Int32GetDatum(collencoding);
- values[Anum_pg_collation_collcollate - 1] = CStringGetTextDatum(collcollate);
- values[Anum_pg_collation_collctype - 1] = CStringGetTextDatum(collctype);
+ if (collcollate)
+ values[Anum_pg_collation_collcollate - 1] = CStringGetTextDatum(collcollate);
+ else
+ nulls[Anum_pg_collation_collcollate - 1] = true;
+ if (collctype)
+ values[Anum_pg_collation_collctype - 1] = CStringGetTextDatum(collctype);
+ else
+ nulls[Anum_pg_collation_collctype - 1] = true;
+ if (colliculocale)
+ values[Anum_pg_collation_colliculocale - 1] = CStringGetTextDatum(colliculocale);
+ else
+ nulls[Anum_pg_collation_colliculocale - 1] = true;
if (collversion)
values[Anum_pg_collation_collversion - 1] = CStringGetTextDatum(collversion);
else
DefElem *versionEl = NULL;
char *collcollate;
char *collctype;
+ char *colliculocale;
bool collisdeterministic;
int collencoding;
char collprovider;
else
collctype = NULL;
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_colliculocale, &isnull);
+ if (!isnull)
+ colliculocale = TextDatumGetCString(datum);
+ else
+ colliculocale = NULL;
+
ReleaseSysCache(tp);
/*
collcollate = NULL;
collctype = NULL;
-
- if (localeEl)
- {
- collcollate = defGetString(localeEl);
- collctype = defGetString(localeEl);
- }
-
- if (lccollateEl)
- collcollate = defGetString(lccollateEl);
-
- if (lcctypeEl)
- collctype = defGetString(lcctypeEl);
+ colliculocale = NULL;
if (providerEl)
collproviderstr = defGetString(providerEl);
else
collprovider = COLLPROVIDER_LIBC;
- if (!collcollate)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
- errmsg("parameter \"lc_collate\" must be specified")));
+ if (localeEl)
+ {
+ if (collprovider == COLLPROVIDER_LIBC)
+ {
+ collcollate = defGetString(localeEl);
+ collctype = defGetString(localeEl);
+ }
+ else
+ colliculocale = defGetString(localeEl);
+ }
- if (!collctype)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
- errmsg("parameter \"lc_ctype\" must be specified")));
+ if (lccollateEl)
+ collcollate = defGetString(lccollateEl);
+
+ if (lcctypeEl)
+ collctype = defGetString(lcctypeEl);
+
+ if (collprovider == COLLPROVIDER_LIBC)
+ {
+ if (!collcollate)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("parameter \"lc_collate\" must be specified")));
+
+ if (!collctype)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("parameter \"lc_ctype\" must be specified")));
+ }
+ else if (collprovider == COLLPROVIDER_ICU)
+ {
+ if (!colliculocale)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("parameter \"locale\" must be specified")));
+ }
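
The hunk above makes the meaning of `LOCALE` depend on the provider: for libc it fills both `lc_collate` and `lc_ctype`, for ICU it becomes the ICU locale. A standalone sketch of that rule follows, for readers who want to try the cases in isolation. It is not the backend code: `SketchProvider`, `CollationLocale`, and `resolve_collation_locale` are hypothetical names, and the function returns the name of the missing parameter instead of calling `ereport`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for COLLPROVIDER_LIBC and COLLPROVIDER_ICU. */
typedef enum
{
	SKETCH_LIBC = 'c',
	SKETCH_ICU = 'i'
} SketchProvider;

typedef struct CollationLocale
{
	const char *collate;		/* NULL for ICU collations */
	const char *ctype;			/* NULL for ICU collations */
	const char *iculocale;		/* NULL for libc collations */
} CollationLocale;

/*
 * Fill *out from the CREATE COLLATION options.  Returns NULL on success,
 * or the name of the parameter that must be specified but is missing.
 */
static const char *
resolve_collation_locale(SketchProvider provider,
						 const char *locale,
						 const char *lc_collate,
						 const char *lc_ctype,
						 CollationLocale *out)
{
	assert(out != NULL);

	out->collate = NULL;
	out->ctype = NULL;
	out->iculocale = NULL;

	if (locale != NULL)
	{
		if (provider == SKETCH_LIBC)
		{
			/* libc: LOCALE is shorthand for both categories. */
			out->collate = locale;
			out->ctype = locale;
		}
		else
			out->iculocale = locale;
	}

	/* Explicit per-category settings override the shorthand. */
	if (lc_collate != NULL)
		out->collate = lc_collate;
	if (lc_ctype != NULL)
		out->ctype = lc_ctype;

	if (provider == SKETCH_LIBC)
	{
		if (out->collate == NULL)
			return "lc_collate";
		if (out->ctype == NULL)
			return "lc_ctype";
	}
	else if (out->iculocale == NULL)
		return "locale";

	return NULL;
}
```

For example, an ICU collation given only `lc_collate` fails with `locale` reported as missing, which matches the new error message in the hunk.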
/*
* Nondeterministic collations are currently only supported with ICU
}
if (!collversion)
- collversion = get_collation_actual_version(collprovider, collcollate);
+ collversion = get_collation_actual_version(collprovider, collprovider == COLLPROVIDER_ICU ? colliculocale : collcollate);
newoid = CollationCreate(collName,
collNamespace,
collencoding,
collcollate,
collctype,
+ colliculocale,
collversion,
if_not_exists,
false); /* not quiet */
datum = SysCacheGetAttr(COLLOID, tup, Anum_pg_collation_collversion, &isnull);
oldversion = isnull ? NULL : TextDatumGetCString(datum);
- datum = SysCacheGetAttr(COLLOID, tup, Anum_pg_collation_collcollate, &isnull);
- Assert(!isnull);
+ datum = SysCacheGetAttr(COLLOID, tup, collForm->collprovider == COLLPROVIDER_ICU ? Anum_pg_collation_colliculocale : Anum_pg_collation_collcollate, &isnull);
+ if (isnull)
+ elog(ERROR, "unexpected null in pg_collation");
newversion = get_collation_actual_version(collForm->collprovider, TextDatumGetCString(datum));
/* cannot change from NULL to non-NULL or vice versa */
collprovider = ((Form_pg_collation) GETSTRUCT(tp))->collprovider;
- datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collcollate, &isnull);
- Assert(!isnull);
- version = get_collation_actual_version(collprovider, TextDatumGetCString(datum));
+ if (collprovider != COLLPROVIDER_DEFAULT)
+ {
+ datum = SysCacheGetAttr(COLLOID, tp, collprovider == COLLPROVIDER_ICU ? Anum_pg_collation_colliculocale : Anum_pg_collation_collcollate, &isnull);
+ if (isnull)
+ elog(ERROR, "unexpected null in pg_collation");
+ version = get_collation_actual_version(collprovider, TextDatumGetCString(datum));
+ }
+ else
+ version = NULL;
ReleaseSysCache(tp);
*/
collid = CollationCreate(localebuf, nspid, GetUserId(),
COLLPROVIDER_LIBC, true, enc,
- localebuf, localebuf,
+ localebuf, localebuf, NULL,
get_collation_actual_version(COLLPROVIDER_LIBC, localebuf),
true, true);
if (OidIsValid(collid))
collid = CollationCreate(alias, nspid, GetUserId(),
COLLPROVIDER_LIBC, true, enc,
- locale, locale,
+ locale, locale, NULL,
get_collation_actual_version(COLLPROVIDER_LIBC, locale),
true, true);
if (OidIsValid(collid))
const char *name;
char *langtag;
char *icucomment;
- const char *collcollate;
+ const char *iculocstr;
Oid collid;
if (i == -1)
name = uloc_getAvailable(i);
langtag = get_icu_language_tag(name);
- collcollate = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name;
+ iculocstr = U_ICU_VERSION_MAJOR_NUM >= 54 ? langtag : name;
/*
* Be paranoid about not allowing any non-ASCII strings into
* pg_collation
*/
- if (!pg_is_ascii(langtag) || !pg_is_ascii(collcollate))
+ if (!pg_is_ascii(langtag) || !pg_is_ascii(iculocstr))
continue;
collid = CollationCreate(psprintf("%s-x-icu", langtag),
nspid, GetUserId(),
COLLPROVIDER_ICU, true, -1,
- collcollate, collcollate,
- get_collation_actual_version(COLLPROVIDER_ICU, collcollate),
+ NULL, NULL, iculocstr,
+ get_collation_actual_version(COLLPROVIDER_ICU, iculocstr),
true, true);
if (OidIsValid(collid))
{
Oid *dbIdP, Oid *ownerIdP,
int *encodingP, bool *dbIsTemplateP, bool *dbAllowConnP,
TransactionId *dbFrozenXidP, MultiXactId *dbMinMultiP,
- Oid *dbTablespace, char **dbCollate, char **dbCtype,
+ Oid *dbTablespace, char **dbCollate, char **dbCtype, char **dbIculocale,
+ char *dbLocProvider,
char **dbCollversion);
static bool have_createdb_privilege(void);
static void remove_dbtablespaces(Oid db_id);
int src_encoding = -1;
char *src_collate = NULL;
char *src_ctype = NULL;
+ char *src_iculocale = NULL;
+ char src_locprovider;
char *src_collversion = NULL;
bool src_istemplate;
bool src_allowconn;
DefElem *dlocale = NULL;
DefElem *dcollate = NULL;
DefElem *dctype = NULL;
+ DefElem *diculocale = NULL;
+ DefElem *dlocprovider = NULL;
DefElem *distemplate = NULL;
DefElem *dallowconnections = NULL;
DefElem *dconnlimit = NULL;
const char *dbtemplate = NULL;
char *dbcollate = NULL;
char *dbctype = NULL;
+ char *dbiculocale = NULL;
+ char dblocprovider = '\0';
char *canonname;
int encoding = -1;
bool dbistemplate = false;
errorConflictingDefElem(defel, pstate);
dctype = defel;
}
+ else if (strcmp(defel->defname, "icu_locale") == 0)
+ {
+ if (diculocale)
+ errorConflictingDefElem(defel, pstate);
+ diculocale = defel;
+ }
+ else if (strcmp(defel->defname, "locale_provider") == 0)
+ {
+ if (dlocprovider)
+ errorConflictingDefElem(defel, pstate);
+ dlocprovider = defel;
+ }
else if (strcmp(defel->defname, "is_template") == 0)
{
if (distemplate)
parser_errposition(pstate, defel->location)));
}
- if (dlocale && (dcollate || dctype))
- ereport(ERROR,
- (errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("conflicting or redundant options"),
- errdetail("LOCALE cannot be specified together with LC_COLLATE or LC_CTYPE.")));
-
if (downer && downer->arg)
dbowner = defGetString(downer);
if (dtemplate && dtemplate->arg)
dbcollate = defGetString(dcollate);
if (dctype && dctype->arg)
dbctype = defGetString(dctype);
+ if (diculocale && diculocale->arg)
+ dbiculocale = defGetString(diculocale);
+ if (dlocprovider && dlocprovider->arg)
+ {
+ char *locproviderstr = defGetString(dlocprovider);
+
+ if (pg_strcasecmp(locproviderstr, "icu") == 0)
+ dblocprovider = COLLPROVIDER_ICU;
+ else if (pg_strcasecmp(locproviderstr, "libc") == 0)
+ dblocprovider = COLLPROVIDER_LIBC;
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("unrecognized locale provider: %s",
+ locproviderstr)));
+ }
+ if (diculocale && dblocprovider != COLLPROVIDER_ICU)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+ errmsg("ICU locale cannot be specified unless locale provider is ICU")));
+ if (dblocprovider == COLLPROVIDER_ICU && !dbiculocale)
+ {
+ if (dlocale && dlocale->arg)
+ dbiculocale = defGetString(dlocale);
+ }
if (distemplate && distemplate->arg)
dbistemplate = defGetBoolean(distemplate);
if (dallowconnections && dallowconnections->arg)
&src_dboid, &src_owner, &src_encoding,
&src_istemplate, &src_allowconn,
&src_frozenxid, &src_minmxid, &src_deftablespace,
- &src_collate, &src_ctype, &src_collversion))
+ &src_collate, &src_ctype, &src_iculocale, &src_locprovider,
+ &src_collversion))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_DATABASE),
errmsg("template database \"%s\" does not exist",
dbcollate = src_collate;
if (dbctype == NULL)
dbctype = src_ctype;
+ if (dbiculocale == NULL)
+ dbiculocale = src_iculocale;
+ if (dblocprovider == '\0')
+ dblocprovider = src_locprovider;
/* Some encodings are client only */
if (!PG_VALID_BE_ENCODING(encoding))
check_encoding_locale_matches(encoding, dbcollate, dbctype);
+ if (dblocprovider == COLLPROVIDER_ICU)
+ {
+ /*
+ * This would happen if template0 uses the libc provider but the new
+ * database uses icu.
+ */
+ if (!dbiculocale)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("ICU locale must be specified")));
+ }
+
+ if (dblocprovider == COLLPROVIDER_ICU)
+ {
+#ifdef USE_ICU
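+ UCollator *collator;
+ UErrorCode status;
+
+ status = U_ZERO_ERROR;
+ collator = ucol_open(dbiculocale, &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("could not open collator for locale \"%s\": %s",
+ dbiculocale, u_errorName(status))));
+ /* the collator was opened only to validate the locale; release it */
+ ucol_close(collator);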
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using %s.", "--with-icu")));
+#endif
+ }
+
/*
* Check that the new encoding and locale settings match the source
* database. We insist on this because we simply copy the source data ---
errmsg("new LC_CTYPE (%s) is incompatible with the LC_CTYPE of the template database (%s)",
dbctype, src_ctype),
errhint("Use the same LC_CTYPE as in the template database, or use template0 as template.")));
+
+ if (dblocprovider != src_locprovider)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("new locale provider (%s) does not match locale provider of the template database (%s)",
+ collprovider_name(dblocprovider), collprovider_name(src_locprovider)),
+ errhint("Use the same locale provider as in the template database, or use template0 as template.")));
+
+ if (dblocprovider == COLLPROVIDER_ICU)
+ {
+ Assert(dbiculocale);
+ Assert(src_iculocale);
+ if (strcmp(dbiculocale, src_iculocale) != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("new ICU locale (%s) is incompatible with the ICU locale of the template database (%s)",
+ dbiculocale, src_iculocale),
+ errhint("Use the same ICU locale as in the template database, or use template0 as template.")));
+ }
}
/*
{
char *actual_versionstr;
- actual_versionstr = get_collation_actual_version(COLLPROVIDER_LIBC, dbcollate);
+ actual_versionstr = get_collation_actual_version(dblocprovider, dblocprovider == COLLPROVIDER_ICU ? dbiculocale : dbcollate);
if (!actual_versionstr)
ereport(ERROR,
(errmsg("template database \"%s\" has a collation version, but no actual collation version could be determined",
* collation version, which is normally only the case for template0.
*/
if (dbcollversion == NULL)
- dbcollversion = get_collation_actual_version(COLLPROVIDER_LIBC, dbcollate);
+ dbcollversion = get_collation_actual_version(dblocprovider, dblocprovider == COLLPROVIDER_ICU ? dbiculocale : dbcollate);
/* Resolve default tablespace for new database */
if (dtablespacename && dtablespacename->arg)
* block on the unique index, and fail after we commit).
*/
+ Assert((dblocprovider == COLLPROVIDER_ICU && dbiculocale) ||
+ (dblocprovider != COLLPROVIDER_ICU && !dbiculocale));
+
/* Form tuple */
MemSet(new_record, 0, sizeof(new_record));
MemSet(new_record_nulls, false, sizeof(new_record_nulls));
DirectFunctionCall1(namein, CStringGetDatum(dbname));
new_record[Anum_pg_database_datdba - 1] = ObjectIdGetDatum(datdba);
new_record[Anum_pg_database_encoding - 1] = Int32GetDatum(encoding);
+ new_record[Anum_pg_database_datlocprovider - 1] = CharGetDatum(dblocprovider);
new_record[Anum_pg_database_datistemplate - 1] = BoolGetDatum(dbistemplate);
new_record[Anum_pg_database_datallowconn - 1] = BoolGetDatum(dballowconnections);
new_record[Anum_pg_database_datconnlimit - 1] = Int32GetDatum(dbconnlimit);
new_record[Anum_pg_database_dattablespace - 1] = ObjectIdGetDatum(dst_deftablespace);
new_record[Anum_pg_database_datcollate - 1] = CStringGetTextDatum(dbcollate);
new_record[Anum_pg_database_datctype - 1] = CStringGetTextDatum(dbctype);
+ if (dbiculocale)
+ new_record[Anum_pg_database_daticulocale - 1] = CStringGetTextDatum(dbiculocale);
+ else
+ new_record_nulls[Anum_pg_database_daticulocale - 1] = true;
if (dbcollversion)
new_record[Anum_pg_database_datcollversion - 1] = CStringGetTextDatum(dbcollversion);
else
pgdbrel = table_open(DatabaseRelationId, RowExclusiveLock);
if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL,
- &db_istemplate, NULL, NULL, NULL, NULL, NULL, NULL, NULL))
+ &db_istemplate, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL))
{
if (!missing_ok)
{
rel = table_open(DatabaseRelationId, RowExclusiveLock);
if (!get_db_info(oldname, AccessExclusiveLock, &db_id, NULL, NULL,
- NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL))
+ NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_DATABASE),
errmsg("database \"%s\" does not exist", oldname)));
pgdbrel = table_open(DatabaseRelationId, RowExclusiveLock);
if (!get_db_info(dbname, AccessExclusiveLock, &db_id, NULL, NULL,
- NULL, NULL, NULL, NULL, &src_tblspcoid, NULL, NULL, NULL))
+ NULL, NULL, NULL, NULL, &src_tblspcoid, NULL, NULL, NULL, NULL, NULL))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_DATABASE),
errmsg("database \"%s\" does not exist", dbname)));
datum = heap_getattr(tuple, Anum_pg_database_datcollversion, RelationGetDescr(rel), &isnull);
oldversion = isnull ? NULL : TextDatumGetCString(datum);
- datum = heap_getattr(tuple, Anum_pg_database_datcollate, RelationGetDescr(rel), &isnull);
- Assert(!isnull);
- newversion = get_collation_actual_version(COLLPROVIDER_LIBC, TextDatumGetCString(datum));
+ datum = heap_getattr(tuple, datForm->datlocprovider == COLLPROVIDER_ICU ? Anum_pg_database_daticulocale : Anum_pg_database_datcollate, RelationGetDescr(rel), &isnull);
+ if (isnull)
+ elog(ERROR, "unexpected null in pg_database");
+ newversion = get_collation_actual_version(datForm->datlocprovider, TextDatumGetCString(datum));
/* cannot change from NULL to non-NULL or vice versa */
if ((!oldversion && newversion) || (oldversion && !newversion))
{
Oid dbid = PG_GETARG_OID(0);
HeapTuple tp;
+ char datlocprovider;
Datum datum;
bool isnull;
char *version;
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("database with OID %u does not exist", dbid)));
- datum = SysCacheGetAttr(DATABASEOID, tp, Anum_pg_database_datcollate, &isnull);
- Assert(!isnull);
- version = get_collation_actual_version(COLLPROVIDER_LIBC, TextDatumGetCString(datum));
+ datlocprovider = ((Form_pg_database) GETSTRUCT(tp))->datlocprovider;
+
+ datum = SysCacheGetAttr(DATABASEOID, tp, datlocprovider == COLLPROVIDER_ICU ? Anum_pg_database_daticulocale : Anum_pg_database_datcollate, &isnull);
+ if (isnull)
+ elog(ERROR, "unexpected null in pg_database");
+ version = get_collation_actual_version(datlocprovider, TextDatumGetCString(datum));
ReleaseSysCache(tp);
Oid *dbIdP, Oid *ownerIdP,
int *encodingP, bool *dbIsTemplateP, bool *dbAllowConnP,
TransactionId *dbFrozenXidP, MultiXactId *dbMinMultiP,
- Oid *dbTablespace, char **dbCollate, char **dbCtype,
+ Oid *dbTablespace, char **dbCollate, char **dbCtype, char **dbIculocale,
+ char *dbLocProvider,
char **dbCollversion)
{
bool result = false;
if (dbTablespace)
*dbTablespace = dbform->dattablespace;
/* default locale settings for this database */
+ if (dbLocProvider)
+ *dbLocProvider = dbform->datlocprovider;
if (dbCollate)
{
datum = SysCacheGetAttr(DATABASEOID, tuple, Anum_pg_database_datcollate, &isnull);
Assert(!isnull);
*dbCtype = TextDatumGetCString(datum);
}
+ if (dbIculocale)
+ {
+ datum = SysCacheGetAttr(DATABASEOID, tuple, Anum_pg_database_daticulocale, &isnull);
+ if (isnull)
+ *dbIculocale = NULL;
+ else
+ *dbIculocale = TextDatumGetCString(datum);
+ }
if (dbCollversion)
{
datum = SysCacheGetAttr(DATABASEOID, tuple, Anum_pg_database_datcollversion, &isnull);
{
/* Attempt to set the flags */
HeapTuple tp;
- Datum datum;
- bool isnull;
- const char *collcollate;
- const char *collctype;
+ Form_pg_collation collform;
tp = SearchSysCache1(COLLOID, ObjectIdGetDatum(collation));
if (!HeapTupleIsValid(tp))
elog(ERROR, "cache lookup failed for collation %u", collation);
+ collform = (Form_pg_collation) GETSTRUCT(tp);
- datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collcollate, &isnull);
- Assert(!isnull);
- collcollate = TextDatumGetCString(datum);
- datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collctype, &isnull);
- Assert(!isnull);
- collctype = TextDatumGetCString(datum);
-
- cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
- (strcmp(collcollate, "POSIX") == 0));
- cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
- (strcmp(collctype, "POSIX") == 0));
+ if (collform->collprovider == COLLPROVIDER_LIBC)
+ {
+ Datum datum;
+ bool isnull;
+ const char *collcollate;
+ const char *collctype;
+
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collcollate, &isnull);
+ Assert(!isnull);
+ collcollate = TextDatumGetCString(datum);
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collctype, &isnull);
+ Assert(!isnull);
+ collctype = TextDatumGetCString(datum);
+
+ cache_entry->collate_is_c = ((strcmp(collcollate, "C") == 0) ||
+ (strcmp(collcollate, "POSIX") == 0));
+ cache_entry->ctype_is_c = ((strcmp(collctype, "C") == 0) ||
+ (strcmp(collctype, "POSIX") == 0));
+ }
+ else
+ {
+ cache_entry->collate_is_c = false;
+ cache_entry->ctype_is_c = false;
+ }
cache_entry->flags_valid = true;
static int result = -1;
char *localeptr;
+ if (default_locale.provider == COLLPROVIDER_ICU)
+ return false;
+
if (result >= 0)
return (bool) result;
localeptr = setlocale(LC_COLLATE, NULL);
static int result = -1;
char *localeptr;
+ if (default_locale.provider == COLLPROVIDER_ICU)
+ return false;
+
if (result >= 0)
return (bool) result;
localeptr = setlocale(LC_CTYPE, NULL);
return (lookup_collation_cache(collation, true))->ctype_is_c;
}
+struct pg_locale_struct default_locale;
+
+void
+make_icu_collator(const char *iculocstr,
+ struct pg_locale_struct *resultp)
+{
+#ifdef USE_ICU
+ UCollator *collator;
+ UErrorCode status;
+
+ status = U_ZERO_ERROR;
+ collator = ucol_open(iculocstr, &status);
+ if (U_FAILURE(status))
+ ereport(ERROR,
+ (errmsg("could not open collator for locale \"%s\": %s",
+ iculocstr, u_errorName(status))));
+
+ if (U_ICU_VERSION_MAJOR_NUM < 54)
+ icu_set_collation_attributes(collator, iculocstr);
+
+ /* We will leak this string if the caller errors later :-( */
+ resultp->info.icu.locale = MemoryContextStrdup(TopMemoryContext, iculocstr);
+ resultp->info.icu.ucol = collator;
+#else /* not USE_ICU */
+ /* could get here if a collation was created by a build with ICU */
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ICU is not supported in this build"),
+ errhint("You need to rebuild PostgreSQL using %s.", "--with-icu")));
+#endif /* not USE_ICU */
+}
+
/* simple subroutine for reporting errors from newlocale() */
#ifdef HAVE_LOCALE_T
Assert(OidIsValid(collid));
if (collid == DEFAULT_COLLATION_OID)
- return (pg_locale_t) 0;
+ {
+ if (default_locale.provider == COLLPROVIDER_ICU)
+ return &default_locale;
+ else
+ return (pg_locale_t) 0;
+ }
cache_entry = lookup_collation_cache(collid, false);
/* We haven't computed this yet in this session, so do it */
HeapTuple tp;
Form_pg_collation collform;
- const char *collcollate;
- const char *collctype pg_attribute_unused();
struct pg_locale_struct result;
pg_locale_t resultp;
Datum datum;
elog(ERROR, "cache lookup failed for collation %u", collid);
collform = (Form_pg_collation) GETSTRUCT(tp);
- datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collcollate, &isnull);
- Assert(!isnull);
- collcollate = TextDatumGetCString(datum);
- datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collctype, &isnull);
- Assert(!isnull);
- collctype = TextDatumGetCString(datum);
-
/* We'll fill in the result struct locally before allocating memory */
memset(&result, 0, sizeof(result));
result.provider = collform->collprovider;
if (collform->collprovider == COLLPROVIDER_LIBC)
{
#ifdef HAVE_LOCALE_T
+ const char *collcollate;
+ const char *collctype pg_attribute_unused();
locale_t loc;
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collcollate, &isnull);
+ Assert(!isnull);
+ collcollate = TextDatumGetCString(datum);
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collctype, &isnull);
+ Assert(!isnull);
+ collctype = TextDatumGetCString(datum);
+
if (strcmp(collcollate, collctype) == 0)
{
/* Normal case where they're the same */
}
else if (collform->collprovider == COLLPROVIDER_ICU)
{
-#ifdef USE_ICU
- UCollator *collator;
- UErrorCode status;
-
- if (strcmp(collcollate, collctype) != 0)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("collations with different collate and ctype values are not supported by ICU")));
-
- status = U_ZERO_ERROR;
- collator = ucol_open(collcollate, &status);
- if (U_FAILURE(status))
- ereport(ERROR,
- (errmsg("could not open collator for locale \"%s\": %s",
- collcollate, u_errorName(status))));
+ const char *iculocstr;
- if (U_ICU_VERSION_MAJOR_NUM < 54)
- icu_set_collation_attributes(collator, collcollate);
-
- /* We will leak this string if we get an error below :-( */
- result.info.icu.locale = MemoryContextStrdup(TopMemoryContext,
- collcollate);
- result.info.icu.ucol = collator;
-#else /* not USE_ICU */
- /* could get here if a collation was created by a build with ICU */
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("ICU is not supported in this build"), \
- errhint("You need to rebuild PostgreSQL using %s.", "--with-icu")));
-#endif /* not USE_ICU */
+ datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_colliculocale, &isnull);
+ Assert(!isnull);
+ iculocstr = TextDatumGetCString(datum);
+ make_icu_collator(iculocstr, &result);
}
datum = SysCacheGetAttr(COLLOID, tp, Anum_pg_collation_collversion,
collversionstr = TextDatumGetCString(datum);
- actual_versionstr = get_collation_actual_version(collform->collprovider, collcollate);
+ datum = SysCacheGetAttr(COLLOID, tp, collform->collprovider == COLLPROVIDER_ICU ? Anum_pg_collation_colliculocale : Anum_pg_collation_collcollate, &isnull);
+ Assert(!isnull);
+
+ actual_versionstr = get_collation_actual_version(collform->collprovider,
+ TextDatumGetCString(datum));
if (!actual_versionstr)
{
/*
bool isnull;
char *collate;
char *ctype;
+ char *iculocale;
/* Fetch our pg_database row normally, via syscache */
tup = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId));
" which is not recognized by setlocale().", ctype),
errhint("Recreate the database with another locale or install the missing locale.")));
+ if (dbform->datlocprovider == COLLPROVIDER_ICU)
+ {
+ datum = SysCacheGetAttr(DATABASEOID, tup, Anum_pg_database_daticulocale, &isnull);
+ Assert(!isnull);
+ iculocale = TextDatumGetCString(datum);
+ make_icu_collator(iculocale, &default_locale);
+ }
+ else
+ iculocale = NULL;
+
+ default_locale.provider = dbform->datlocprovider;
+ /*
+ * Default locale is currently always deterministic. Nondeterministic
+ * locales currently don't support pattern matching, which would break a
+ * lot of things if applied globally.
+ */
+ default_locale.deterministic = true;
+
/*
* Check collation version. See similar code in
* pg_newlocale_from_collation(). Note that here we warn instead of error
collversionstr = TextDatumGetCString(datum);
- actual_versionstr = get_collation_actual_version(COLLPROVIDER_LIBC, collate);
+ actual_versionstr = get_collation_actual_version(dbform->datlocprovider, dbform->datlocprovider == COLLPROVIDER_ICU ? iculocale : collate);
if (!actual_versionstr)
ereport(WARNING,
(errmsg("database \"%s\" has no actual collation version, but a version was recorded",
all: initdb
initdb: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils
- $(CC) $(CFLAGS) $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
+ $(CC) $(CFLAGS) $(OBJS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) $(ICU_LIBS) -o $@$(X)
# We must pull in localtime.c from src/timezones
localtime.c: % : $(top_srcdir)/src/timezone/%
# ensure that changes in datadir propagate into object file
initdb.o: initdb.c $(top_builddir)/src/Makefile.global
+export with_icu
+
check:
$(prove_check)
#include <signal.h>
#include <time.h>
+#ifdef USE_ICU
+#include <unicode/ucol.h>
+#endif
+
#ifdef HAVE_SHM_OPEN
#include "sys/mman.h"
#endif
static char *lc_numeric = NULL;
static char *lc_time = NULL;
static char *lc_messages = NULL;
+static char locale_provider = COLLPROVIDER_LIBC;
+static char *icu_locale = NULL;
static const char *default_text_search_config = NULL;
static char *username = NULL;
static bool pwprompt = false;
bki_lines = replace_token(bki_lines, "LC_CTYPE",
escape_quotes_bki(lc_ctype));
+ bki_lines = replace_token(bki_lines, "ICU_LOCALE",
+ locale_provider == COLLPROVIDER_ICU ? escape_quotes_bki(icu_locale) : "_null_");
+
+ sprintf(buf, "%c", locale_provider);
+ bki_lines = replace_token(bki_lines, "LOCALE_PROVIDER", buf);
+
/* Also ensure backend isn't confused by this environment var: */
unsetenv("PGCLIENTENCODING");
* canonicalize locale names, and obtain any missing values from our
* current environment
*/
-
check_locale_name(LC_CTYPE, lc_ctype, &canonname);
lc_ctype = canonname;
check_locale_name(LC_COLLATE, lc_collate, &canonname);
check_locale_name(LC_CTYPE, lc_messages, &canonname);
lc_messages = canonname;
#endif
+
+ if (locale_provider == COLLPROVIDER_ICU)
+ {
+ if (!icu_locale)
+ {
+ pg_log_error("ICU locale must be specified");
+ exit(1);
+ }
+
+ /*
+ * Check ICU locale ID
+ */
+#ifdef USE_ICU
+ {
+ UErrorCode status;
+
+ status = U_ZERO_ERROR;
+ ucol_open(icu_locale, &status);
+ if (U_FAILURE(status))
+ {
+ pg_log_error("could not open collator for locale \"%s\": %s",
+ icu_locale, u_errorName(status));
+ exit(1);
+ }
+ }
+#else
+ pg_log_error("ICU is not supported in this build");
+ fprintf(stderr, _("You need to rebuild PostgreSQL using %s.\n"), "--with-icu");
+ exit(1);
+#endif
+ }
}
/*
printf(_(" [-D, --pgdata=]DATADIR location for this database cluster\n"));
printf(_(" -E, --encoding=ENCODING set default encoding for new databases\n"));
printf(_(" -g, --allow-group-access allow group read/execute on data directory\n"));
+ printf(_(" --icu-locale=LOCALE set ICU locale ID for new databases\n"));
printf(_(" -k, --data-checksums use data page checksums\n"));
printf(_(" --locale=LOCALE set default locale for new databases\n"));
printf(_(" --lc-collate=, --lc-ctype=, --lc-messages=LOCALE\n"
" set default locale in the respective category for\n"
" new databases (default taken from environment)\n"));
printf(_(" --no-locale equivalent to --locale=C\n"));
+ printf(_(" --locale-provider={libc|icu}\n"
+ " set default locale provider for new databases\n"));
printf(_(" --pwfile=FILE read password for the new superuser from file\n"));
printf(_(" -T, --text-search-config=CFG\n"
" default text search configuration\n"));
{
setlocales();
- if (strcmp(lc_ctype, lc_collate) == 0 &&
+ if (locale_provider == COLLPROVIDER_LIBC &&
+ strcmp(lc_ctype, lc_collate) == 0 &&
strcmp(lc_ctype, lc_time) == 0 &&
strcmp(lc_ctype, lc_numeric) == 0 &&
strcmp(lc_ctype, lc_monetary) == 0 &&
- strcmp(lc_ctype, lc_messages) == 0)
+ strcmp(lc_ctype, lc_messages) == 0 &&
+ (!icu_locale || strcmp(lc_ctype, icu_locale) == 0))
printf(_("The database cluster will be initialized with locale \"%s\".\n"), lc_ctype);
else
{
- printf(_("The database cluster will be initialized with locales\n"
- " COLLATE: %s\n"
- " CTYPE: %s\n"
- " MESSAGES: %s\n"
- " MONETARY: %s\n"
- " NUMERIC: %s\n"
- " TIME: %s\n"),
+ printf(_("The database cluster will be initialized with this locale configuration:\n"));
+ printf(_(" provider: %s\n"), collprovider_name(locale_provider));
+ if (icu_locale)
+ printf(_(" ICU locale: %s\n"), icu_locale);
+ printf(_(" LC_COLLATE: %s\n"
+ " LC_CTYPE: %s\n"
+ " LC_MESSAGES: %s\n"
+ " LC_MONETARY: %s\n"
+ " LC_NUMERIC: %s\n"
+ " LC_TIME: %s\n"),
lc_collate,
lc_ctype,
lc_messages,
lc_time);
}
- if (!encoding)
+ if (!encoding && locale_provider == COLLPROVIDER_ICU)
+ encodingid = PG_UTF8;
+ else if (!encoding)
{
int ctype_enc;
{"data-checksums", no_argument, NULL, 'k'},
{"allow-group-access", no_argument, NULL, 'g'},
{"discard-caches", no_argument, NULL, 14},
+ {"locale-provider", required_argument, NULL, 15},
+ {"icu-locale", required_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
extra_options,
"-c debug_discard_caches=1");
break;
+ case 15:
+ if (strcmp(optarg, "icu") == 0)
+ locale_provider = COLLPROVIDER_ICU;
+ else if (strcmp(optarg, "libc") == 0)
+ locale_provider = COLLPROVIDER_LIBC;
+ else
+ {
+ pg_log_error("unrecognized locale provider: %s", optarg);
+ exit(1);
+ }
+ break;
+ case 16:
+ icu_locale = pg_strdup(optarg);
+ break;
default:
/* getopt_long already emitted a complaint */
fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
exit(1);
}
+ if (icu_locale && locale_provider != COLLPROVIDER_ICU)
+ {
+ pg_log_error("%s cannot be specified unless locale provider \"%s\" is chosen",
+ "--icu-locale", "icu");
+ exit(1);
+ }
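Note: the option handling above can be read as two steps, parsing the provider name and then cross-checking it against --icu-locale. The following is a minimal standalone sketch of that logic, not the initdb code itself. The helper names parse_locale_provider and check_locale_options are hypothetical. The second check mirrors the "ICU locale must be specified" error that this patch raises later, in setlocales().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COLLPROVIDER_ICU  'i'
#define COLLPROVIDER_LIBC 'c'

/*
 * Hypothetical helper: map a --locale-provider argument to its
 * one-character code.  Returns 0 on success, -1 if the name is unknown.
 */
static int
parse_locale_provider(const char *arg, char *provider)
{
	assert(arg != NULL && provider != NULL);

	if (strcmp(arg, "icu") == 0)
		*provider = COLLPROVIDER_ICU;
	else if (strcmp(arg, "libc") == 0)
		*provider = COLLPROVIDER_LIBC;
	else
		return -1;
	return 0;
}

/*
 * Hypothetical helper: cross-check the two options.  Returns NULL when the
 * combination is acceptable, otherwise a static error message.
 */
static const char *
check_locale_options(char provider, const char *icu_locale)
{
	/* an ICU locale only makes sense with the ICU provider */
	if (icu_locale != NULL && provider != COLLPROVIDER_ICU)
		return "--icu-locale cannot be specified unless locale provider \"icu\" is chosen";
	/* the ICU provider has no default locale to fall back on */
	if (provider == COLLPROVIDER_ICU && icu_locale == NULL)
		return "ICU locale must be specified";
	return NULL;
}
```

Returning a message instead of exiting keeps the sketch easy to test. The patch itself reports the error with pg_log_error() and calls exit(1).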
+
atexit(cleanup_directories_atexit);
/* If we only need to fsync, just do it and exit */
'check PGDATA permissions');
}
+# Locale provider tests
+
+if ($ENV{with_icu} eq 'yes')
+{
+ command_fails_like(['initdb', '--no-sync', '--locale-provider=icu', "$tempdir/data2"],
+ qr/initdb: error: ICU locale must be specified/,
+ 'locale provider ICU requires --icu-locale');
+
+ command_ok(['initdb', '--no-sync', '--locale-provider=icu', '--icu-locale=en', "$tempdir/data3"],
+ 'option --icu-locale');
+
+ command_fails_like(['initdb', '--no-sync', '--locale-provider=icu', '--icu-locale=@colNumeric=lower', "$tempdir/dataX"],
+ qr/initdb: error: could not open collator for locale/,
+ 'fails for invalid ICU locale');
+}
+else
+{
+ command_fails(['initdb', '--no-sync', '--locale-provider=icu', "$tempdir/data2"],
+ 'locale provider ICU fails since no ICU support');
+}
+
+command_fails(['initdb', '--no-sync', '--locale-provider=xyz', "$tempdir/dataX"],
+ 'fails for invalid locale provider');
+
+command_fails(['initdb', '--no-sync', '--locale-provider=libc', '--icu-locale=en', "$tempdir/dataX"],
+ 'fails for invalid option combination');
+
done_testing();
i_datname,
i_datdba,
i_encoding,
+ i_datlocprovider,
i_collate,
i_ctype,
+ i_daticulocale,
i_frozenxid,
i_minmxid,
i_datacl,
const char *datname,
*dba,
*encoding,
+ *datlocprovider,
*collate,
*ctype,
+ *iculocale,
*datistemplate,
*datconnlimit,
*tablespace;
else
appendPQExpBuffer(dbQry, "0 AS datminmxid, ");
if (fout->remoteVersion >= 150000)
- appendPQExpBuffer(dbQry, "datcollversion, ");
+ appendPQExpBuffer(dbQry, "datlocprovider, daticulocale, datcollversion, ");
else
- appendPQExpBuffer(dbQry, "NULL AS datcollversion, ");
+ appendPQExpBuffer(dbQry, "'c' AS datlocprovider, NULL AS daticulocale, NULL AS datcollversion, ");
appendPQExpBuffer(dbQry,
"(SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace, "
"shobj_description(oid, 'pg_database') AS description "
i_datname = PQfnumber(res, "datname");
i_datdba = PQfnumber(res, "datdba");
i_encoding = PQfnumber(res, "encoding");
+ i_datlocprovider = PQfnumber(res, "datlocprovider");
i_collate = PQfnumber(res, "datcollate");
i_ctype = PQfnumber(res, "datctype");
+ i_daticulocale = PQfnumber(res, "daticulocale");
i_frozenxid = PQfnumber(res, "datfrozenxid");
i_minmxid = PQfnumber(res, "datminmxid");
i_datacl = PQfnumber(res, "datacl");
datname = PQgetvalue(res, 0, i_datname);
dba = getRoleName(PQgetvalue(res, 0, i_datdba));
encoding = PQgetvalue(res, 0, i_encoding);
+ datlocprovider = PQgetvalue(res, 0, i_datlocprovider);
collate = PQgetvalue(res, 0, i_collate);
ctype = PQgetvalue(res, 0, i_ctype);
+ if (!PQgetisnull(res, 0, i_daticulocale))
+ iculocale = PQgetvalue(res, 0, i_daticulocale);
+ else
+ iculocale = NULL;
frozenxid = atooid(PQgetvalue(res, 0, i_frozenxid));
minmxid = atooid(PQgetvalue(res, 0, i_minmxid));
dbdacl.acl = PQgetvalue(res, 0, i_datacl);
appendPQExpBufferStr(creaQry, " ENCODING = ");
appendStringLiteralAH(creaQry, encoding, fout);
}
+
+ appendPQExpBufferStr(creaQry, " LOCALE_PROVIDER = ");
+ if (datlocprovider[0] == 'c')
+ appendPQExpBufferStr(creaQry, "libc");
+ else if (datlocprovider[0] == 'i')
+ appendPQExpBufferStr(creaQry, "icu");
+ else
+ fatal("unrecognized locale provider: %s",
+ datlocprovider);
+
if (strlen(collate) > 0 && strcmp(collate, ctype) == 0)
{
appendPQExpBufferStr(creaQry, " LOCALE = ");
appendStringLiteralAH(creaQry, ctype, fout);
}
}
+ if (iculocale)
+ {
+ appendPQExpBufferStr(creaQry, " ICU_LOCALE = ");
+ appendStringLiteralAH(creaQry, iculocale, fout);
+ }
/*
* For binary upgrade, carry over the collation version. For normal
#include "postgres_fe.h"
#include "catalog/pg_authid_d.h"
+#include "catalog/pg_collation.h"
#include "fe_utils/string_utils.h"
#include "mb/pg_wchar.h"
#include "pg_upgrade.h"
if (!equivalent_locale(LC_CTYPE, olddb->db_ctype, newdb->db_ctype))
pg_fatal("lc_ctype values for database \"%s\" do not match: old \"%s\", new \"%s\"\n",
olddb->db_name, olddb->db_ctype, newdb->db_ctype);
+ if (olddb->db_collprovider != newdb->db_collprovider)
+ pg_fatal("locale providers for database \"%s\" do not match: old \"%s\", new \"%s\"\n",
+ olddb->db_name,
+ collprovider_name(olddb->db_collprovider),
+ collprovider_name(newdb->db_collprovider));
+ if ((olddb->db_iculocale == NULL && newdb->db_iculocale != NULL) ||
+ (olddb->db_iculocale != NULL && newdb->db_iculocale == NULL) ||
+ (olddb->db_iculocale != NULL && newdb->db_iculocale != NULL && strcmp(olddb->db_iculocale, newdb->db_iculocale) != 0))
+ pg_fatal("ICU locale values for database \"%s\" do not match: old \"%s\", new \"%s\"\n",
+ olddb->db_name,
+ olddb->db_iculocale ? olddb->db_iculocale : "(null)",
+ newdb->db_iculocale ? newdb->db_iculocale : "(null)");
}
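Note: the three-way condition on db_iculocale above is a NULL-safe string comparison. Two values match when both are NULL, or when both are set and compare equal. A minimal sketch of that idea follows, assuming a hypothetical helper named nullable_str_equal that is not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compare two strings that may each be NULL.
 * NULL only equals NULL; two non-NULL strings are compared with strcmp().
 */
static bool
nullable_str_equal(const char *a, const char *b)
{
	if (a == NULL || b == NULL)
		return a == b;		/* equal only if both are NULL */
	return strcmp(a, b) == 0;
}
```

With such a helper, the mismatch test would reduce to !nullable_str_equal(olddb->db_iculocale, newdb->db_iculocale). Whether that reads better than the explicit form is a matter of taste.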
/*
i_encoding,
i_datcollate,
i_datctype,
+ i_datlocprovider,
+ i_daticulocale,
i_spclocation;
char query[QUERY_ALLOC];
snprintf(query, sizeof(query),
- "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, "
+ "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, ");
+ if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
+ snprintf(query + strlen(query), sizeof(query) - strlen(query),
+ "'c' AS datlocprovider, NULL AS daticulocale, ");
+ else
+ snprintf(query + strlen(query), sizeof(query) - strlen(query),
+ "datlocprovider, daticulocale, ");
+ snprintf(query + strlen(query), sizeof(query) - strlen(query),
"pg_catalog.pg_tablespace_location(t.oid) AS spclocation "
"FROM pg_catalog.pg_database d "
" LEFT OUTER JOIN pg_catalog.pg_tablespace t "
i_encoding = PQfnumber(res, "encoding");
i_datcollate = PQfnumber(res, "datcollate");
i_datctype = PQfnumber(res, "datctype");
+ i_datlocprovider = PQfnumber(res, "datlocprovider");
+ i_daticulocale = PQfnumber(res, "daticulocale");
i_spclocation = PQfnumber(res, "spclocation");
ntups = PQntuples(res);
dbinfos[tupnum].db_encoding = atoi(PQgetvalue(res, tupnum, i_encoding));
dbinfos[tupnum].db_collate = pg_strdup(PQgetvalue(res, tupnum, i_datcollate));
dbinfos[tupnum].db_ctype = pg_strdup(PQgetvalue(res, tupnum, i_datctype));
+ dbinfos[tupnum].db_collprovider = PQgetvalue(res, tupnum, i_datlocprovider)[0];
+ if (PQgetisnull(res, tupnum, i_daticulocale))
+ dbinfos[tupnum].db_iculocale = NULL;
+ else
+ dbinfos[tupnum].db_iculocale = pg_strdup(PQgetvalue(res, tupnum, i_daticulocale));
snprintf(dbinfos[tupnum].db_tablespace, sizeof(dbinfos[tupnum].db_tablespace), "%s",
PQgetvalue(res, tupnum, i_spclocation));
}
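Note: the database query above is assembled in pieces so that older servers, which lack datlocprovider and daticulocale, get constant placeholders instead. A minimal standalone sketch of that pattern follows. It assumes the columns exist from major version 15 on, which is what the pg_dump and psql hunks in this patch assume with their >= 150000 checks. The helpers append_str and build_db_query are hypothetical, and the query is simplified.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append s to a NUL-terminated fixed-size buffer. */
static void
append_str(char *buf, size_t bufsize, const char *s)
{
	size_t		used;

	assert(buf != NULL && s != NULL);
	used = strlen(buf);
	if (used < bufsize)
		snprintf(buf + used, bufsize - used, "%s", s);	/* truncates safely */
}

/*
 * Hypothetical, simplified query builder.  major_version uses the
 * pg_upgrade convention, e.g. 1400 for version 14 and 1500 for version 15.
 */
static void
build_db_query(char *buf, size_t bufsize, int major_version)
{
	assert(buf != NULL && bufsize > 0);
	buf[0] = '\0';

	append_str(buf, bufsize,
			   "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, ");
	if (major_version < 1500)
		/* old servers: everything is libc, and there is no ICU locale */
		append_str(buf, bufsize,
				   "'c' AS datlocprovider, NULL AS daticulocale ");
	else
		append_str(buf, bufsize,
				   "d.datlocprovider, d.daticulocale ");
	append_str(buf, bufsize, "FROM pg_catalog.pg_database d");
}
```

Because both branches produce columns with the same names, the code that reads the result with PQfnumber() does not need to know which server version it is talking to.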
* path */
char *db_collate;
char *db_ctype;
+ char db_collprovider;
+ char *db_iculocale;
int db_encoding;
RelInfoArr rel_arr; /* array of all user relinfos */
} DbInfo;
gettext_noop("Encoding"),
gettext_noop("Collate"),
gettext_noop("Ctype"));
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ " d.daticulocale as \"%s\",\n"
+ " CASE d.datlocprovider WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\",\n",
+ gettext_noop("ICU Locale"),
+ gettext_noop("Locale Provider"));
+ else
+ appendPQExpBuffer(&buf,
+ " d.datcollate as \"%s\",\n"
+ " 'libc' AS \"%s\",\n",
+ gettext_noop("ICU Locale"),
+ gettext_noop("Locale Provider"));
appendPQExpBufferStr(&buf, " ");
printACLColumn(&buf, "d.datacl");
if (verbose)
PQExpBufferData buf;
PGresult *res;
printQueryOpt myopt = pset.popt;
- static const bool translate_columns[] = {false, false, false, false, false, true, false};
+ static const bool translate_columns[] = {false, false, false, false, false, false, true, false};
initPQExpBuffer(&buf);
gettext_noop("Collate"),
gettext_noop("Ctype"));
+ if (pset.sversion >= 150000)
+ appendPQExpBuffer(&buf,
+ ",\n c.colliculocale AS \"%s\"",
+ gettext_noop("ICU Locale"));
+ else
+ appendPQExpBuffer(&buf,
+ ",\n c.collcollate AS \"%s\"",
+ gettext_noop("ICU Locale"));
+
if (pset.sversion >= 100000)
appendPQExpBuffer(&buf,
",\n CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",
COMPLETE_WITH("OWNER", "TEMPLATE", "ENCODING", "TABLESPACE",
"IS_TEMPLATE",
"ALLOW_CONNECTIONS", "CONNECTION LIMIT",
- "LC_COLLATE", "LC_CTYPE", "LOCALE", "OID");
+ "LC_COLLATE", "LC_CTYPE", "LOCALE", "OID",
+ "LOCALE_PROVIDER", "ICU_LOCALE");
else if (Matches("CREATE", "DATABASE", MatchAny, "TEMPLATE"))
COMPLETE_WITH_QUERY(Query_for_list_of_template_databases);
rm -f common.o $(WIN32RES)
rm -rf tmp_check
+export with_icu
+
check:
$(prove_check)
{"lc-ctype", required_argument, NULL, 2},
{"locale", required_argument, NULL, 'l'},
{"maintenance-db", required_argument, NULL, 3},
+ {"locale-provider", required_argument, NULL, 4},
+ {"icu-locale", required_argument, NULL, 5},
{NULL, 0, NULL, 0}
};
char *lc_collate = NULL;
char *lc_ctype = NULL;
char *locale = NULL;
+ char *locale_provider = NULL;
+ char *icu_locale = NULL;
PQExpBufferData sql;
case 3:
maintenance_db = pg_strdup(optarg);
break;
+ case 4:
+ locale_provider = pg_strdup(optarg);
+ break;
+ case 5:
+ icu_locale = pg_strdup(optarg);
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
appendPQExpBufferStr(&sql, " LC_CTYPE ");
appendStringLiteralConn(&sql, lc_ctype, conn);
}
+ if (locale_provider)
+ appendPQExpBuffer(&sql, " LOCALE_PROVIDER %s", locale_provider);
+ if (icu_locale)
+ {
+ appendPQExpBufferStr(&sql, " ICU_LOCALE ");
+ appendStringLiteralConn(&sql, icu_locale, conn);
+ }
appendPQExpBufferChar(&sql, ';');
printf(_(" -l, --locale=LOCALE locale settings for the database\n"));
printf(_(" --lc-collate=LOCALE LC_COLLATE setting for the database\n"));
printf(_(" --lc-ctype=LOCALE LC_CTYPE setting for the database\n"));
+ printf(_(" --icu-locale=LOCALE ICU locale setting for the database\n"));
+ printf(_(" --locale-provider={libc|icu}\n"
+ " locale provider for the database's default collation\n"));
printf(_(" -O, --owner=OWNER database user to own the new database\n"));
printf(_(" -T, --template=TEMPLATE template database to copy\n"));
printf(_(" -V, --version output version information, then exit\n"));
qr/statement: CREATE DATABASE foobar2 ENCODING 'LATIN1'/,
'create database with encoding');
+if ($ENV{with_icu} eq 'yes')
+{
+ # This fails because template0 uses libc provider and has no ICU
+ # locale set. It would succeed if template0 used the icu
+ # provider. XXX Maybe split into multiple tests?
+ $node->command_fails(
+ [ 'createdb', '-T', 'template0', '--locale-provider=icu', 'foobar4' ],
+ 'create database with ICU fails without ICU locale specified');
+
+ $node->issues_sql_like(
+ [ 'createdb', '-T', 'template0', '--locale-provider=icu', '--icu-locale=en', 'foobar5' ],
+ qr/statement: CREATE DATABASE foobar5 .* LOCALE_PROVIDER icu ICU_LOCALE 'en'/,
+ 'create database with ICU locale specified');
+
+ $node->command_fails(
+ [ 'createdb', '-T', 'template0', '--locale-provider=icu', '--icu-locale=@colNumeric=lower', 'foobarX' ],
+ 'fails for invalid ICU locale');
+}
+else
+{
+ $node->command_fails(
+ [ 'createdb', '-T', 'template0', '--locale-provider=icu', 'foobar4' ],
+ 'create database with ICU fails since no ICU support');
+}
+
$node->command_fails([ 'createdb', 'foobar1' ],
'fails if database already exists');
+$node->command_fails([ 'createdb', '-T', 'template0', '--locale-provider=xyz', 'foobarX' ],
+ 'fails for invalid locale provider');
+
# Check use of templates with shared dependencies copied from the template.
my ($ret, $stdout, $stderr) = $node->psql(
'foobar2',
*/
/* yyyymmddN */
-#define CATALOG_VERSION_NO 202203141
+#define CATALOG_VERSION_NO 202203171
#endif
{ oid => '100', oid_symbol => 'DEFAULT_COLLATION_OID',
descr => 'database\'s default collation',
- collname => 'default', collprovider => 'd', collencoding => '-1',
- collcollate => '', collctype => '' },
+ collname => 'default', collprovider => 'd', collencoding => '-1' },
{ oid => '950', oid_symbol => 'C_COLLATION_OID',
descr => 'standard C collation',
collname => 'C', collprovider => 'c', collencoding => '-1',
bool collisdeterministic BKI_DEFAULT(t);
int32 collencoding; /* encoding for this collation; -1 = "all" */
#ifdef CATALOG_VARLEN /* variable-length fields start here */
- text collcollate BKI_FORCE_NOT_NULL; /* LC_COLLATE setting */
- text collctype BKI_FORCE_NOT_NULL; /* LC_CTYPE setting */
+ text collcollate BKI_DEFAULT(_null_); /* LC_COLLATE setting */
+ text collctype BKI_DEFAULT(_null_); /* LC_CTYPE setting */
+ text colliculocale BKI_DEFAULT(_null_); /* ICU locale ID */
text collversion BKI_DEFAULT(_null_); /* provider-dependent
* version of collation
* data */
#define COLLPROVIDER_ICU 'i'
#define COLLPROVIDER_LIBC 'c'
+static inline const char *
+collprovider_name(char c)
+{
+ switch (c)
+ {
+ case COLLPROVIDER_ICU:
+ return "icu";
+ case COLLPROVIDER_LIBC:
+ return "libc";
+ default:
+ return "???";
+ }
+}
+
#endif /* EXPOSE_TO_CLIENT_CODE */
bool collisdeterministic,
int32 collencoding,
const char *collcollate, const char *collctype,
+ const char *colliculocale,
const char *collversion,
bool if_not_exists,
bool quiet);
{ oid => '1', oid_symbol => 'TemplateDbOid',
descr => 'default template for new databases',
- datname => 'template1', encoding => 'ENCODING', datistemplate => 't',
+ datname => 'template1', encoding => 'ENCODING', datlocprovider => 'LOCALE_PROVIDER', datistemplate => 't',
datallowconn => 't', datconnlimit => '-1', datfrozenxid => '0',
datminmxid => '1', dattablespace => 'pg_default', datcollate => 'LC_COLLATE',
- datctype => 'LC_CTYPE', datacl => '_null_' },
+ datctype => 'LC_CTYPE', daticulocale => 'ICU_LOCALE', datacl => '_null_' },
]
/* character encoding */
int32 encoding;
+ /* locale provider, see pg_collation.collprovider */
+ char datlocprovider;
+
/* allowed as CREATE DATABASE template? */
bool datistemplate;
/* LC_CTYPE setting */
text datctype BKI_FORCE_NOT_NULL;
+ /* ICU locale ID */
+ text daticulocale;
+
/* provider-dependent version of collation data */
text datcollversion BKI_DEFAULT(_null_);
typedef struct pg_locale_struct *pg_locale_t;
+extern struct pg_locale_struct default_locale;
+
+extern void make_icu_collator(const char *iculocstr,
+ struct pg_locale_struct *resultp);
+
extern pg_locale_t pg_newlocale_from_collation(Oid collid);
extern char *get_collation_actual_version(char collprovider, const char *collcollate);
SUBDIRS = perl regress isolation modules authentication recovery subscription
+ifeq ($(with_icu),yes)
+SUBDIRS += icu
+endif
+
# Test suites that are not safe by default but can be run if selected
# by the user via the whitespace-separated list in variable
# PG_TEST_EXTRA:
# clean" etc to recurse into them. (We must filter out those that we
# have conditionally included into SUBDIRS above, else there will be
# make confusion.)
-ALWAYS_SUBDIRS = $(filter-out $(SUBDIRS),examples kerberos ldap ssl)
+ALWAYS_SUBDIRS = $(filter-out $(SUBDIRS),examples kerberos icu ldap ssl)
# We want to recurse to all subdirs for all standard targets, except that
# installcheck and install should not recurse into the subdirectory "modules".
--- /dev/null
+# Generated by test suite
+/tmp_check/
--- /dev/null
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/icu
+#
+# Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/icu/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/icu
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+export with_icu
+
+check:
+ $(prove_check)
+
+installcheck:
+ $(prove_installcheck)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
--- /dev/null
+src/test/icu/README
+
+Regression tests for ICU functionality
+======================================
+
+This directory contains a test suite for ICU functionality.
+
+Running the tests
+=================
+
+NOTE: You must have given the --enable-tap-tests argument to configure.
+Also, to use "make installcheck", you must have built and installed
+contrib/hstore in addition to the core code.
+
+Run
+ make check
+or
+ make installcheck
+You can use "make installcheck" if you previously did "make install".
+In that case, the code in the installation tree is tested. With
+"make check", a temporary installation tree is built from the current
+sources and then tested.
+
+Either way, this test initializes, starts, and stops several test Postgres
+clusters.
+
+See src/test/perl/README for more info about running these tests.
--- /dev/null
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{with_icu} ne 'yes')
+{
+ plan skip_all => 'ICU not supported by this build';
+}
+
+my $node1 = PostgreSQL::Test::Cluster->new('node1');
+$node1->init;
+$node1->start;
+
+$node1->safe_psql('postgres',
+ q{CREATE DATABASE dbicu LOCALE_PROVIDER icu LOCALE 'C' ICU_LOCALE 'en-u-kf-upper' TEMPLATE template0});
+
+$node1->safe_psql('dbicu',
+q{
+CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper');
+CREATE TABLE icu (def text, en text COLLATE "en-x-icu", upfirst text COLLATE upperfirst);
+INSERT INTO icu VALUES ('a', 'a', 'a'), ('b', 'b', 'b'), ('A', 'A', 'A'), ('B', 'B', 'B');
+});
+
+is($node1->safe_psql('dbicu', q{SELECT def FROM icu ORDER BY def}),
+ qq(A
+a
+B
+b),
+ 'sort by database default locale');
+
+is($node1->safe_psql('dbicu', q{SELECT def FROM icu ORDER BY def COLLATE "en-x-icu"}),
+ qq(a
+A
+b
+B),
+ 'sort by explicit collation standard');
+
+is($node1->safe_psql('dbicu', q{SELECT def FROM icu ORDER BY en COLLATE upperfirst}),
+ qq(A
+a
+B
+b),
+ 'sort by explicit collation upper first');
+
+
+# Test error cases in CREATE DATABASE involving locale-related options
+
+my ($ret, $stdout, $stderr) = $node1->psql('postgres',
+ q{CREATE DATABASE dbicu LOCALE_PROVIDER icu TEMPLATE template0});
+isnt($ret, 0, "ICU locale must be specified for ICU provider: exit code not 0");
+like($stderr, qr/ERROR: ICU locale must be specified/, "ICU locale must be specified for ICU provider: error message");
+
+
+done_testing();
ERROR: collation "test0" already exists
do $$
BEGIN
- EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
- ', lc_ctype = ' ||
- quote_literal(current_setting('lc_ctype')) || ');';
+ EXECUTE 'CREATE COLLATION test1 (provider = icu, locale = ' ||
+ quote_literal(current_setting('lc_collate')) || ');';
END
$$;
-CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
-ERROR: parameter "lc_ctype" must be specified
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, needs "locale"
+ERROR: parameter "locale" must be specified
CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */ DROP COLLATION testx;
CREATE COLLATION test4 FROM nonsense;
ERROR: collation "nonsense" for encoding "UTF8" does not exist
CREATE COLLATION test0 FROM "C"; -- fail, duplicate name
do $$
BEGIN
- EXECUTE 'CREATE COLLATION test1 (provider = icu, lc_collate = ' ||
- quote_literal(current_setting('lc_collate')) ||
- ', lc_ctype = ' ||
- quote_literal(current_setting('lc_ctype')) || ');';
+ EXECUTE 'CREATE COLLATION test1 (provider = icu, locale = ' ||
+ quote_literal(current_setting('lc_collate')) || ');';
END
$$;
-CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
+CREATE COLLATION test3 (provider = icu, lc_collate = 'en_US.utf8'); -- fail, needs "locale"
CREATE COLLATION testx (provider = icu, locale = 'nonsense'); /* never fails with ICU */ DROP COLLATION testx;
CREATE COLLATION test4 FROM nonsense;