
Commit 310144e

jimjonesbr authored and Commitfest Bot committed
Add XMLCast function (SQL/XML X025)
This patch introduces support for the XMLCAST function, as specified in SQL/XML:2023 (ISO/IEC 9075-14:2023), Subclause 6.7 "<XML cast specification>". It enables standards-compliant conversion between SQL data types and XML, following the lexical rules defined by the W3C XML Schema Part 2.

XMLCast provides an alternative to CAST when converting SQL values into XML content, ensuring the output uses canonical XML Schema lexical representations. For example, timestamp and interval values are rendered as `xs:dateTime` and `xs:duration` (e.g., "2024-01-01T12:00:00Z" or "P1Y2M"), conforming to ISO 8601 formats.

Conversely, XMLCast also allows converting XML content back into SQL types (e.g., boolean, numeric, date/time), validating the input string according to XML Schema lexical forms.

Supported casts include:

- SQL -> XML: boolean, numeric, character, date/time, interval
- XML -> SQL: the inverse of the above, with lexical validation

The BY REF and BY VALUE clauses are accepted for SQL/XML compatibility, but ignored.

Documentation and regression tests are included.
1 parent ea06263 commit 310144e


18 files changed: +2276 -10 lines changed


doc/src/sgml/datatype.sgml

Lines changed: 83 additions & 4 deletions
@@ -4496,14 +4496,93 @@ XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
 XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
 XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
 ]]></programlisting>
-While this is the only way to convert character strings into XML
-values according to the SQL standard, the PostgreSQL-specific
-syntaxes:
+
+Another option for converting values to or from <type>xml</type> is the <function>xmlcast</function> function,
+which is designed to cast SQL data types into <type>xml</type>, and vice versa, in a standards-compliant way.
+<synopsis>
+XMLCAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable> [ BY REF | BY VALUE ] )
+</synopsis>
+Similar to the SQL function <function>CAST</function>, this function converts an <replaceable>expression</replaceable>
+into the specified <replaceable>type</replaceable>. It is primarily used for converting between SQL values
+and <type>xml</type> values in a standards-compliant way.
+
+Unlike <function>CAST</function>, which may coerce SQL values into text or XML without enforcing a specific
+lexical representation, <function>xmlcast</function> ensures that the conversion produces or expects a
+canonical XML Schema lexical form appropriate for the target type. For example, an <type>interval</type>
+value is rendered as <literal>P1Y2M</literal> (<type>xs:duration</type>), and a <type>timestamp</type> as
+<literal>2023-05-19T14:30:00Z</literal> (<type>xs:dateTime</type>). Similarly, when converting from XML to SQL types,
+<function>xmlcast</function> validates that the input string conforms to the lexical format required by the
+corresponding SQL type.
+
+The function <function>xmlcast</function> follows these rules:
+
+<itemizedlist>
+<listitem>
+<para>
+Either <replaceable>expression</replaceable> or <replaceable>type</replaceable> must be of type <type>xml</type>.
+</para>
+</listitem>
+<listitem>
+<para>
+It supports casting between <type>xml</type> and character, numeric, date/time, and boolean data types.
+</para>
+</listitem>
+<listitem>
+<para>
+Similar to the function <function>xmltext</function>, characters in <replaceable>expression</replaceable>
+that have XML predefined entities (such as <literal>&lt;</literal> and <literal>&amp;</literal>) are escaped (see examples below).
+</para>
+</listitem>
+<listitem>
+<para>
+Values of type <type>date</type>, <type>time with time zone</type>, <type>timestamp with time zone</type>,
+and <type>interval</type> are converted to their corresponding XML Schema types: <type>xs:date</type>,
+<type>xs:time</type>, <type>xs:dateTime</type>, and <type>xs:duration</type>, respectively.
+</para>
+</listitem>
+<listitem>
+<para>
+The <literal>BY REF</literal> and <literal>BY VALUE</literal> clauses
+are accepted but ignored, as discussed in
+<xref linkend="functions-xml-limits-postgresql"/>.
+</para>
+</listitem>
+</itemizedlist>
+
+Examples:
+<screen><![CDATA[
+SELECT xmlcast('<foo&bar>'::text AS xml);
+xmlcast
+---------------------
+&lt;foo&amp;bar&gt;
+
+SELECT xmlcast('&lt;foo&amp;bar&gt;'::xml AS text);
+xmlcast
+-----------
+<foo&bar>
+
+SELECT xmlcast(CURRENT_TIMESTAMP AS xml);
+xmlcast
+---------------------------------
+2024-06-02T00:29:40.92397+02:00
+
+SELECT xmlcast('P1Y2M3W4DT5H6M7S'::xml AS interval);
+xmlcast
+--------------------------------
+1 year 2 mons 25 days 05:06:07
+
+SELECT xmlcast('1 year 2 months 3 weeks 4 days 5 hours 6 minutes 7 seconds'::interval AS xml);
+xmlcast
+-----------------
+P1Y2M25DT5H6M7S
+]]></screen>
+
+Character strings can also be converted into XML using the PostgreSQL-specific cast syntaxes:
 <programlisting><![CDATA[
 xml '<foo>bar</foo>'
 '<foo>bar</foo>'::xml
 ]]></programlisting>
-can also be used.
+
 </para>
 
 <para>
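
The escaping shown in the first two documentation examples can be illustrated with a small standalone C sketch. This is not the code from the patch, which relies on `xmltext` and `unescape_xml` inside the server. The helper names `escape_entities` and `unescape_entities` are hypothetical, and only `<`, `>` and `&` are handled, which is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not PostgreSQL code): copy src into dst, replacing
 * '<', '>' and '&' with their predefined entities. dstsize includes the
 * terminating NUL. Returns 0 on success, -1 if dst is too small.
 */
static int
escape_entities(const char *src, char *dst, size_t dstsize)
{
    size_t used = 0;

    for (; *src != '\0'; src++)
    {
        char        one[2] = {*src, '\0'};
        const char *rep;
        size_t      len;

        switch (*src)
        {
            case '<': rep = "&lt;";  break;
            case '>': rep = "&gt;";  break;
            case '&': rep = "&amp;"; break;
            default:  rep = one;     break;
        }
        len = strlen(rep);
        if (used + len + 1 > dstsize)   /* keep room for the NUL */
            return -1;
        memcpy(dst + used, rep, len);
        used += len;
    }
    if (used + 1 > dstsize)
        return -1;
    dst[used] = '\0';
    return 0;
}

/*
 * Hypothetical inverse: replace "&lt;", "&gt;" and "&amp;" with the
 * characters they stand for. Anything else is copied unchanged.
 */
static int
unescape_entities(const char *src, char *dst, size_t dstsize)
{
    static const struct { const char *ent; char ch; } map[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'},
    };
    size_t used = 0;

    while (*src != '\0')
    {
        char   out = *src;
        size_t adv = 1;

        for (size_t i = 0; i < sizeof(map) / sizeof(map[0]); i++)
        {
            size_t n = strlen(map[i].ent);

            if (strncmp(src, map[i].ent, n) == 0)
            {
                out = map[i].ch;
                adv = n;
                break;
            }
        }
        if (used + 2 > dstsize)         /* one char plus the NUL */
            return -1;
        dst[used++] = out;
        src += adv;
    }
    if (dstsize == 0)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

Calling `escape_entities("<foo&bar>", buf, sizeof buf)` fills `buf` with `&lt;foo&amp;bar&gt;`, and passing that result to `unescape_entities` restores the original string, mirroring the two `xmlcast` examples in the documentation hunk.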

src/backend/catalog/sql_features.txt

Lines changed: 1 addition & 1 deletion
@@ -624,7 +624,7 @@ X014 Attributes of XML type YES
 X015 Fields of XML type NO
 X016 Persistent XML values YES
 X020 XMLConcat YES
-X025 XMLCast NO
+X025 XMLCast YES
 X030 XMLDocument NO
 X031 XMLElement YES
 X032 XMLForest YES

src/backend/executor/execExprInterp.c

Lines changed: 120 additions & 3 deletions
@@ -69,6 +69,7 @@
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/date.h"
+#include "utils/datetime.h"
 #include "utils/datum.h"
 #include "utils/expandedrecord.h"
 #include "utils/json.h"
@@ -4643,11 +4644,127 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op)
                 *op->resnull = false;
             }
             break;
+        case IS_XMLCAST:
+            {
+                Datum *argvalue = op->d.xmlexpr.argvalue;
+                bool *argnull = op->d.xmlexpr.argnull;
+                char *str;
 
-        default:
-            elog(ERROR, "unrecognized XML operation");
+                Assert(list_length(xexpr->args) == 1);
+
+                if (argnull[0])
+                    return;
+
+                value = argvalue[0];
+
+                switch (xexpr->targetType)
+                {
+                    case XMLOID:
+                        /*
+                         * SQL date/time types must be mapped to XML Schema types when casting to XML:
+                         * - DATE -> xs:date
+                         * - TIME [WITH/WITHOUT TZ] -> xs:time
+                         * - TIMESTAMP [WITH/WITHOUT TZ] -> xs:dateTime
+                         *
+                         * These mappings are defined in SQL/XML:2023 (ISO/IEC 9075-14:2023),
+                         * Subclause 6.7 "<XML cast specification>", item 15.e.i–v.
+                         *
+                         * The corresponding XML Schema lexical formats (e.g., "2023-05-19", "14:30:00Z",
+                         * "2023-05-19T14:30:00+01:00") follow ISO 8601 and are specified in
+                         * W3C XML Schema Part 2: Primitive Datatypes §3.2.7 (dateTime) and §3.2.9 (date).
+                         */
+                        if (xexpr->type == TIMESTAMPOID || xexpr->type == TIMESTAMPTZOID ||
+                            xexpr->type == DATEOID || xexpr->type == BYTEAOID || xexpr->type == BOOLOID)
+                        {
+                            text *mapped_value = cstring_to_text(
+                                map_sql_value_to_xml_value(value, xexpr->type, false));
+                            *op->resvalue = PointerGetDatum(mapped_value);
+                        }
+                        /*
+                         * SQL interval types must be mapped to XML Schema types when casting to XML:
+                         * - Year-month intervals → xs:yearMonthDuration
+                         * - Day-time intervals → xs:dayTimeDuration
+                         *
+                         * This behavior is required by SQL/XML:2023 (ISO/IEC 9075-14:2023),
+                         * Subclause 6.7 "<XML cast specification>", General Rules, item 3.d.ii.1–2.
+                         *
+                         * These XML Schema types require ISO 8601-compatible lexical representations,
+                         * such as: "P1Y2M", "P3DT4H5M", or "P1Y2M3DT4H5M6S", as defined in
+                         * W3C XML Schema Part 2: Primitive Datatypes, §3.2.6 (duration).
+                         */
+                        else if (xexpr->type == INTERVALOID)
+                        {
+                            Interval *in = DatumGetIntervalP(value);
+
+                            struct pg_itm tt, *itm = &tt;
+                            char buf[MAXDATELEN + 1];
+
+                            if (INTERVAL_NOT_FINITE(in))
+                            {
+                                if (INTERVAL_IS_NOBEGIN(in))
+                                    strcpy(buf, EARLY);
+                                else if (INTERVAL_IS_NOEND(in))
+                                    strcpy(buf, LATE);
+                                else
+                                    elog(ERROR, "invalid interval argument");
+                            }
+                            else
+                            {
+                                interval2itm(*in, itm);
+                                EncodeInterval(itm, INTSTYLE_ISO_8601, buf);
+                            }
+
+                            *op->resvalue = PointerGetDatum(cstring_to_text(buf));
+                        }
+                        /* no need to escape the result, as the source is already XML */
+                        else if (xexpr->type == XMLOID)
+                            *op->resvalue = PointerGetDatum(DatumGetXmlP(value));
+                        /* make sure that any predefined entities are escaped */
+                        else
+                            *op->resvalue = PointerGetDatum(
+                                DatumGetXmlP((DirectFunctionCall1(xmltext, value))));
+                        break;
+                    case TEXTOID:
+                    case VARCHAROID:
+                    case NAMEOID:
+                    case BPCHAROID:
+                        /*
+                         * when casting from XML to a character string we make sure that
+                         * all escaped xml characters are unescaped.
+                         */
+                        str = text_to_cstring(DatumGetTextPP(value));
+                        *op->resvalue = PointerGetDatum(
+                            cstring_to_text(unescape_xml(str)));
+
+                        pfree(str);
+                        break;
+                    case INT2OID:
+                    case INT4OID:
+                    case INT8OID:
+                    case NUMERICOID:
+                    case FLOAT4OID:
+                    case FLOAT8OID:
+                    case BOOLOID:
+                    case TIMESTAMPOID:
+                    case TIMESTAMPTZOID:
+                    case TIMEOID:
+                    case TIMETZOID:
+                    case DATEOID:
+                    case BYTEAOID:
+                    case INTERVALOID:
+                        *op->resvalue = PointerGetDatum(DatumGetTextP(value));
+                        break;
+                    default:
+                        elog(ERROR, "unsupported target data type for XMLCast");
+                }
+
+                *op->resnull = false;
+            }
             break;
-    }
+        default:
+            elog(ERROR, "unrecognized XML operation");
+            break;
+    }
 }
 
 /*
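
The interval branch above delegates the formatting to `EncodeInterval` with `INTSTYLE_ISO_8601`. As a rough illustration of the `xs:duration` shape it produces, here is a minimal standalone sketch. The function `format_duration` and its helper are hypothetical and are not part of PostgreSQL. The sketch assumes non-negative fields and whole seconds, and it expects weeks to have been folded into days already (3 weeks 4 days becomes 25 days, as in the documentation example).

```c
#include <stdio.h>

/*
 * Hypothetical helper: append "<value><tag>" (e.g. "25D") to buf.
 * *used is the current string length and must be less than bufsize.
 * Returns 0 on success, -1 if the field does not fit.
 */
static int
append_field(char *buf, size_t bufsize, size_t *used, int value, char tag)
{
    int n = snprintf(buf + *used, bufsize - *used, "%d%c", value, tag);

    if (n < 0 || (size_t) n >= bufsize - *used)
        return -1;
    *used += (size_t) n;
    return 0;
}

/*
 * Hypothetical sketch, not PostgreSQL's EncodeInterval(): write the given
 * non-negative fields as an ISO 8601 / xs:duration string such as
 * "P1Y2M25DT5H6M7S". Zero-valued fields are omitted and an all-zero
 * duration is written as "PT0S". Returns 0 on success, -1 on overflow.
 */
static int
format_duration(int years, int months, int days,
                int hours, int minutes, int seconds,
                char *buf, size_t bufsize)
{
    size_t used = 0;
    int    has_date = (years != 0 || months != 0 || days != 0);
    int    has_time = (hours != 0 || minutes != 0 || seconds != 0);

    if (bufsize < 2)
        return -1;
    buf[used++] = 'P';
    buf[used] = '\0';

    if (years && append_field(buf, bufsize, &used, years, 'Y') < 0)
        return -1;
    if (months && append_field(buf, bufsize, &used, months, 'M') < 0)
        return -1;
    if (days && append_field(buf, bufsize, &used, days, 'D') < 0)
        return -1;

    /* The time part is introduced by 'T'; it is also used for zero. */
    if (has_time || !has_date)
    {
        if (used + 1 >= bufsize)
            return -1;
        buf[used++] = 'T';
        buf[used] = '\0';

        if (hours && append_field(buf, bufsize, &used, hours, 'H') < 0)
            return -1;
        if (minutes && append_field(buf, bufsize, &used, minutes, 'M') < 0)
            return -1;
        if ((seconds || !has_time) &&
            append_field(buf, bufsize, &used, seconds, 'S') < 0)
            return -1;
    }
    return 0;
}
```

With a 32-byte buffer, `format_duration(1, 2, 25, 5, 6, 7, buf, sizeof buf)` yields `P1Y2M25DT5H6M7S`, the same string as the last documentation example, and `format_duration(1, 2, 0, 0, 0, 0, ...)` yields `P1Y2M`.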

src/backend/nodes/nodeFuncs.c

Lines changed: 13 additions & 0 deletions
@@ -1736,6 +1736,9 @@ exprLocation(const Node *expr)
         case T_FunctionParameter:
             loc = ((const FunctionParameter *) expr)->location;
             break;
+        case T_XmlCast:
+            loc = ((const XmlCast *) expr)->location;
+            break;
         case T_XmlSerialize:
             /* XMLSERIALIZE keyword should always be the first thing */
             loc = ((const XmlSerialize *) expr)->location;
@@ -4468,6 +4471,16 @@ raw_expression_tree_walker_impl(Node *node,
                     return true;
             }
             break;
+        case T_XmlCast:
+            {
+                XmlCast *xc = (XmlCast *) node;
+
+                if (WALK(xc->expr))
+                    return true;
+                if (WALK(xc->targetType))
+                    return true;
+            }
+            break;
         case T_CollateClause:
             return WALK(((CollateClause *) node)->arg);
         case T_SortBy:
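
The new `T_XmlCast` case follows the usual walker convention: each child is visited in turn, and a `true` result stops the traversal immediately. The standalone sketch below illustrates that convention with deliberately simplified, hypothetical types. `Node`, `NodeKind`, `walk_children` and `contains_negative` are inventions for this example and do not match PostgreSQL's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified node layout (not PostgreSQL's Node). */
typedef enum { NODE_LEAF, NODE_XMLCAST } NodeKind;

typedef struct Node
{
    NodeKind     kind;
    int          value;     /* payload, only meaningful for leaves */
    struct Node *expr;      /* NODE_XMLCAST: expression being cast */
    struct Node *target;    /* NODE_XMLCAST: target type */
} Node;

typedef bool (*walker_fn) (Node *node, void *context);

/*
 * Visit the children of node with walker. As soon as the walker returns
 * true for a child, stop and propagate true to the caller.
 */
static bool
walk_children(Node *node, walker_fn walker, void *context)
{
    if (node == NULL)
        return false;

    switch (node->kind)
    {
        case NODE_LEAF:
            break;              /* no children */
        case NODE_XMLCAST:
            if (walker(node->expr, context))
                return true;
            if (walker(node->target, context))
                return true;
            break;
    }
    return false;
}

/* Example walker: report whether any leaf holds a negative value. */
static bool
contains_negative(Node *node, void *context)
{
    if (node == NULL)
        return false;
    if (node->kind == NODE_LEAF && node->value < 0)
        return true;
    return walk_children(node, contains_negative, context);
}
```

If a node type were missing from the `switch`, walkers would silently skip its children, which is presumably why the patch adds both `expr` and `targetType` to the raw expression walker.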

src/backend/parser/gram.y

Lines changed: 21 additions & 1 deletion
@@ -787,7 +787,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
     WHEN WHERE WHITESPACE_P WINDOW WITH WITHIN WITHOUT WORK WRAPPER WRITE
 
-    XML_P XMLATTRIBUTES XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLNAMESPACES
+    XML_P XMLATTRIBUTES XMLCAST XMLCONCAT XMLELEMENT XMLEXISTS XMLFOREST XMLNAMESPACES
     XMLPARSE XMLPI XMLROOT XMLSERIALIZE XMLTABLE
 
     YEAR_P YES_P
@@ -16031,6 +16031,24 @@ func_expr_common_subexpr:
                     v->location = @1;
                     $$ = (Node *) v;
                 }
+            | XMLCAST '(' a_expr AS Typename ')'
+                {
+                    XmlCast *n = makeNode(XmlCast);
+
+                    n->expr = $3;
+                    n->targetType = $5;
+                    n->location = @1;
+                    $$ = (Node *) n;
+                }
+            | XMLCAST '(' a_expr AS Typename xml_passing_mech ')'
+                {
+                    XmlCast *n = makeNode(XmlCast);
+
+                    n->expr = $3;
+                    n->targetType = $5;
+                    n->location = @1;
+                    $$ = (Node *) n;
+                }
             | XMLCONCAT '(' expr_list ')'
                 {
                     $$ = makeXmlExpr(IS_XMLCONCAT, NULL, NIL, $3, @1);
@@ -18073,6 +18091,7 @@ col_name_keyword:
             | VALUES
             | VARCHAR
             | XMLATTRIBUTES
+            | XMLCAST
             | XMLCONCAT
             | XMLELEMENT
             | XMLEXISTS
@@ -18661,6 +18680,7 @@ bare_label_keyword:
             | WRITE
             | XML_P
             | XMLATTRIBUTES
+            | XMLCAST
             | XMLCONCAT
             | XMLELEMENT
             | XMLEXISTS
