From 0d115dde82bf368ae0f9755113bd8fd53ac0b64b Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Tue, 7 Oct 2008 19:27:04 +0000 Subject: Extend CTE patch to support recursive UNION (ie, without ALL). The implementation uses an in-memory hash table, so it will poop out for very large recursive results ... but the performance characteristics of a sort-based implementation would be pretty unpleasant too. --- doc/src/sgml/queries.sgml | 33 ++++++++++++++++++++------------- doc/src/sgml/ref/select.sgml | 8 ++++---- 2 files changed, 24 insertions(+), 17 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml index b3d72ceb7f8..c2e3807920f 100644 --- a/doc/src/sgml/queries.sgml +++ b/doc/src/sgml/queries.sgml @@ -1,4 +1,4 @@ - + Queries @@ -1519,7 +1519,8 @@ SELECT sum(n) FROM t; The general form of a recursive WITH query is always a - non-recursive term, then UNION ALL, then a + non-recursive term, then UNION (or + UNION ALL), then a recursive term, where only the recursive term can contain a reference to the query's own output. Such a query is executed as follows: @@ -1530,9 +1531,10 @@ SELECT sum(n) FROM t; - Evaluate the non-recursive term. Include all its output rows in the - result of the recursive query, and also place them in a temporary - working table. + Evaluate the non-recursive term. For UNION (but not + UNION ALL), discard duplicate rows. Include all remaining + rows in the result of the recursive query, and also place them in a + temporary working table. @@ -1544,9 +1546,11 @@ SELECT sum(n) FROM t; Evaluate the recursive term, substituting the current contents of - the working table for the recursive self-reference. Include all its - output rows in the result of the recursive query, and also place them - in a temporary intermediate table. + the working table for the recursive self-reference. + For UNION (but not UNION ALL), discard + duplicate rows and rows that duplicate any previous result row. + Include all remaining rows in the result of the recursive query, and + also place them in a temporary intermediate table. @@ -1598,10 +1602,13 @@ GROUP BY sub_part When working with recursive queries it is important to be sure that the recursive part of the query will eventually return no tuples, - or else the query will loop indefinitely. A useful trick for - development purposes is to place a LIMIT in the parent - query. For example, this query would loop forever without the - LIMIT: + or else the query will loop indefinitely. Sometimes, using + UNION instead of UNION ALL can accomplish this + by discarding rows that duplicate previous output rows; this catches + cycles that would otherwise repeat. A useful trick for testing queries + when you are not certain if they might loop is to place a LIMIT + in the parent query. For example, this query would loop forever without + the LIMIT: WITH RECURSIVE t(n) AS ( @@ -1614,7 +1621,7 @@ SELECT n FROM t LIMIT 100; This works because PostgreSQL's implementation evaluates only as many rows of a WITH query as are actually - demanded by the parent query. Using this trick in production is not + fetched by the parent query. Using this trick in production is not recommended, because other systems might work differently. diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index e72d9c126f6..f73ca6ed64a 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -1,5 +1,5 @@ @@ -202,10 +202,10 @@ and with_query is: subquery to reference itself by name. Such a subquery must have the form -non_recursive_term UNION ALL recursive_term +non_recursive_term UNION [ ALL ] recursive_term where the recursive self-reference must appear on the right-hand - side of UNION ALL. Only one recursive self-reference + side of the UNION. Only one recursive self-reference is permitted per query. @@ -1234,7 +1234,7 @@ SELECT distance, employee_name FROM employee_recursive; Notice the typical form of recursive queries: - an initial condition, followed by UNION ALL, + an initial condition, followed by UNION, followed by the recursive part of the query. Be sure that the recursive part of the query will eventually return no tuples, or else the query will loop indefinitely. (See -- cgit v1.2.3