Extract column statistics from CTE references, if possible.

examine_simple_variable() left this as an unimplemented case years
ago, with the result that plans for queries involving un-flattened
CTEs might be much stupider than necessary. It's not hard to extend
the existing logic for RTE_SUBQUERY cases to also be able to drill
down into CTEs, so let's do that.

There was some discussion of whether this patch breaks the idea
of a MATERIALIZED CTE being an optimization fence. We concluded
it's okay, because we already allow the outer planner level to
see the estimated width and rowcount of the CTE result, and
letting it see column statistics too seems fairly equivalent.
Basically, what we expect of the optimization fence is that the
outer query should not affect the plan chosen for the CTE query.
Once that plan is chosen, it's okay for the outer planner level
to make use of whatever information we have about it.

Jian Guo and Tom Lane, per complaint from Hans Buschmann

Discussion: https://postgr.es/m/4504e67078d648cdac3651b2960da6e7@nidsa.net
author: Tom Lane 2023-11-17 19:36:23 +0000
committer: Tom Lane 2023-11-17 19:36:23 +0000
commit: f7816aec23eed1dc1da5f9a53cb6507d30b7f0a2 (patch)
tree: d0018a8f1e729865ed27ca0b3083f724f5647a53 /src/test
parent: 06c70849fb26ac431a722b1d10cffe1c65e728a4 (diff)
2 files changed, 25 insertions, 0 deletions
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index a01efa50a51..69c56ce2077 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -636,6 +636,24 @@ SELECT t1.id, t2.path, t2 FROM t AS t1 JOIN t AS t2 ON
  16 | {3,7,11,16} | (16,"{3,7,11,16}")
 (16 rows)
 
+-- test that column statistics from a materialized CTE are available
+-- to upper planner (otherwise, we'd get a stupider plan)
+explain (costs off)
+with x as materialized (select unique1 from tenk1 b)
+select count(*) from tenk1 a
+  where unique1 in (select * from x);
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Aggregate
+   CTE x
+     ->  Index Only Scan using tenk1_unique1 on tenk1 b
+   ->  Hash Semi Join
+         Hash Cond: (a.unique1 = x.unique1)
+         ->  Index Only Scan using tenk1_unique1 on tenk1 a
+         ->  Hash
+               ->  CTE Scan on x
+(8 rows)
+
 -- SEARCH clause
 create temp table graph0( f int, t int, label text );
 insert into graph0 values
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index 582139df7bd..3ef98988663 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -347,6 +347,13 @@ UNION ALL
 SELECT t1.id, t2.path, t2 FROM t AS t1 JOIN t AS t2 ON
 (t1.id=t2.id);
 
+-- test that column statistics from a materialized CTE are available
+-- to upper planner (otherwise, we'd get a stupider plan)
+explain (costs off)
+with x as materialized (select unique1 from tenk1 b)
+select count(*) from tenk1 a
+  where unique1 in (select * from x);
+
 -- SEARCH clause
 
 create temp table graph0( f int, t int, label text );