Make Vars be outer-join-aware.

Traditionally we used the same Var struct to represent the value of a table column everywhere in parse and plan trees. This choice predates our support for SQL outer joins, and it's really a pretty bad idea with outer joins, because the Var's value can depend on where it is in the tree: it might go to NULL above an outer join. So expression nodes that are equal() per equalfuncs.c might not represent the same value, which is a huge correctness hazard for the planner. To improve this, decorate Var nodes with a bitmapset showing which outer joins (identified by RTE indexes) may have nulled them at the point in the parse tree where the Var appears. This allows us to trust that equal() Vars represent the same value. A certain amount of klugery is still needed to cope with cases where we re-order two outer joins, but it's possible to make it work without sacrificing that core principle. PlaceHolderVars receive similar decoration for the same reason. In the planner, we include these outer join bitmapsets into the relids that an expression is considered to depend on, and in consequence also add outer-join relids to the relids of join RelOptInfos. This allows us to correctly perceive whether an expression can be calculated above or below a particular outer join. This change affects FDWs that want to plan foreign joins. They *must* follow suit when labeling foreign joins in order to match with the core planner, but for many purposes (if postgres_fdw is any guide) they'd prefer to consider only base relations within the join. To support both requirements, redefine ForeignScan.fs_relids as base+OJ relids, and add a new field fs_base_relids that's set up by the core planner. Large though it is, this commit just does the minimum necessary to install the new mechanisms and get check-world passing again. Follow-up patches will perform some cleanup. (The README additions and comments mention some stuff that will appear in the follow-up.) Patch by me; thanks to Richard Guo for review. Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
author: Tom Lane 2023-01-30 18:16:20 +0000
committer: Tom Lane 2023-01-30 18:16:20 +0000
commit: 2489d76c4906f4461a364ca8ad7e0751ead8aa0d (patch)
tree: 145ebc28d5ea8f5a5ba340b9e353a11de786adae /src/test
parent: ec7e053a98f39a9e3c7e6d35f0d2e83933882399 (diff)
5 files changed, 266 insertions, 17 deletions
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 8f96c4ed53..82d0961524 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1370,8 +1370,8 @@ drop table p_t1;
 --
 -- Test GROUP BY matching of join columns that are type-coerced due to USING
 --
-create temp table t1(f1 int, f2 bigint);
-create temp table t2(f1 bigint, f22 bigint);
+create temp table t1(f1 int, f2 int);
+create temp table t2(f1 bigint, f2 oid);
 select f1 from t1 left join t2 using (f1) group by f1;
  f1 
 ----
@@ -1392,6 +1392,22 @@ select t1.f1 from t1 left join t2 using (f1) group by f1;
 ERROR:  column "t1.f1" must appear in the GROUP BY clause or be used in an aggregate function
 LINE 1: select t1.f1 from t1 left join t2 using (f1) group by f1;
                ^
+-- check case where we have to inject nullingrels into coerced join alias
+select f1, count(*) from
+t1 x(x0,x1) left join (t1 left join t2 using(f1)) on (x0 = 0)
+group by f1;
+ f1 | count 
+----+-------
+(0 rows)
+
+-- same, for a RelabelType coercion
+select f2, count(*) from
+t1 x(x0,x1) left join (t1 left join t2 using(f2)) on (x0 = 0)
+group by f2;
+ f2 | count 
+----+-------
+(0 rows)
+
 drop table t1, t2;
 --
 -- Test planner's selection of pathkeys for ORDER BY aggregates
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index c2b85d2795..51c9df3d58 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2335,17 +2335,17 @@ select a.f1, b.f1, t.thousand, t.tenthous from
   (select sum(f1)+1 as f1 from int4_tbl i4a) a,
   (select sum(f1) as f1 from int4_tbl i4b) b
 where b.f1 = t.thousand and a.f1 = b.f1 and (a.f1+b.f1+999) = t.tenthous;
-                                                      QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
+                                                   QUERY PLAN                                                    
+-----------------------------------------------------------------------------------------------------------------
  Nested Loop
-   ->  Aggregate
-         ->  Seq Scan on int4_tbl i4b
    ->  Nested Loop
          Join Filter: ((sum(i4b.f1)) = ((sum(i4a.f1) + 1)))
          ->  Aggregate
                ->  Seq Scan on int4_tbl i4a
-         ->  Index Only Scan using tenk1_thous_tenthous on tenk1 t
-               Index Cond: ((thousand = (sum(i4b.f1))) AND (tenthous = ((((sum(i4a.f1) + 1)) + (sum(i4b.f1))) + 999)))
+         ->  Aggregate
+               ->  Seq Scan on int4_tbl i4b
+   ->  Index Only Scan using tenk1_thous_tenthous on tenk1 t
+         Index Cond: ((thousand = (sum(i4b.f1))) AND (tenthous = ((((sum(i4a.f1) + 1)) + (sum(i4b.f1))) + 999)))
 (9 rows)
 
 select a.f1, b.f1, t.thousand, t.tenthous from
@@ -4139,6 +4139,60 @@ using (join_key);
 (2 rows)
 
 --
+-- check handling of a variable-free join alias
+--
+explain (verbose, costs off)
+select * from
+int4_tbl i0 left join
+( (select *, 123 as x from int4_tbl i1) ss1
+  left join
+  (select *, q2 as x from int8_tbl i2) ss2
+  using (x)
+) ss0
+on (i0.f1 = ss0.f1)
+order by i0.f1, x;
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Sort
+   Output: i0.f1, ('123'::bigint), i1.f1, i2.q1, i2.q2
+   Sort Key: i0.f1, ('123'::bigint)
+   ->  Hash Right Join
+         Output: i0.f1, ('123'::bigint), i1.f1, i2.q1, i2.q2
+         Hash Cond: (i1.f1 = i0.f1)
+         ->  Nested Loop Left Join
+               Output: i1.f1, i2.q1, i2.q2, '123'::bigint
+               ->  Seq Scan on public.int4_tbl i1
+                     Output: i1.f1
+               ->  Materialize
+                     Output: i2.q1, i2.q2
+                     ->  Seq Scan on public.int8_tbl i2
+                           Output: i2.q1, i2.q2
+                           Filter: (123 = i2.q2)
+         ->  Hash
+               Output: i0.f1
+               ->  Seq Scan on public.int4_tbl i0
+                     Output: i0.f1
+(19 rows)
+
+select * from
+int4_tbl i0 left join
+( (select *, 123 as x from int4_tbl i1) ss1
+  left join
+  (select *, q2 as x from int8_tbl i2) ss2
+  using (x)
+) ss0
+on (i0.f1 = ss0.f1)
+order by i0.f1, x;
+     f1      |  x  |     f1      |        q1        | q2  
+-------------+-----+-------------+------------------+-----
+ -2147483647 | 123 | -2147483647 | 4567890123456789 | 123
+     -123456 | 123 |     -123456 | 4567890123456789 | 123
+           0 | 123 |           0 | 4567890123456789 | 123
+      123456 | 123 |      123456 | 4567890123456789 | 123
+  2147483647 | 123 |  2147483647 | 4567890123456789 | 123
+(5 rows)
+
+--
 -- test successful handling of nested outer joins with degenerate join quals
 --
 explain (verbose, costs off)
@@ -4728,6 +4782,103 @@ select a.unique1, b.unique2
 (1 row)
 
 --
+-- test full-join strength reduction
+--
+explain (costs off)
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where a.unique1 = 42;
+                     QUERY PLAN                     
+----------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (a.unique1 = b.unique2)
+   ->  Index Only Scan using onek_unique1 on onek a
+         Index Cond: (unique1 = 42)
+   ->  Index Only Scan using onek_unique2 on onek b
+         Index Cond: (unique2 = 42)
+(6 rows)
+
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where a.unique1 = 42;
+ unique1 | unique2 
+---------+---------
+      42 |      42
+(1 row)
+
+explain (costs off)
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where b.unique2 = 43;
+                     QUERY PLAN                     
+----------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (a.unique1 = b.unique2)
+   ->  Index Only Scan using onek_unique2 on onek b
+         Index Cond: (unique2 = 43)
+   ->  Index Only Scan using onek_unique1 on onek a
+         Index Cond: (unique1 = 43)
+(6 rows)
+
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where b.unique2 = 43;
+ unique1 | unique2 
+---------+---------
+      43 |      43
+(1 row)
+
+explain (costs off)
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where a.unique1 = 42 and b.unique2 = 42;
+                     QUERY PLAN                     
+----------------------------------------------------
+ Nested Loop
+   ->  Index Only Scan using onek_unique1 on onek a
+         Index Cond: (unique1 = 42)
+   ->  Index Only Scan using onek_unique2 on onek b
+         Index Cond: (unique2 = 42)
+(5 rows)
+
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where a.unique1 = 42 and b.unique2 = 42;
+ unique1 | unique2 
+---------+---------
+      42 |      42
+(1 row)
+
+--
+-- test result-RTE removal underneath a full join
+--
+explain (costs off)
+select * from
+  (select * from int8_tbl i81 join (values(123,2)) v(v1,v2) on q2=v1) ss1
+full join
+  (select * from (values(456,2)) w(v1,v2) join int8_tbl i82 on q2=v1) ss2
+on true;
+              QUERY PLAN              
+--------------------------------------
+ Merge Full Join
+   ->  Seq Scan on int8_tbl i81
+         Filter: (q2 = 123)
+   ->  Materialize
+         ->  Seq Scan on int8_tbl i82
+               Filter: (q2 = 456)
+(6 rows)
+
+select * from
+  (select * from int8_tbl i81 join (values(123,2)) v(v1,v2) on q2=v1) ss1
+full join
+  (select * from (values(456,2)) w(v1,v2) join int8_tbl i82 on q2=v1) ss2
+on true;
+        q1        | q2  | v1  | v2 | v1  | v2 | q1  | q2  
+------------------+-----+-----+----+-----+----+-----+-----
+ 4567890123456789 | 123 | 123 |  2 | 456 |  2 | 123 | 456
+(1 row)
+
+--
 -- test join removal
 --
 begin;
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index c0ff13fb82..e18641ab92 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -304,7 +304,7 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t2.b FROM prt2 t2 WHERE t2.a = 0)
                      ->  Seq Scan on prt2_p2 t2_2
                            Filter: (a = 0)
          ->  Nested Loop Semi Join
-               Join Filter: (t2_3.b = t1_3.a)
+               Join Filter: (t1_3.a = t2_3.b)
                ->  Seq Scan on prt1_p3 t1_3
                      Filter: (b = 0)
                ->  Materialize
@@ -601,7 +601,7 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM prt1 t1, prt2 t2, prt1_e t
    Sort Key: t1.a
    ->  Append
          ->  Nested Loop
-               Join Filter: (((t3_1.a + t3_1.b) / 2) = t1_1.a)
+               Join Filter: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
                ->  Hash Join
                      Hash Cond: (t2_1.b = t1_1.a)
                      ->  Seq Scan on prt2_p1 t2_1
@@ -611,7 +611,7 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM prt1 t1, prt2 t2, prt1_e t
                ->  Index Scan using iprt1_e_p1_ab2 on prt1_e_p1 t3_1
                      Index Cond: (((a + b) / 2) = t2_1.b)
          ->  Nested Loop
-               Join Filter: (((t3_2.a + t3_2.b) / 2) = t1_2.a)
+               Join Filter: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
                ->  Hash Join
                      Hash Cond: (t2_2.b = t1_2.a)
                      ->  Seq Scan on prt2_p2 t2_2
@@ -621,7 +621,7 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM prt1 t1, prt2 t2, prt1_e t
                ->  Index Scan using iprt1_e_p2_ab2 on prt1_e_p2 t3_2
                      Index Cond: (((a + b) / 2) = t2_2.b)
          ->  Nested Loop
-               Join Filter: (((t3_3.a + t3_3.b) / 2) = t1_3.a)
+               Join Filter: (t1_3.a = ((t3_3.a + t3_3.b) / 2))
                ->  Hash Join
                      Hash Cond: (t2_3.b = t1_3.a)
                      ->  Seq Scan on prt2_p3 t2_3
@@ -926,7 +926,7 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHER
    Sort Key: t1.a
    ->  Append
          ->  Nested Loop
-               Join Filter: (t1_5.b = t1_2.a)
+               Join Filter: (t1_2.a = t1_5.b)
                ->  HashAggregate
                      Group Key: t1_5.b
                      ->  Hash Join
@@ -939,7 +939,7 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHER
                      Index Cond: (a = ((t2_1.a + t2_1.b) / 2))
                      Filter: (b = 0)
          ->  Nested Loop
-               Join Filter: (t1_6.b = t1_3.a)
+               Join Filter: (t1_3.a = t1_6.b)
                ->  HashAggregate
                      Group Key: t1_6.b
                      ->  Hash Join
@@ -952,7 +952,7 @@ SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHER
                      Index Cond: (a = ((t2_2.a + t2_2.b) / 2))
                      Filter: (b = 0)
          ->  Nested Loop
-               Join Filter: (t1_7.b = t1_4.a)
+               Join Filter: (t1_4.a = t1_7.b)
                ->  HashAggregate
                      Group Key: t1_7.b
                      ->  Nested Loop
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index f6815a1784..e81a22465b 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -492,8 +492,8 @@ drop table p_t1;
 -- Test GROUP BY matching of join columns that are type-coerced due to USING
 --
 
-create temp table t1(f1 int, f2 bigint);
-create temp table t2(f1 bigint, f22 bigint);
+create temp table t1(f1 int, f2 int);
+create temp table t2(f1 bigint, f2 oid);
 
 select f1 from t1 left join t2 using (f1) group by f1;
 select f1 from t1 left join t2 using (f1) group by t1.f1;
@@ -501,6 +501,16 @@ select t1.f1 from t1 left join t2 using (f1) group by t1.f1;
 -- only this one should fail:
 select t1.f1 from t1 left join t2 using (f1) group by f1;
 
+-- check case where we have to inject nullingrels into coerced join alias
+select f1, count(*) from
+t1 x(x0,x1) left join (t1 left join t2 using(f1)) on (x0 = 0)
+group by f1;
+
+-- same, for a RelabelType coercion
+select f2, count(*) from
+t1 x(x0,x1) left join (t1 left join t2 using(f2)) on (x0 = 0)
+group by f2;
+
 drop table t1, t2;
 
 --
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 027927354c..7035e4a4c4 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -1410,6 +1410,30 @@ left join
 using (join_key);
 
 --
+-- check handling of a variable-free join alias
+--
+explain (verbose, costs off)
+select * from
+int4_tbl i0 left join
+( (select *, 123 as x from int4_tbl i1) ss1
+  left join
+  (select *, q2 as x from int8_tbl i2) ss2
+  using (x)
+) ss0
+on (i0.f1 = ss0.f1)
+order by i0.f1, x;
+
+select * from
+int4_tbl i0 left join
+( (select *, 123 as x from int4_tbl i1) ss1
+  left join
+  (select *, q2 as x from int8_tbl i2) ss2
+  using (x)
+) ss0
+on (i0.f1 = ss0.f1)
+order by i0.f1, x;
+
+--
 -- test successful handling of nested outer joins with degenerate join quals
 --
 
@@ -1642,6 +1666,54 @@ select a.unique1, b.unique2
   where b.unique2 = any (select q1 from int8_tbl c where c.q1 < b.unique1);
 
 --
+-- test full-join strength reduction
+--
+
+explain (costs off)
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where a.unique1 = 42;
+
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where a.unique1 = 42;
+
+explain (costs off)
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where b.unique2 = 43;
+
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where b.unique2 = 43;
+
+explain (costs off)
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where a.unique1 = 42 and b.unique2 = 42;
+
+select a.unique1, b.unique2
+  from onek a full join onek b on a.unique1 = b.unique2
+  where a.unique1 = 42 and b.unique2 = 42;
+
+--
+-- test result-RTE removal underneath a full join
+--
+
+explain (costs off)
+select * from
+  (select * from int8_tbl i81 join (values(123,2)) v(v1,v2) on q2=v1) ss1
+full join
+  (select * from (values(456,2)) w(v1,v2) join int8_tbl i82 on q2=v1) ss2
+on true;
+
+select * from
+  (select * from int8_tbl i81 join (values(123,2)) v(v1,v2) on q2=v1) ss1
+full join
+  (select * from (values(456,2)) w(v1,v2) join int8_tbl i82 on q2=v1) ss2
+on true;
+
+--
 -- test join removal
 --
author	Tom Lane	2023-01-30 18:16:20 +0000
committer	Tom Lane	2023-01-30 18:16:20 +0000
commit	2489d76c4906f4461a364ca8ad7e0751ead8aa0d (patch)
tree	145ebc28d5ea8f5a5ba340b9e353a11de786adae /src/test
parent	ec7e053a98f39a9e3c7e6d35f0d2e83933882399 (diff)