Accept plan change in xc_groupby regression test

The plan changed in two ways. Firstly, the targetlists changed due to abandoning the custom distributed aggregation and reusing the upstream partial aggregation code. That means we're not prefixing the aggregate with schema name, etc. The plan also switches from distributed aggregation to plain aggregation with all the work done on top of a remote query. This happens simply due to costing, as the tables are tiny and two-phase aggregation has some overhead. The original implementation (as in XL 9.5) distributed the aggregate unconditionally, ignoring the costing. Parf of the problem is that the query groups by two columns from two different tables, resulting in overestimation of the number of groups. That means the optimizer thinks distributing the aggregation would not reduce the number of rows, which increases the cost estimate as each row requires network transfer and the finalize aggregate also depends on the number of input rows. We could make the tables larger and the optimizer would eventually switch to distributed aggregate. For example this seems to do the trick: insert into xc_groupby_tab1 select 1, mod(i,1000) from generate_series(1,20000) s(i); insert into xc_groupby_tab2 select 1, mod(i,1000) from generate_series(1,20000) s(i); But it does not seem worth it, considering it's just a workaround for the estimation issue and the increased duration. And we already have other regression tests testing plausible queries benefiting from distributed aggregation. So just accept the plan change.
author: Tomas Vondra 2017-07-28 22:00:09 +0000
committer: Tomas Vondra 2017-07-31 01:18:22 +0000
commit: f7dff6ae5b1a1eb16cc4b52308c741cc534faa4b (patch)
tree: 3b44e0e6844c9a16d17d04fc87c6c4f8e3c858e5 /src
parent: 69072746e165bf2b53b7f9851a4f7a2876cf5501 (diff)
1 files changed, 27 insertions, 27 deletions
diff --git a/src/test/regress/expected/xc_groupby.out b/src/test/regress/expected/xc_groupby.out
index cec10c1550..cb3d397ae3 100644
--- a/src/test/regress/expected/xc_groupby.out
+++ b/src/test/regress/expected/xc_groupby.out
@@ -5400,40 +5400,40 @@ select count(*), sum(xc_groupby_tab1.val * xc_groupby_tab2.val), avg(xc_groupby_
 (4 rows)
 
 explain (verbose true, costs false, nodes false) select count(*), sum(xc_groupby_tab1.val * xc_groupby_tab2.val), avg(xc_groupby_tab1.val*xc_groupby_tab2.val), sum(xc_groupby_tab1.val*xc_groupby_tab2.val)::float8/count(*), xc_groupby_tab1.val2, xc_groupby_tab2.val2 from xc_groupby_tab1 full outer join xc_groupby_tab2 on xc_groupby_tab1.val2 = xc_groupby_tab2.val2 group by xc_groupby_tab1.val2, xc_groupby_tab2.val2;
-                                                                                                                                                                      QUERY PLAN                                                                                                                                                                       
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                                                                QUERY PLAN                                                                                                                                 
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  GroupAggregate
-   Output: pg_catalog.count(*), pg_catalog.sum((sum((xc_groupby_tab1.val * xc_groupby_tab2.val)))), pg_catalog.avg((avg((xc_groupby_tab1.val * xc_groupby_tab2.val)))), ((pg_catalog.sum((sum((xc_groupby_tab1.val * xc_groupby_tab2.val)))))::double precision / (pg_catalog.count(*))::double precision), xc_groupby_tab1.val2, xc_groupby_tab2.val2
+   Output: count(*), sum((xc_groupby_tab1.val * xc_groupby_tab2.val)), avg((xc_groupby_tab1.val * xc_groupby_tab2.val)), ((sum((xc_groupby_tab1.val * xc_groupby_tab2.val)))::double precision / (count(*))::double precision), xc_groupby_tab1.val2, xc_groupby_tab2.val2
    Group Key: xc_groupby_tab1.val2, xc_groupby_tab2.val2
    ->  Remote Subquery Scan on all
-         Output: count(*), sum((xc_groupby_tab1.val * xc_groupby_tab2.val)), avg((xc_groupby_tab1.val * xc_groupby_tab2.val)), xc_groupby_tab1.val2, xc_groupby_tab2.val2
-         ->  GroupAggregate
-               Output: count(*), sum((xc_groupby_tab1.val * xc_groupby_tab2.val)), avg((xc_groupby_tab1.val * xc_groupby_tab2.val)), xc_groupby_tab1.val2, xc_groupby_tab2.val2
-               Group Key: xc_groupby_tab1.val2, xc_groupby_tab2.val2
-               ->  Sort
+         Output: xc_groupby_tab1.val2, xc_groupby_tab2.val2, xc_groupby_tab1.val, xc_groupby_tab2.val
+         Sort Key: xc_groupby_tab1.val2, xc_groupby_tab2.val2
+         ->  Sort
+               Output: xc_groupby_tab1.val2, xc_groupby_tab2.val2, xc_groupby_tab1.val, xc_groupby_tab2.val
+               Sort Key: xc_groupby_tab1.val2, xc_groupby_tab2.val2
+               ->  Merge Full Join
                      Output: xc_groupby_tab1.val2, xc_groupby_tab2.val2, xc_groupby_tab1.val, xc_groupby_tab2.val
-                     Sort Key: xc_groupby_tab1.val2, xc_groupby_tab2.val2
-                     ->  Merge Full Join
-                           Output: xc_groupby_tab1.val2, xc_groupby_tab2.val2, xc_groupby_tab1.val, xc_groupby_tab2.val
-                           Merge Cond: (xc_groupby_tab1.val2 = xc_groupby_tab2.val2)
+                     Merge Cond: (xc_groupby_tab1.val2 = xc_groupby_tab2.val2)
+                     ->  Remote Subquery Scan on all
+                           Output: xc_groupby_tab1.val, xc_groupby_tab1.val2
+                           Distribute results by H: val2
+                           Sort Key: xc_groupby_tab1.val2
+                           ->  Sort
+                                 Output: xc_groupby_tab1.val, xc_groupby_tab1.val2
+                                 Sort Key: xc_groupby_tab1.val2
+                                 ->  Seq Scan on public.xc_groupby_tab1
+                                       Output: xc_groupby_tab1.val, xc_groupby_tab1.val2
+                     ->  Materialize
+                           Output: xc_groupby_tab2.val, xc_groupby_tab2.val2
                            ->  Remote Subquery Scan on all
-                                 Output: xc_groupby_tab1.val2, xc_groupby_tab1.val
+                                 Output: xc_groupby_tab2.val, xc_groupby_tab2.val2
                                  Distribute results by H: val2
+                                 Sort Key: xc_groupby_tab2.val2
                                  ->  Sort
-                                       Output: xc_groupby_tab1.val2, xc_groupby_tab1.val
-                                       Sort Key: xc_groupby_tab1.val2
-                                       ->  Seq Scan on public.xc_groupby_tab1
-                                             Output: xc_groupby_tab1.val2, xc_groupby_tab1.val
-                           ->  Materialize
-                                 Output: xc_groupby_tab2.val2, xc_groupby_tab2.val
-                                 ->  Remote Subquery Scan on all
-                                       Output: xc_groupby_tab2.val2, xc_groupby_tab2.val
-                                       Distribute results by H: val2
-                                       ->  Sort
-                                             Output: xc_groupby_tab2.val2, xc_groupby_tab2.val
-                                             Sort Key: xc_groupby_tab2.val2
-                                             ->  Seq Scan on public.xc_groupby_tab2
-                                                   Output: xc_groupby_tab2.val2, xc_groupby_tab2.val
+                                       Output: xc_groupby_tab2.val, xc_groupby_tab2.val2
+                                       Sort Key: xc_groupby_tab2.val2
+                                       ->  Seq Scan on public.xc_groupby_tab2
+                                             Output: xc_groupby_tab2.val, xc_groupby_tab2.val2
 (32 rows)
 
 -- aggregates over aggregates
author	Tomas Vondra	2017-07-28 22:00:09 +0000
committer	Tomas Vondra	2017-07-31 01:18:22 +0000
commit	f7dff6ae5b1a1eb16cc4b52308c741cc534faa4b (patch)
tree	3b44e0e6844c9a16d17d04fc87c6c4f8e3c858e5 /src
parent	69072746e165bf2b53b7f9851a4f7a2876cf5501 (diff)