Previously, we skipped using search_indexed_tlist_for_sortgroupref()
if the tlist expression being sought in the child plan node was merely
a Var. This is purely an optimization, based on the theory that
search_indexed_tlist_for_var() is faster, and one copy of a Var should
be as good as another. However, the GROUPING SETS patch broke the
latter assumption: grouping columns containing the "same" Var can
sometimes have different outputs, as shown in the test case added here.
So do it the hard way whenever a ressortgroupref marking exists.
(If this seems like a bottleneck, we could imagine building a tlist index
data structure for ressortgroupref values, as we do for Vars. But I'll
let that idea go until there's some evidence it's worthwhile.)
Back-patch to 9.6. The problem also exists in 9.5 where GROUPING SETS
came in, but this patch is insufficient to resolve the problem in 9.5:
there is some obscure dependency on the upper-planner-pathification
work that happened in 9.6. Given that this is such a weird corner case,
and no end users have complained about it, it doesn't seem worth the work
to develop a fix for 9.5.
Patch by me, per a report from Heikki Linnakangas. (This does not fix
Heikki's original complaint, just the follow-on one.)
Discussion: https://postgr.es/m/
aefc657e-edb2-64d5-6df1-
a0828f6e9104@iki.fi
TargetEntry *tle = (TargetEntry *) lfirst(l);
Node *newexpr;
- /* If it's a non-Var sort/group item, first try to match by sortref */
- if (tle->ressortgroupref != 0 && !IsA(tle->expr, Var))
+ /* If it's a sort/group item, first try to match by sortref */
+ if (tle->ressortgroupref != 0)
{
newexpr = (Node *)
search_indexed_tlist_for_sortgroupref(tle->expr,
/*
* search_indexed_tlist_for_sortgroupref --- find a sort/group expression
- * (which is assumed not to be just a Var)
*
* If a match is found, return a Var constructed to reference the tlist item.
* If no match, return NULL.
2 | 2 | 2
(4 rows)
+-- check that distinct grouping columns are kept separate
+-- even if they are equal()
+explain (costs off)
+select g as alias1, g as alias2
+ from generate_series(1,3) g
+ group by alias1, rollup(alias2);
+ QUERY PLAN
+------------------------------------------------
+ GroupAggregate
+ Group Key: g, g
+ Group Key: g
+ -> Sort
+ Sort Key: g
+ -> Function Scan on generate_series g
+(6 rows)
+
+select g as alias1, g as alias2
+ from generate_series(1,3) g
+ group by alias1, rollup(alias2);
+ alias1 | alias2
+--------+--------
+ 1 | 1
+ 1 |
+ 2 | 2
+ 2 |
+ 3 | 3
+ 3 |
+(6 rows)
+
-- simple rescan tests
select a, b, sum(v.x)
from (values (1),(2)) v(x), gstest_data(v.x)
from gstest3
group by grouping sets ((a,b), (a,c));
+-- check that distinct grouping columns are kept separate
+-- even if they are equal()
+explain (costs off)
+select g as alias1, g as alias2
+ from generate_series(1,3) g
+ group by alias1, rollup(alias2);
+
+select g as alias1, g as alias2
+ from generate_series(1,3) g
+ group by alias1, rollup(alias2);
+
-- simple rescan tests
select a, b, sum(v.x)