pgsql-server/src/backend/optimizer/util pathnode.c

Lists: pgsql-committers
From: tgl(at)svr1(dot)postgresql(dot)org (Tom Lane)
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql-server/src/backend/optimizer/util pathnode.c
Date: 2004-03-02 16:42:20
Message-ID: 20040302164220.76C09D1B49B@svr1.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers

CVSROOT: /cvsroot
Module name: pgsql-server
Changes by: tgl(at)svr1(dot)postgresql(dot)org 04/03/02 12:42:20

Modified files:
src/backend/optimizer/util: pathnode.c

Log message:
Teach is_distinct_query to recognize that GROUP BY forces a subquery's
output to be distinct, if all the GROUP BY columns appear in the output.
Per suggestion from Dennis Haney.


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Tom Lane <tgl(at)svr1(dot)postgresql(dot)org>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql-server/src/backend/optimizer/util pathnode.c
Date: 2004-03-03 04:14:31
Message-ID: 40455BA7.8010003@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers

> Teach is_distinct_query to recognize that GROUP BY forces a subquery's
> output to be distinct, if all the GROUP BY columns appear in the output.
> Per suggestion from Dennis Haney.

Will this have should-be-in-release-notes side effects just as the
results of a DISTINCT over a GROUP BY no longer being sorted?

Chris


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql-server/src/backend/optimizer/util pathnode.c
Date: 2004-03-03 05:14:58
Message-ID: 8629.1078290898@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-committers

Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au> writes:
>> Teach is_distinct_query to recognize that GROUP BY forces a subquery's
>> output to be distinct, if all the GROUP BY columns appear in the output.
>> Per suggestion from Dennis Haney.

> Will this have should-be-in-release-notes side effects just as the
> results of a DISTINCT over a GROUP BY no longer being sorted?

No, there is no visible semantic change AFAICS. This just eliminates
planning silliness like sorting or hashing an already-sorted-or-hashed
subplan. Here is an example using the regression database: in 7.4

regression=# explain select * from tenk1 a where
regression-# unique1 in (select ten from tenk1 b group by ten);
QUERY PLAN
-------------------------------------------------------------------------------------
Nested Loop (cost=483.43..543.63 rows=10 width=244)
-> HashAggregate (cost=483.43..483.43 rows=10 width=4)
-> Subquery Scan "IN_subquery" (cost=483.30..483.40 rows=10 width=4)
-> HashAggregate (cost=483.30..483.30 rows=10 width=4)
-> Seq Scan on tenk1 b (cost=0.00..458.24 rows=10024 width=4)
-> Index Scan using tenk1_unique1 on tenk1 a (cost=0.00..6.01 rows=1 width=244)
Index Cond: (a.unique1 = "outer".ten)
(7 rows)

where CVS tip gives

QUERY PLAN
-------------------------------------------------------------------------------------
Nested Loop (cost=483.30..543.60 rows=10 width=244)
-> Subquery Scan "IN_subquery" (cost=483.30..483.40 rows=10 width=4)
-> HashAggregate (cost=483.30..483.30 rows=10 width=4)
-> Seq Scan on tenk1 b (cost=0.00..458.24 rows=10024 width=4)
-> Index Scan using tenk1_unique1 on tenk1 a (cost=0.00..6.01 rows=1 width=244)
Index Cond: (a.unique1 = "outer".ten)
(6 rows)

(In both cases, the HashAggregate nodes are being used to eliminate
duplicate rows.)

regards, tom lane