Re: Distinct performance dropped by multiple times in v16

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Vitaliy Litovskiy <vitaliy(dot)litovskiy(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Distinct performance dropped by multiple times in v16
Date: 2024-06-11 07:10:45
Message-ID: cfd31e5e-5e7b-4f6d-b2d4-d8f7484a6ed0@postgrespro.ru
Lists: pgsql-performance

On 6/10/24 13:59, Vitaliy Litovskiy wrote:
> ) tbl2 on tbl1.token = tbl2.token
> Observations:
> 1. query runs for 4-5 seconds on v16 and less than a second on v15
> 2. in v16 it also goes down to less than a second if
>    2.1 distinct is removed
>    2.2 unnest is removed. it is not really needed for this particular data
>    but this query is autogenerated and unnest makes sense for other data
>    2.3 "order by token" is uncommented, this is my current way of fixing
>    the problem
> I would really appreciate some feedback if that is expected behaviour
> and if there are better solutions
The reason for this behaviour is simple: commit 3c6fc58 allowed the use of
incremental sort with DISTINCT clauses.
So, in PostgreSQL 16 we have two competing strategies:
1. HashJoin with a HashAggregate at the end.
2. NestLoop, which derives a sort order from the index scan, so the
planner can utilise IncrementalSort + Unique instead of a full sort.

Disabling incremental sort (see EXPLAIN 2), we get a good plan with
HashJoin at the top. There we can see that the HashJoin plan is
definitely cheaper according to total cost, but has a big startup cost.
The optimiser compares the cost of the IncrementalSort + Unique path
with that of the HashAggregate and, for some reason, estimates the
incremental sort as much cheaper.
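For anyone reproducing the comparison, the second plan can be obtained by
toggling the standard planner GUC for the session (a sketch; the SELECT is a
placeholder for the original poster's query):

```sql
-- Force the HashJoin + HashAggregate plan by disabling incremental sort:
SET enable_incremental_sort = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT DISTINCT ...;  -- the problematic query
RESET enable_incremental_sort;
```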

Here we already have a couple of questions:
1. Why does the optimiser overestimate row counts in such a simple
situation?
2. Why, in the case of a big number of tuples, is incremental sort
considered better than HashAggregate?

Before the next step, let us see how the optimiser decides in the case of
correct predictions. I usually use the AQO extension [1] for that.
Executing the query twice with AQO in 'learn' mode, we get a correct
planned row number in each node of the plan and, as you can see in
EXPLAIN 3, the optimiser chooses the good strategy. So, given correct
predictions, the optimiser ends up with the optimal plan.
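The AQO experiment looks roughly like this (a sketch; it assumes AQO [1] is
built and listed in shared_preload_libraries, per the extension's README):

```sql
-- Enable AQO and put it into learning mode for this session:
CREATE EXTENSION aqo;
SET aqo.mode = 'learn';
-- Run the problematic query twice: the first execution teaches AQO the
-- real cardinalities; the second is planned with corrected estimates.
```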

The origins of the overestimation lie in the internals of unnest; they
are not obvious so far and may be discovered later.
The reason why the optimiser likes NestLoop on big data looks enigmatic.
Attempting to increase the cost of the unnest routine from 1 to 1E5, we
don't see any change in the decision. In my opinion, the key factor here
may be the unusual width of the JOIN result:
Hash Join (cost=0.24..0.47 rows=10 width=0)
In my opinion, the cost model can assign too low a cost to this join,
and that is the reason why the upper NestLoop looks better than the
HashJoin.
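The cost experiment mentioned above can be reproduced roughly as follows
(a sketch; standard ALTER FUNCTION syntax, with the array variant of
unnest as an assumed target):

```sql
-- Raise the planner's per-call cost estimate for unnest from the
-- default to 1E5, then re-plan the query:
ALTER FUNCTION unnest(anyarray) COST 100000;
-- In the case discussed here, the chosen join strategy did not change.
```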

I don't have any general recommendations to resolve this issue, but this
case should be investigated by the core developers.

[1] https://github.com/postgrespro/aqo

--
regards,
Andrei Lepikhov
Postgres Professional

Attachment Content-Type Size
NLvsHJ.txt text/plain 6.2 KB
