diff options
| author | Tom Lane | 2025-12-29 18:01:27 +0000 |
|---|---|---|
| committer | Tom Lane | 2025-12-29 18:01:27 +0000 |
| commit | bd3e3e9e56830e1b1448b559743b0f0a90a6e38d (patch) | |
| tree | 658d674e6353934db8fc816e1496cfaf91f08dc0 /contrib/sepgsql | |
| parent | cb77bc04422fbbd1be48dde9794bfc0d0e5d7ad5 (diff) | |
estimate_hash_bucket_stats is defined to return zero to *mcv_freq if
it cannot obtain a value for the frequency of the most common value.
Its sole caller final_cost_hashjoin ignored this provision and would
blindly believe the zero value, resulting in computing zero for the
largest bucket size. In consequence, the safety check that intended
to prevent the largest bucket from exceeding get_hash_memory_limit()
was ineffective, allowing very silly plans to be chosen if statistics
were missing.
After fixing final_cost_hashjoin to disregard zero results for
mcv_freq, a second problem appeared: some cases that should use hash
joins failed to. This is because estimate_hash_bucket_stats was
unaware of the fact that ANALYZE won't store MCV statistics if it
doesn't find any multiply-occurring values. Thus the lack of an MCV
stats entry doesn't necessarily mean that we know nothing; we may
well know that the column is unique. The former coding returned zero
for *mcv_freq in this case, which was pretty close to correct, but now
final_cost_hashjoin doesn't believe it and disables the hash join.
So check to see if there is a HISTOGRAM stats entry; if so, ANALYZE
has in fact run for this column and must have found it to be unique.
In that case report the MCV frequency as 1 / rows, instead of claiming
ignorance.
Reporting a more accurate *mcv_freq in this case can also affect the
bucket-size skew adjustment further down in estimate_hash_bucket_stats,
causing hash-join cost estimates to change slightly. This affects
some plan choices in the core regression tests. The first diff in
join.out corresponds to a case where we have no stats and should not
risk a hash join, but the remaining changes are caused by producing
a better bucket-size estimate for unique join columns. Those are all
harmless changes so far as I can tell.
The existing behavior was introduced in commit 4867d7f62 in v11.
It appears from the commit log that disabling the bucket-size safety
check in the absence of statistics was intentional; but we've now seen
a case where the ensuing behavior is bad enough to make that seem like
a poor decision. In any case the lack of other problems with that
safety check after several years helps to justify enforcing it more
strictly. However, we won't risk back-patching this, in case any
applications are depending on the existing behavior.
Bug: #19363
Reported-by: Jinhui Lai <jinhui.lai@qq.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/2380165.1766871097@sss.pgh.pa.us
Discussion: https://postgr.es/m/19363-8dd32fc7600a1153@postgresql.org
Diffstat (limited to 'contrib/sepgsql')
0 files changed, 0 insertions, 0 deletions
