Lists: | pgsql-performance |
---|
From: | pgdba <postgresql(at)inbox(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Incorrect row estimates in plan? |
Date: | 2007-09-26 14:22:45 |
Message-ID: | 12902068.post@talk.nabble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Hi, I am having some trouble understanding a plan and was wondering if anyone
could guide me. The query in question here seems to be showing some
incorrect row counts. I have vacuumed and analyzed the table, but the
estimate versus the actual total seems to be way out (est 2870 vs actual
85k). Perhaps I am reading the plan incorrectly though. (hopefully the plan
below is readable)
db=# select version();
version
-------------------------------------------------------------------------------------------------------------
PostgreSQL 8.2.4 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
20061115
(prerelease) (Debian 4.1.1-21)
db=# show shared_buffers ;
shared_buffers
----------------
300MB
#4GB ram, 2 SATA striped, XFS
db=# show default_statistics_target;
default_statistics_target
---------------------------
100
# stats have been raised to 1000 on both the destip and srcip columns
# create index slog_gri_idx on slog (gid,rule,(case when rule in (8,9) then
destip else srcip end)) WHERE (rule in (1, 2, 8, 9, 10));
# vacuum analyze verbose slog;
db=# show random_page_cost ;
random_page_cost
------------------
3
db=# select count(*) from slog
count
---------
1,019,121
db=#select count(*) as total
from slog
where gid=10000::INTEGER
and rule in (1,2,8,9,10)
and (case when rule in (8,9) then destip else srcip
end)='192.168.10.23'::INET;
total
-------
83,538
# problematic query
explain analyze
select coalesce(uri,host((case when rule in (8,9) then srcip else destip
end))) as
destip,
case when rule in (8,9) then 'ext' else 'int' end as tp,
count(*) as total,
coalesce(sum(destbytes),0)+coalesce(sum(srcbytes),0) as bytes
from slog
where gid=10000::INTEGER
and rule in (1,2,8,9,10)
and (case when rule in (8,9) then destip else srcip
end)='192.168.10.23'::INET
group by destip,tp
order by bytes desc,total desc,destip limit 20
Limit (cost=6490.18..6490.23 rows=20 width=61) (actual
time=2036.968..2037.220 rows=20 loops=1)
-> Sort (cost=6490.18..6490.90 rows=288 width=61) (actual
time=2036.960..2037.027 rows=20 loops=1)
Sort Key: (COALESCE(sum(destbytes), 0::numeric) +
COALESCE(sum(srcbytes), 0::numeric)), count(*), COALESCE(uri, host(CASE WHEN
(rule = ANY ('{8,9}'::integer[])) THEN srcip ELSE destip END))
-> HashAggregate (cost=6470.50..6478.42 rows=288 width=61) (actual
time=2008.478..2022.125 rows=2057 loops=1)
-> Bitmap Heap Scan on slog (cost=82.98..6434.62 rows=2870
width=61) (actual time=50.235..1237.948 rows=83538 loops=1)
Recheck Cond: ((gid = 10000) AND (rule = ANY
('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY
('{8,9}'::integer[])) THEN destip ELSE srcip END = '192.168.10.23'::inet))
-> Bitmap Index Scan on slog_gri_idx (cost=0.00..82.26
rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1)
Index Cond: ((gid = 10000) AND (rule = ANY
('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY
('{8,9}'::integer[])) THEN destip ELSE srcip END = '192.168.10.23'::inet))
Total runtime: 2037.585 ms
Does anyone have any suggestions?
Thanks!
--
View this message in context: http://www.nabble.com/Incorrect-row-estimates-in-plan--tf4522692.html#a12902068
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgdba <postgresql(at)inbox(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Incorrect row estimates in plan? |
Date: | 2007-09-26 14:45:14 |
Message-ID: | 20944.1190817914@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
pgdba <postgresql(at)inbox(dot)com> writes:
> -> Bitmap Heap Scan on slog (cost=82.98..6434.62 rows=2870
> width=61) (actual time=50.235..1237.948 rows=83538 loops=1)
> Recheck Cond: ((gid = 10000) AND (rule = ANY
> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY
> ('{8,9}'::integer[])) THEN destip ELSE srcip END = '192.168.10.23'::inet))
> -> Bitmap Index Scan on slog_gri_idx (cost=0.00..82.26
> rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1)
> Index Cond: ((gid = 10000) AND (rule = ANY
> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY
> ('{8,9}'::integer[])) THEN destip ELSE srcip END = '192.168.10.23'::inet))
[ blink... ] Pray tell, what is the definition of this index?
With such a bizarre scan condition, it's unlikely you'll get any really
accurate row estimate.
regards, tom lane
From: | pgdba <postgresql(at)inbox(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Incorrect row estimates in plan? |
Date: | 2007-09-26 15:24:01 |
Message-ID: | 12903194.post@talk.nabble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Hi Tom,
Tom Lane-2 wrote:
>
> pgdba <postgresql(at)inbox(dot)com> writes:
>> -> Bitmap Heap Scan on slog (cost=82.98..6434.62 rows=2870
>> width=61) (actual time=50.235..1237.948 rows=83538 loops=1)
>> Recheck Cond: ((gid = 10000) AND (rule = ANY
>> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY
>> ('{8,9}'::integer[])) THEN destip ELSE srcip END =
>> '192.168.10.23'::inet))
>> -> Bitmap Index Scan on slog_gri_idx
>> (cost=0.00..82.26
>> rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1)
>> Index Cond: ((gid = 10000) AND (rule = ANY
>> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY
>> ('{8,9}'::integer[])) THEN destip ELSE srcip END =
>> '192.168.10.23'::inet))
>
> [ blink... ] Pray tell, what is the definition of this index?
>
> With such a bizarre scan condition, it's unlikely you'll get any really
> accurate row estimate.
>
> regards, tom lane
>
>
Original index: "create index slog_gri_idx on slog (gid,rule,(case when rule
in (8,9) then
destip else srcip end)) WHERE (rule in (1, 2, 8, 9, 10))"
The purpose of that index is to match a specific query (one that gets run
frequently and needs to be fast). It is using the destip when rule 8/9, and
srcip when other, but only for a subset of the rules (1,2,8,9,10). There are
about 18 rules in total, but I'm only interested in those 5. I have tried a
couple of indices like:
create index test_destip_idx on slog (gid,destip) where rule in (8,9);
create index test_srcip_idx on slog (gid,srcip) where rule in (1,2,10);
But the original slog_gri_idx index was used instead. Is there a way that I
can rewrite that index then? Not that I'm a fan of a CASE statement in a
functional index, but I'm at a loss as to how else I can create this. Or
what else I can look into to make this faster?
--
View this message in context: http://www.nabble.com/Incorrect-row-estimates-in-plan--tf4522692.html#a12903194
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgdba <postgresql(at)inbox(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Incorrect row estimates in plan? |
Date: | 2007-09-26 16:38:09 |
Message-ID: | 22791.1190824689@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
pgdba <postgresql(at)inbox(dot)com> writes:
> Tom Lane-2 wrote:
> -> Bitmap Index Scan on slog_gri_idx
> (cost=0.00..82.26
> rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1)
> Index Cond: ((gid = 10000) AND (rule = ANY
> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY
> ('{8,9}'::integer[])) THEN destip ELSE srcip END =
> '192.168.10.23'::inet))
>>
>> [ blink... ] Pray tell, what is the definition of this index?
> Original index: "create index slog_gri_idx on slog (gid,rule,(case when rule
> in (8,9) then
> destip else srcip end)) WHERE (rule in (1, 2, 8, 9, 10))"
> The purpose of that index is to match a specific query (one that gets run
> frequently and needs to be fast).
Ah. I didn't think you would've put such a specific thing into an index
definition, but if you're stuck supporting such badly written queries,
maybe there's no other way.
I rather doubt that you're going to be able to make this query any
faster than it is, short of buying enough RAM to keep the whole table
RAM-resident. Pulling 80000 random rows in 1200 msec doesn't sound
all that slow to me.
The ultimate solution might be to rethink your table designs ...
regards, tom lane
From: | pgdba <postgresql(at)inbox(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Incorrect row estimates in plan? |
Date: | 2007-09-26 16:56:33 |
Message-ID: | 12905186.post@talk.nabble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Tom Lane-2 wrote:
>
> pgdba <postgresql(at)inbox(dot)com> writes:
>> Tom Lane-2 wrote:
>> -> Bitmap Index Scan on slog_gri_idx
>> (cost=0.00..82.26
>> rows=2870 width=0) (actual time=41.306..41.306 rows=83538 loops=1)
>> Index Cond: ((gid = 10000) AND (rule = ANY
>> ('{1,2,8,9,10}'::integer[])) AND (CASE WHEN (rule = ANY
>> ('{8,9}'::integer[])) THEN destip ELSE srcip END =
>> '192.168.10.23'::inet))
>>>
>>> [ blink... ] Pray tell, what is the definition of this index?
>
>> Original index: "create index slog_gri_idx on slog (gid,rule,(case when
>> rule
>> in (8,9) then
>> destip else srcip end)) WHERE (rule in (1, 2, 8, 9, 10))"
>
>> The purpose of that index is to match a specific query (one that gets run
>> frequently and needs to be fast).
>
> Ah. I didn't think you would've put such a specific thing into an index
> definition, but if you're stuck supporting such badly written queries,
> maybe there's no other way.
>
> I rather doubt that you're going to be able to make this query any
> faster than it is, short of buying enough RAM to keep the whole table
> RAM-resident. Pulling 80000 random rows in 1200 msec doesn't sound
> all that slow to me.
>
> The ultimate solution might be to rethink your table designs ...
>
> regards, tom lane
>
Badly written the query may be, but I do have the opportunity to change it.
Part of the problem is that I cannot come up with a better way of writing
it.
What about the discrepancy between the estimated row count and the actual
row count for that index access?
"Bitmap Index Scan on slog_gri_idx (cost=0.00..82.26 rows=2870 width=0)
(actual time=41.306..41.306 rows=83538 loops=1)"
Is there anything I can do to influence that (not that it is likely to
change the plan, but...). I vacuumed and analyzed after I created the index,
so the stats should be at least be close (with stats target set to 1000
there).
--
View this message in context: http://www.nabble.com/Incorrect-row-estimates-in-plan--tf4522692.html#a12905186
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.