diff options
author | Tom Lane | 2015-06-03 15:58:47 +0000 |
---|---|---|
committer | Tom Lane | 2015-06-03 15:59:10 +0000 |
commit | 3f59be836c555fa679bbe0ec76de50a8b5cb23e0 (patch) | |
tree | 1d12706f003e66bb5bed5ce03f5119e1b3b169e8 /src/include | |
parent | 37013621f3b0e296aa71b812ca9d46871ead95e2 (diff) |
Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans.
When the inner side of a nestloop SEMI or ANTI join is an indexscan that
uses all the join clauses as indexquals, it can be presumed that both
matched and unmatched outer rows will be processed very quickly: for
matched rows, we'll stop after fetching one row from the indexscan, while
for unmatched rows we'll have an indexscan that finds no matching index
entries, which should also be quick. The planner already knew about this,
but it was nonetheless charging for at least one full run of the inner
indexscan, as a consequence of concerns about the behavior of materialized
inner scans --- but those concerns don't apply in the fast case. If the
inner side has low cardinality (many matching rows) this could make an
indexscan plan look far more expensive than it actually is. To fix,
rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we
don't add the inner scan cost until we've inspected the indexquals, and
then we can add either the full-run cost or just the first tuple's cost as
appropriate.
Experimentation with this fix uncovered another problem: add_path and
friends were coded to disregard cheap startup cost when considering
parameterized paths. That's usually okay (and desirable, because it thins
the path herd faster); but in this fast case for SEMI/ANTI joins, it could
result in throwing away the desired plain indexscan path in favor of a
bitmap scan path before we ever get to the join costing logic. In the
many-matching-rows cases of interest here, a bitmap scan will do a lot more
work than required, so this is a problem. To fix, add a per-relation flag
consider_param_startup that works like the existing consider_startup flag,
but applies to parameterized paths, and set it for relations that are the
inside of a SEMI or ANTI join.
To make this patch reasonably safe to back-patch, care has been taken to
avoid changing the planner's behavior except in the very narrow case of
SEMI/ANTI joins with inner indexscans. There are places in
compare_path_costs_fuzzily and add_path_precheck that are not terribly
consistent with the new approach, but changing them will affect planner
decisions at the margins in other cases, so we'll leave that for a
HEAD-only fix.
Back-patch to 9.3; before that, the consider_startup flag didn't exist,
meaning that the second aspect of the patch would be too invasive.
Per a complaint from Peter Holzer and analysis by Tomas Vondra.
Diffstat (limited to 'src/include')
-rw-r--r-- | src/include/nodes/relation.h | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index 279051ed18..33b0874570 100644 --- a/src/include/nodes/relation.h +++ b/src/include/nodes/relation.h @@ -321,8 +321,9 @@ typedef struct PlannerInfo * clauses have been applied (ie, output rows of a plan for it) * width - avg. number of bytes per tuple in the relation after the * appropriate projections have been done (ie, output width) - * consider_startup - true if there is any value in keeping paths for + * consider_startup - true if there is any value in keeping plain paths for * this rel on the basis of having cheap startup cost + * consider_param_startup - the same for parameterized paths * reltargetlist - List of Var and PlaceHolderVar nodes for the values * we need to output from this relation. * List is in no particular order, but all rels of an @@ -437,6 +438,7 @@ typedef struct RelOptInfo /* per-relation planner control flags */ bool consider_startup; /* keep cheap-startup-cost paths? */ + bool consider_param_startup; /* ditto, for parameterized paths? */ /* materialization information */ List *reltargetlist; /* Vars to be output by scan of relation */ @@ -1719,12 +1721,10 @@ typedef struct JoinCostWorkspace Cost run_cost; /* non-startup cost components */ /* private for cost_nestloop code */ + Cost inner_run_cost; /* also used by cost_mergejoin code */ Cost inner_rescan_run_cost; - double outer_matched_rows; - Selectivity inner_scan_frac; /* private for cost_mergejoin code */ - Cost inner_run_cost; double outer_rows; double inner_rows; double outer_skip_rows; |