| author | Tom Lane | 2020-01-09 16:56:59 +0000 |
|---|---|---|
| committer | Tom Lane | 2020-01-09 16:56:59 +0000 |
| commit | 9ce77d75c5ab094637cc4a446296dc3be6e3c221 (patch) | |
| tree | b23a084afddfe9d3326bd1d9d94a98304ad098f9 /src/include/nodes | |
| parent | ed10f32e37e9a16814c25e400d7826745ae3c797 (diff) | |
Reconsider the representation of join alias Vars.
The core idea of this patch is to make the parser generate join alias
Vars (that is, ones with varno pointing to a JOIN RTE) only when the
alias Var is actually different from any raw join input, that is, when a type
coercion and/or COALESCE is necessary to generate the join output value.
Otherwise just generate varno/varattno pointing to the relevant join
input column.
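The rule in the paragraph above can be sketched as a small decision function. This is a simplified model, not the parser's actual code: the names `SketchJoinType`, `SketchJoinInput`, and `sketch_merged_col_needs_alias` are hypothetical, types are reduced to plain integer ids, and the INNER-join preference for a non-coerced input is an assumption about how the parser picks between the two sides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real parser structures. */
typedef enum
{
    SK_JOIN_INNER,
    SK_JOIN_LEFT,
    SK_JOIN_RIGHT,
    SK_JOIN_FULL
} SketchJoinType;

typedef struct
{
    int varno;      /* range-table index of the input relation */
    int varattno;   /* column number within that relation */
    int vartype;    /* toy type id; real code compares type OIDs and typmods */
} SketchJoinInput;

/*
 * Decide how to reference one column merged by JOIN USING.
 *
 * Returns true when the join output differs from both raw inputs, so a
 * join alias Var (varno pointing at the JOIN RTE) is still required.
 * Returns false when an input column can be referenced directly; in that
 * case *direct is set to the column to use.
 */
static bool
sketch_merged_col_needs_alias(SketchJoinType jointype,
                              SketchJoinInput left,
                              SketchJoinInput right,
                              int outtype,
                              SketchJoinInput *direct)
{
    assert(direct != NULL);

    switch (jointype)
    {
        case SK_JOIN_FULL:
            /* Output is COALESCE(left, right): never equal to a raw input. */
            return true;
        case SK_JOIN_RIGHT:
            if (right.vartype != outtype)
                return true;    /* coercion needed */
            *direct = right;
            return false;
        case SK_JOIN_LEFT:
            if (left.vartype != outtype)
                return true;    /* coercion needed */
            *direct = left;
            return false;
        case SK_JOIN_INNER:
            /* Either side is valid; assume we prefer one needing no coercion. */
            if (left.vartype == outtype)
                *direct = left;
            else if (right.vartype == outtype)
                *direct = right;
            else
                return true;
            return false;
    }
    return true;                /* unknown join type: be conservative */
}
```

Columns that are not merged by USING never reach this decision; under the new scheme they always get varno/varattno of the underlying input column.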
In effect, this means that the planner's flatten_join_alias_vars()
transformation is already done in the parser, for all cases except
(a) columns that are merged by JOIN USING and are transformed in the
process, and (b) whole-row join Vars. In principle that would allow
us to skip doing flatten_join_alias_vars() in many more queries than
we do now, but we don't have quite enough infrastructure to know that
we can do so --- in particular there's no cheap way to know whether
there are any whole-row join Vars. I'm not sure if it's worth the
trouble to add a Query-level flag for that, and in any case it seems
like fit material for a separate patch. But even without skipping the
work entirely, this should make flatten_join_alias_vars() faster,
particularly where there are nested joins that it previously had to
flatten recursively.
An essential part of this change is to replace Var nodes'
varnoold/varoattno fields with varnosyn/varattnosyn, which have
considerably more tightly-defined meanings than the old fields: when
they differ from varno/varattno, they identify the Var's position in
an aliased JOIN RTE, and the join alias is what ruleutils.c should
print for the Var. This is necessary because the varno change
destroyed ruleutils.c's ability to find the JOIN RTE from the Var's
varno.
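As a rough illustration of the semantic versus syntactic split described above, the following sketch picks which (relation, column) pair a deparser would print for a Var. The struct is a cut-down stand-in for the real `Var` node, and `SketchColumnRef` and `sketch_column_to_print` are hypothetical names; the fallback to varno/varattno when varnosyn is zero is an assumption made for this example, since ruleutils.c applies further checks that are not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the Var node; only the four fields of interest. */
typedef struct
{
    unsigned int varno;         /* semantic referent: range-table index */
    int          varattno;      /* semantic referent: column number */
    unsigned int varnosyn;      /* syntactic relation index (0 if unknown) */
    int          varattnosyn;   /* syntactic attribute number */
} SketchVar;

typedef struct
{
    unsigned int rtindex;       /* range-table entry whose alias is printed */
    int          attno;         /* column position within that entry */
} SketchColumnRef;

/*
 * Choose the reference to print for a Var.
 *
 * When varnosyn/varattnosyn differ from varno/varattno they name the Var's
 * position in an aliased JOIN RTE, and that is what should be printed.
 * A planner-generated Var may carry varnosyn = 0; this sketch then falls
 * back to the semantic referent.
 */
static SketchColumnRef
sketch_column_to_print(const SketchVar *var)
{
    SketchColumnRef ref;

    assert(var != NULL);

    if (var->varnosyn != 0)
    {
        ref.rtindex = var->varnosyn;
        ref.attno = var->varattnosyn;
    }
    else
    {
        ref.rtindex = var->varno;
        ref.attno = var->varattno;
    }
    return ref;
}
```

For a Var with varno = 1, varattno = 2, varnosyn = 3, varattnosyn = 1, the sketch selects column 1 of range-table entry 3, that is, the aliased join rather than the base relation.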
Another way in which this change broke ruleutils.c is that it's no
longer feasible to determine, from a JOIN RTE's joinaliasvars list,
which join columns correspond to which columns of the join's immediate
input relations. (If those are sub-joins, the joinaliasvars entries
may point to columns of their base relations, not the sub-joins.)
But that was a horrid mess requiring a lot of fragile assumptions
already, so let's just bite the bullet and add some more JOIN RTE
fields to make it more straightforward to figure that out. I added
two integer-List fields containing the relevant column numbers from
the left and right input rels, plus a count of how many merged columns
there are.
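To make the layout of the new fields concrete, here is a minimal sketch that expands them into per-output-column arrays. It assumes the ordering documented in the parsenodes.h comment in the diff for this commit (merged columns first, then the remaining left columns, then the remaining right columns) and uses plain C arrays in place of PostgreSQL integer Lists. The function name and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * For each join output column, record which left-input and right-input
 * column it comes from (0 means "not from that side").
 *
 * joinleftcols/joinrightcols: input column numbers, with the first
 * joinmergedcols entries of each array describing the merged columns.
 * leftattnos/rightattnos: caller-supplied result arrays of size maxcols.
 *
 * Returns the number of join output columns.
 */
static int
sketch_identify_join_columns(int joinmergedcols,
                             const int *joinleftcols, int nleft,
                             const int *joinrightcols, int nright,
                             int *leftattnos, int *rightattnos,
                             int maxcols)
{
    int numjoincols = nleft + nright - joinmergedcols;
    int jcolno = 0;
    int i;

    assert(joinmergedcols >= 0);
    assert(joinmergedcols <= nleft && joinmergedcols <= nright);
    assert(numjoincols <= maxcols);

    /* Start with "no source on this side" everywhere. */
    for (i = 0; i < numjoincols; i++)
        leftattnos[i] = rightattnos[i] = 0;

    /* Left columns occupy the leading output positions in list order. */
    for (i = 0; i < nleft; i++)
        leftattnos[jcolno++] = joinleftcols[i];

    /*
     * Merged right columns share the leading output positions with their
     * left partners; the rest follow all of the left columns.
     */
    for (i = 0; i < nright; i++)
    {
        if (i < joinmergedcols)
            rightattnos[i] = joinrightcols[i];
        else
            rightattnos[jcolno++] = joinrightcols[i];
    }

    return jcolno;
}
```

For example, with `t1(a, b, c) JOIN t2(x, a, y) USING (a)` the inputs would be joinmergedcols = 1, joinleftcols = {1, 2, 3} and joinrightcols = {2, 1, 3}, giving five output columns in the order a, b, c, x, y.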
This patch depends on the ParseNamespaceColumn infrastructure that
I added in commit 5815696bc. The biggest bit of code change is
restructuring transformFromClauseItem's handling of JOINs so that
the ParseNamespaceColumn data is propagated upward correctly.
Other than that and the ruleutils fixes, everything pretty much
just works, though some processing is now inessential. I grabbed
two pieces of low-hanging fruit in that line:
1. In find_expr_references, we don't need to recurse into join alias
Vars anymore. There aren't any except for references to merged USING
columns, which are more properly handled when we scan the join's RTE.
This change actually fixes an edge-case issue: we will now record a
dependency on any type-coercion function present in a USING column's
joinaliasvar, even if that join column has no references in the query
text. The odds of the missing dependency causing a problem seem quite
small: you'd have to posit somebody dropping an implicit cast between
two data types, without removing the types themselves, and then having
a stored rule containing a whole-row Var for a join whose USING merge
depends on that cast. So I don't feel a great need to change this in
the back branches. But in theory this way is more correct.
2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse
into join alias Vars either, because the cases they care about don't
apply to alias Vars for USING columns that are semantically distinct
from the underlying columns. This removes the only case in which
markVarForSelectPriv could be called with NULL for the RTE, so adjust
the comments to describe that hack as being strictly internal to
markRTEForSelectPriv.
catversion bump required due to changes in stored rules.
Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
Diffstat (limited to 'src/include/nodes')
| -rw-r--r-- | src/include/nodes/parsenodes.h | 27 |
| -rw-r--r-- | src/include/nodes/primnodes.h | 42 |
2 files changed, 52 insertions, 17 deletions
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f67bd9fad59..cdfa0568f7d 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1020,14 +1020,35 @@ typedef struct RangeTblEntry
      * be a Var of one of the join's input relations, or such a Var with an
      * implicit coercion to the join's output column type, or a COALESCE
      * expression containing the two input column Vars (possibly coerced).
-     * Within a Query loaded from a stored rule, it is also possible for
+     * Elements beyond the first joinmergedcols entries are always just Vars,
+     * and are never referenced from elsewhere in the query (that is, join
+     * alias Vars are generated only for merged columns). We keep these
+     * entries only because they're needed in expandRTE() and similar code.
+     *
+     * Within a Query loaded from a stored rule, it is possible for non-merged
      * joinaliasvars items to be null pointers, which are placeholders for
      * (necessarily unreferenced) columns dropped since the rule was made.
      * Also, once planning begins, joinaliasvars items can be almost anything,
      * as a result of subquery-flattening substitutions.
+     *
+     * joinleftcols is an integer list of physical column numbers of the left
+     * join input rel that are included in the join; likewise joinrighttcols
+     * for the right join input rel. (Which rels those are can be determined
+     * from the associated JoinExpr.) If the join is USING/NATURAL, then the
+     * first joinmergedcols entries in each list identify the merged columns.
+     * The merged columns come first in the join output, then remaining
+     * columns of the left input, then remaining columns of the right.
+     *
+     * Note that input columns could have been dropped after creation of a
+     * stored rule, if they are not referenced in the query (in particular,
+     * merged columns could not be dropped); this is not accounted for in
+     * joinleftcols/joinrighttcols.
      */
     JoinType jointype; /* type of join */
+    int joinmergedcols; /* number of merged (JOIN USING) columns */
     List *joinaliasvars; /* list of alias-var expansions */
+    List *joinleftcols; /* left-side input column numbers */
+    List *joinrightcols; /* right-side input column numbers */
 
     /*
      * Fields valid for a function RTE (else NIL/zero):
@@ -3313,8 +3334,8 @@ typedef struct ConstraintsSetStmt
  */
 
 /* Reindex options */
-#define REINDEXOPT_VERBOSE (1 << 0) /* print progress info */
-#define REINDEXOPT_REPORT_PROGRESS (1 << 1) /* report pgstat progress */
+#define REINDEXOPT_VERBOSE (1 << 0) /* print progress info */
+#define REINDEXOPT_REPORT_PROGRESS (1 << 1) /* report pgstat progress */
 
 typedef enum ReindexObjectType
 {
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index eb2cacb3a72..d73be2ad46c 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -141,18 +141,32 @@ typedef struct Expr
 /*
  * Var - expression node representing a variable (ie, a table column)
  *
- * Note: during parsing/planning, varnoold/varoattno are always just copies
- * of varno/varattno. At the tail end of planning, Var nodes appearing in
- * upper-level plan nodes are reassigned to point to the outputs of their
- * subplans; for example, in a join node varno becomes INNER_VAR or OUTER_VAR
- * and varattno becomes the index of the proper element of that subplan's
- * target list. Similarly, INDEX_VAR is used to identify Vars that reference
- * an index column rather than a heap column. (In ForeignScan and CustomScan
- * plan nodes, INDEX_VAR is abused to signify references to columns of a
- * custom scan tuple type.) In all these cases, varnoold/varoattno hold the
- * original values. The code doesn't really need varnoold/varoattno, but they
- * are very useful for debugging and interpreting completed plans, so we keep
- * them around.
+ * In the parser and planner, varno and varattno identify the semantic
+ * referent, which is a base-relation column unless the reference is to a join
+ * USING column that isn't semantically equivalent to either join input column
+ * (because it is a FULL join or the input column requires a type coercion).
+ * In those cases varno and varattno refer to the JOIN RTE. (Early in the
+ * planner, we replace such join references by the implied expression; but up
+ * till then we want join reference Vars to keep their original identity for
+ * query-printing purposes.)
+ *
+ * At the end of planning, Var nodes appearing in upper-level plan nodes are
+ * reassigned to point to the outputs of their subplans; for example, in a
+ * join node varno becomes INNER_VAR or OUTER_VAR and varattno becomes the
+ * index of the proper element of that subplan's target list. Similarly,
+ * INDEX_VAR is used to identify Vars that reference an index column rather
+ * than a heap column. (In ForeignScan and CustomScan plan nodes, INDEX_VAR
+ * is abused to signify references to columns of a custom scan tuple type.)
+ *
+ * In the parser, varnosyn and varattnosyn are either identical to
+ * varno/varattno, or they specify the column's position in an aliased JOIN
+ * RTE that hides the semantic referent RTE's refname. This is a syntactic
+ * identifier as opposed to the semantic identifier; it tells ruleutils.c
+ * how to print the Var properly. varnosyn/varattnosyn retain their values
+ * throughout planning and execution, so they are particularly helpful to
+ * identify Vars when debugging. Note, however, that a Var that is generated
+ * in the planner and doesn't correspond to any simple relation column may
+ * have varnosyn = varattnosyn = 0.
  */
 #define INNER_VAR 65000 /* reference to inner subplan */
 #define OUTER_VAR 65001 /* reference to outer subplan */
@@ -177,8 +191,8 @@ typedef struct Var
     Index varlevelsup; /* for subquery variables referencing outer
                         * relations; 0 in a normal var, >0 means N
                         * levels up */
-    Index varnoold; /* original value of varno, for debugging */
-    AttrNumber varoattno; /* original value of varattno */
+    Index varnosyn; /* syntactic relation index (0 if unknown) */
+    AttrNumber varattnosyn; /* syntactic attribute number */
     int location; /* token location, or -1 if unknown */
 } Var;
