From: Tom Lane Date: Wed, 14 Jun 2017 03:46:39 +0000 (-0400) Subject: Disallow set-returning functions inside CASE or COALESCE. X-Git-Tag: REL_10_BETA2~156 X-Git-Url: http://git.postgresql.org/gitweb/?a=commitdiff_plain;h=0436f6bde8848b7135f19dd7f8548b8c2ae89a34;p=postgresql.git Disallow set-returning functions inside CASE or COALESCE. When we reimplemented SRFs in commit 69f4b9c85, our initial choice was to allow the behavior to vary from historical practice in cases where a SRF call appeared within a conditional-execution construct (currently, only CASE or COALESCE). But that was controversial to begin with, and subsequent discussion has resulted in a consensus that it's better to throw an error instead of executing the query differently from before, so long as we can provide a reasonably clear error message and a way to rewrite the query. Hence, add a parser mechanism to allow detection of such cases during parse analysis. The mechanism just requires storing, in the ParseState, a pointer to the set-returning FuncExpr or OpExpr most recently emitted by parse analysis. Then the parsing functions for CASE and COALESCE can detect the presence of a SRF in their arguments by noting whether this pointer changes while analyzing their arguments. Furthermore, if it does, it provides a suitable error cursor location for the complaint. (This means that if there's more than one SRF in the arguments, the error will point at the last one to be analyzed not the first. While connoisseurs of parsing behavior might find that odd, it's unlikely the average user would ever notice.) While at it, we can also provide more specific error messages than before about some pre-existing restrictions, such as no-SRFs-within-aggregates. Also, reject at parse time cases where a NULLIF or IS DISTINCT FROM construct would need to return a set. We've never supported that, but the restriction is depended on in more subtle ways now, so it seems wise to detect it at the start. Also, provide some documentation about how to rewrite a SRF-within-CASE query using a custom wrapper SRF. It turns out that the information_schema.user_mapping_options view contained an instance of exactly the behavior we're now forbidding; but rewriting it makes it more clear and safer too. initdb forced because of user_mapping_options change. Patch by me, with error message suggestions from Alvaro Herrera and Andres Freund, pursuant to a complaint from Regina Obe. Discussion: https://postgr.es/m/000001d2d5de$d8d66170$8a832450$@pcorp.us --- diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index 05f4312bf3e..1a6c3b9bc2f 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -997,6 +997,29 @@ SELECT name, listchildren(name) FROM nodes; the LATERAL syntax. + + PostgreSQL's behavior for a set-returning function in a + query's select list is almost exactly the same as if the set-returning + function had been written in a LATERAL FROM-clause item + instead. For example, + +SELECT x, generate_series(1,5) AS g FROM tab; + + is almost equivalent to + +SELECT x, g FROM tab, LATERAL generate_series(1,5) AS g; + + It would be exactly the same, except that in this specific example, + the planner could choose to put g on the outside of the + nestloop join, since g has no actual lateral dependency + on tab. That would result in a different output row + order. Set-returning functions in the select list are always evaluated + as though they are on the inside of a nestloop join with the rest of + the FROM clause, so that the function(s) are run to + completion before the next row from the FROM clause is + considered. + + If there is more than one set-returning function in the query's select list, the behavior is similar to what you get from putting the functions @@ -1028,32 +1051,19 @@ SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM tab; - This behavior also means that set-returning functions will be evaluated - even when it might appear that they should be skipped because of a - conditional-evaluation construct, such as CASE - or COALESCE. For example, consider + Set-returning functions cannot be used within conditional-evaluation + constructs, such as CASE or COALESCE. For + example, consider SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab; - It might seem that this should produce five repetitions of input - rows that have x > 0, and a single repetition of those - that do not; but actually it will produce five repetitions of every - input row. This is because generate_series() is run first, - and then the CASE expression is applied to its result rows. - The behavior is thus comparable to - -SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END - FROM tab, LATERAL generate_series(1,5) AS g; - - It would be exactly the same, except that in this specific example, - the planner could choose to put g on the outside of the - nestloop join, since g has no actual lateral dependency - on tab. That would result in a different output row - order. Set-returning functions in the select list are always evaluated - as though they are on the inside of a nestloop join with the rest of - the FROM clause, so that the function(s) are run to - completion before the next row from the FROM clause is - considered. + It might seem that this should produce five repetitions of input rows + that have x > 0, and a single repetition of those that do + not; but actually, because generate_series(1, 5) would be + run in an implicit LATERAL FROM item before + the CASE expression is ever evaluated, it would produce five + repetitions of every input row. To reduce confusion, such cases produce + a parse-time error instead. @@ -1078,11 +1088,34 @@ SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END functions. Also, nested set-returning functions did not work as described above; instead, a set-returning function could have at most one set-returning argument, and each nest of set-returning functions - was run independently. The behavior for conditional execution - (set-returning functions inside CASE etc) was different too. + was run independently. Also, conditional execution (set-returning + functions inside CASE etc) was previously allowed, + complicating things even more. Use of the LATERAL syntax is recommended when writing queries that need to work in older PostgreSQL versions, because that will give consistent results across different versions. + If you have a query that is relying on conditional execution of a + set-returning function, you may be able to fix it by moving the + conditional test into a custom set-returning function. For example, + +SELECT x, CASE WHEN y > 0 THEN generate_series(1, z) ELSE 5 END FROM tab; + + could become + +CREATE FUNCTION case_generate_series(cond bool, start int, fin int, els int) + RETURNS SETOF int AS $$ +BEGIN + IF cond THEN + RETURN QUERY SELECT generate_series(start, fin); + ELSE + RETURN QUERY SELECT els; + END IF; +END$$ LANGUAGE plpgsql; + +SELECT x, case_generate_series(y > 0, 1, z, 5) FROM tab; + + This formulation will work the same in all versions + of PostgreSQL. diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql index cbcd6cfbc17..98bcfa08c65 100644 --- a/src/backend/catalog/information_schema.sql +++ b/src/backend/catalog/information_schema.sql @@ -2936,12 +2936,14 @@ CREATE VIEW user_mapping_options AS SELECT authorization_identifier, foreign_server_catalog, foreign_server_name, - CAST((pg_options_to_table(um.umoptions)).option_name AS sql_identifier) AS option_name, + CAST(opts.option_name AS sql_identifier) AS option_name, CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user) OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE')) - OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user) THEN (pg_options_to_table(um.umoptions)).option_value + OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user) + THEN opts.option_value ELSE NULL END AS character_data) AS option_value - FROM _pg_user_mappings um; + FROM _pg_user_mappings um, + pg_options_to_table(um.umoptions) opts; GRANT SELECT ON user_mapping_options TO PUBLIC; diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c index a35ba32e6dd..89aea2fe5c5 100644 --- a/src/backend/executor/functions.c +++ b/src/backend/executor/functions.c @@ -388,6 +388,7 @@ sql_fn_post_column_ref(ParseState *pstate, ColumnRef *cref, Node *var) param = ParseFuncOrColumn(pstate, list_make1(subfield), list_make1(param), + pstate->p_last_srf, NULL, cref->location); } diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c index efe1c371efc..6218ee94fa4 100644 --- a/src/backend/parser/parse_agg.c +++ b/src/backend/parser/parse_agg.c @@ -705,6 +705,14 @@ check_agg_arguments_walker(Node *node, } /* Continue and descend into subtree */ } + /* We can throw error on sight for a set-returning function */ + if ((IsA(node, FuncExpr) &&((FuncExpr *) node)->funcretset) || + (IsA(node, OpExpr) &&((OpExpr *) node)->opretset)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("aggregate function calls cannot contain set-returning function calls"), + errhint("You might be able to move the set-returning function into a LATERAL FROM item."), + parser_errposition(context->pstate, exprLocation(node)))); /* We can throw error on sight for a window function */ if (IsA(node, WindowFunc)) ereport(ERROR, diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c index 27dd49d3019..3d5b20836f6 100644 --- a/src/backend/parser/parse_clause.c +++ b/src/backend/parser/parse_clause.c @@ -572,6 +572,8 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r) List *pair = (List *) lfirst(lc); Node *fexpr; List *coldeflist; + Node *newfexpr; + Node *last_srf; /* Disassemble the function-call/column-def-list pairs */ Assert(list_length(pair) == 2); @@ -618,13 +620,25 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r) Node *arg = (Node *) lfirst(lc); FuncCall *newfc; + last_srf = pstate->p_last_srf; + newfc = makeFuncCall(SystemFuncName("unnest"), list_make1(arg), fc->location); - funcexprs = lappend(funcexprs, - transformExpr(pstate, (Node *) newfc, - EXPR_KIND_FROM_FUNCTION)); + newfexpr = transformExpr(pstate, (Node *) newfc, + EXPR_KIND_FROM_FUNCTION); + + /* nodeFunctionscan.c requires SRFs to be at top level */ + if (pstate->p_last_srf != last_srf && + pstate->p_last_srf != newfexpr) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("set-returning functions must appear at top level of FROM"), + parser_errposition(pstate, + exprLocation(pstate->p_last_srf)))); + + funcexprs = lappend(funcexprs, newfexpr); funcnames = lappend(funcnames, FigureColname((Node *) newfc)); @@ -638,9 +652,21 @@ transformRangeFunction(ParseState *pstate, RangeFunction *r) } /* normal case ... */ - funcexprs = lappend(funcexprs, - transformExpr(pstate, fexpr, - EXPR_KIND_FROM_FUNCTION)); + last_srf = pstate->p_last_srf; + + newfexpr = transformExpr(pstate, fexpr, + EXPR_KIND_FROM_FUNCTION); + + /* nodeFunctionscan.c requires SRFs to be at top level */ + if (pstate->p_last_srf != last_srf && + pstate->p_last_srf != newfexpr) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("set-returning functions must appear at top level of FROM"), + parser_errposition(pstate, + exprLocation(pstate->p_last_srf)))); + + funcexprs = lappend(funcexprs, newfexpr); funcnames = lappend(funcnames, FigureColname(fexpr)); diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index 92101c9103d..65bd51e1fd6 100644 --- a/src/backend/parser/parse_expr.c +++ b/src/backend/parser/parse_expr.c @@ -118,8 +118,7 @@ static Node *transformCurrentOfExpr(ParseState *pstate, CurrentOfExpr *cexpr); static Node *transformColumnRef(ParseState *pstate, ColumnRef *cref); static Node *transformWholeRowRef(ParseState *pstate, RangeTblEntry *rte, int location); -static Node *transformIndirection(ParseState *pstate, Node *basenode, - List *indirection); +static Node *transformIndirection(ParseState *pstate, A_Indirection *ind); static Node *transformTypeCast(ParseState *pstate, TypeCast *tc); static Node *transformCollateClause(ParseState *pstate, CollateClause *c); static Node *make_row_comparison_op(ParseState *pstate, List *opname, @@ -192,14 +191,8 @@ transformExprRecurse(ParseState *pstate, Node *expr) } case T_A_Indirection: - { - A_Indirection *ind = (A_Indirection *) expr; - - result = transformExprRecurse(pstate, ind->arg); - result = transformIndirection(pstate, result, - ind->indirection); - break; - } + result = transformIndirection(pstate, (A_Indirection *) expr); + break; case T_A_ArrayExpr: result = transformArrayExpr(pstate, (A_ArrayExpr *) expr, @@ -439,11 +432,12 @@ unknown_attribute(ParseState *pstate, Node *relref, char *attname, } static Node * -transformIndirection(ParseState *pstate, Node *basenode, List *indirection) +transformIndirection(ParseState *pstate, A_Indirection *ind) { - Node *result = basenode; + Node *last_srf = pstate->p_last_srf; + Node *result = transformExprRecurse(pstate, ind->arg); List *subscripts = NIL; - int location = exprLocation(basenode); + int location = exprLocation(result); ListCell *i; /* @@ -451,7 +445,7 @@ transformIndirection(ParseState *pstate, Node *basenode, List *indirection) * subscripting. Adjacent A_Indices nodes have to be treated as a single * multidimensional subscript operation. */ - foreach(i, indirection) + foreach(i, ind->indirection) { Node *n = lfirst(i); @@ -484,6 +478,7 @@ transformIndirection(ParseState *pstate, Node *basenode, List *indirection) newresult = ParseFuncOrColumn(pstate, list_make1(n), list_make1(result), + last_srf, NULL, location); if (newresult == NULL) @@ -632,6 +627,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref) node = ParseFuncOrColumn(pstate, list_make1(makeString(colname)), list_make1(node), + pstate->p_last_srf, NULL, cref->location); } @@ -678,6 +674,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref) node = ParseFuncOrColumn(pstate, list_make1(makeString(colname)), list_make1(node), + pstate->p_last_srf, NULL, cref->location); } @@ -737,6 +734,7 @@ transformColumnRef(ParseState *pstate, ColumnRef *cref) node = ParseFuncOrColumn(pstate, list_make1(makeString(colname)), list_make1(node), + pstate->p_last_srf, NULL, cref->location); } @@ -927,6 +925,8 @@ transformAExprOp(ParseState *pstate, A_Expr *a) else { /* Ordinary scalar operator */ + Node *last_srf = pstate->p_last_srf; + lexpr = transformExprRecurse(pstate, lexpr); rexpr = transformExprRecurse(pstate, rexpr); @@ -934,6 +934,7 @@ transformAExprOp(ParseState *pstate, A_Expr *a) a->name, lexpr, rexpr, + last_srf, a->location); } @@ -1053,6 +1054,7 @@ transformAExprNullIf(ParseState *pstate, A_Expr *a) a->name, lexpr, rexpr, + pstate->p_last_srf, a->location); /* @@ -1063,6 +1065,12 @@ transformAExprNullIf(ParseState *pstate, A_Expr *a) (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("NULLIF requires = operator to yield boolean"), parser_errposition(pstate, a->location))); + if (result->opretset) + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + /* translator: %s is name of a SQL construct, eg NULLIF */ + errmsg("%s must not return a set", "NULLIF"), + parser_errposition(pstate, a->location))); /* * ... but the NullIfExpr will yield the first operand's type. @@ -1266,6 +1274,7 @@ transformAExprIn(ParseState *pstate, A_Expr *a) a->name, copyObject(lexpr), rexpr, + pstate->p_last_srf, a->location); } @@ -1430,6 +1439,7 @@ transformBoolExpr(ParseState *pstate, BoolExpr *a) static Node * transformFuncCall(ParseState *pstate, FuncCall *fn) { + Node *last_srf = pstate->p_last_srf; List *targs; ListCell *args; @@ -1465,6 +1475,7 @@ transformFuncCall(ParseState *pstate, FuncCall *fn) return ParseFuncOrColumn(pstate, fn->funcname, targs, + last_srf, fn, fn->location); } @@ -1619,7 +1630,8 @@ transformMultiAssignRef(ParseState *pstate, MultiAssignRef *maref) static Node * transformCaseExpr(ParseState *pstate, CaseExpr *c) { - CaseExpr *newc; + CaseExpr *newc = makeNode(CaseExpr); + Node *last_srf = pstate->p_last_srf; Node *arg; CaseTestExpr *placeholder; List *newargs; @@ -1628,8 +1640,6 @@ transformCaseExpr(ParseState *pstate, CaseExpr *c) Node *defresult; Oid ptype; - newc = makeNode(CaseExpr); - /* transform the test expression, if any */ arg = transformExprRecurse(pstate, (Node *) c->arg); @@ -1741,6 +1751,17 @@ transformCaseExpr(ParseState *pstate, CaseExpr *c) "CASE/WHEN"); } + /* if any subexpression contained a SRF, complain */ + if (pstate->p_last_srf != last_srf) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + /* translator: %s is name of a SQL construct, eg GROUP BY */ + errmsg("set-returning functions are not allowed in %s", + "CASE"), + errhint("You might be able to move the set-returning function into a LATERAL FROM item."), + parser_errposition(pstate, + exprLocation(pstate->p_last_srf)))); + newc->location = c->location; return (Node *) newc; @@ -2177,6 +2198,7 @@ static Node * transformCoalesceExpr(ParseState *pstate, CoalesceExpr *c) { CoalesceExpr *newc = makeNode(CoalesceExpr); + Node *last_srf = pstate->p_last_srf; List *newargs = NIL; List *newcoercedargs = NIL; ListCell *args; @@ -2205,6 +2227,17 @@ transformCoalesceExpr(ParseState *pstate, CoalesceExpr *c) newcoercedargs = lappend(newcoercedargs, newe); } + /* if any subexpression contained a SRF, complain */ + if (pstate->p_last_srf != last_srf) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + /* translator: %s is name of a SQL construct, eg GROUP BY */ + errmsg("set-returning functions are not allowed in %s", + "COALESCE"), + errhint("You might be able to move the set-returning function into a LATERAL FROM item."), + parser_errposition(pstate, + exprLocation(pstate->p_last_srf)))); + newc->args = newcoercedargs; newc->location = c->location; return (Node *) newc; @@ -2793,7 +2826,8 @@ make_row_comparison_op(ParseState *pstate, List *opname, Node *rarg = (Node *) lfirst(r); OpExpr *cmp; - cmp = castNode(OpExpr, make_op(pstate, opname, larg, rarg, location)); + cmp = castNode(OpExpr, make_op(pstate, opname, larg, rarg, + pstate->p_last_srf, location)); /* * We don't use coerce_to_boolean here because we insist on the @@ -3000,12 +3034,19 @@ make_distinct_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree, { Expr *result; - result = make_op(pstate, opname, ltree, rtree, location); + result = make_op(pstate, opname, ltree, rtree, + pstate->p_last_srf, location); if (((OpExpr *) result)->opresulttype != BOOLOID) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), errmsg("IS DISTINCT FROM requires = operator to yield boolean"), parser_errposition(pstate, location))); + if (((OpExpr *) result)->opretset) + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + /* translator: %s is name of a SQL construct, eg NULLIF */ + errmsg("%s must not return a set", "IS DISTINCT FROM"), + parser_errposition(pstate, location))); /* * We rely on DistinctExpr and OpExpr being same struct diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c index 55853c20bb4..34f1cf82ee0 100644 --- a/src/backend/parser/parse_func.c +++ b/src/backend/parser/parse_func.c @@ -64,10 +64,14 @@ static Node *ParseComplexProjection(ParseState *pstate, char *funcname, * * The argument expressions (in fargs) must have been transformed * already. However, nothing in *fn has been transformed. + * + * last_srf should be a copy of pstate->p_last_srf from just before we + * started transforming fargs. If the caller knows that fargs couldn't + * contain any SRF calls, last_srf can just be pstate->p_last_srf. */ Node * ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, - FuncCall *fn, int location) + Node *last_srf, FuncCall *fn, int location) { bool is_column = (fn == NULL); List *agg_order = (fn ? fn->agg_order : NIL); @@ -628,7 +632,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, /* if it returns a set, check that's OK */ if (retset) - check_srf_call_placement(pstate, location); + check_srf_call_placement(pstate, last_srf, location); /* build the appropriate output structure */ if (fdresult == FUNCDETAIL_NORMAL) @@ -759,6 +763,17 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, errmsg("FILTER is not implemented for non-aggregate window functions"), parser_errposition(pstate, location))); + /* + * Window functions can't either take or return sets + */ + if (pstate->p_last_srf != last_srf) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("window function calls cannot contain set-returning function calls"), + errhint("You might be able to move the set-returning function into a LATERAL FROM item."), + parser_errposition(pstate, + exprLocation(pstate->p_last_srf)))); + if (retset) ereport(ERROR, (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION), @@ -771,6 +786,10 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, retval = (Node *) wfunc; } + /* if it returns a set, remember it for error checks at higher levels */ + if (retset) + pstate->p_last_srf = retval; + return retval; } @@ -2083,9 +2102,13 @@ LookupAggWithArgs(ObjectWithArgs *agg, bool noError) * and throw a nice error if not. * * A side-effect is to set pstate->p_hasTargetSRFs true if appropriate. + * + * last_srf should be a copy of pstate->p_last_srf from just before we + * started transforming the function's arguments. This allows detection + * of whether the SRF's arguments contain any SRFs. */ void -check_srf_call_placement(ParseState *pstate, int location) +check_srf_call_placement(ParseState *pstate, Node *last_srf, int location) { const char *err; bool errkind; @@ -2121,7 +2144,15 @@ check_srf_call_placement(ParseState *pstate, int location) errkind = true; break; case EXPR_KIND_FROM_FUNCTION: - /* okay ... but we can't check nesting here */ + /* okay, but we don't allow nested SRFs here */ + /* errmsg is chosen to match transformRangeFunction() */ + /* errposition should point to the inner SRF */ + if (pstate->p_last_srf != last_srf) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("set-returning functions must appear at top level of FROM"), + parser_errposition(pstate, + exprLocation(pstate->p_last_srf)))); break; case EXPR_KIND_WHERE: errkind = true; @@ -2202,7 +2233,7 @@ check_srf_call_placement(ParseState *pstate, int location) err = _("set-returning functions are not allowed in trigger WHEN conditions"); break; case EXPR_KIND_PARTITION_EXPRESSION: - err = _("set-returning functions are not allowed in partition key expression"); + err = _("set-returning functions are not allowed in partition key expressions"); break; /* diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c index e40b10d4f61..4b1db76e19f 100644 --- a/src/backend/parser/parse_oper.c +++ b/src/backend/parser/parse_oper.c @@ -735,12 +735,14 @@ op_error(ParseState *pstate, List *op, char oprkind, * Transform operator expression ensuring type compatibility. * This is where some type conversion happens. * - * As with coerce_type, pstate may be NULL if no special unknown-Param - * processing is wanted. + * last_srf should be a copy of pstate->p_last_srf from just before we + * started transforming the operator's arguments; this is used for nested-SRF + * detection. If the caller will throw an error anyway for a set-returning + * expression, it's okay to cheat and just pass pstate->p_last_srf. */ Expr * make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree, - int location) + Node *last_srf, int location) { Oid ltypeId, rtypeId; @@ -843,7 +845,11 @@ make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree, /* if it returns a set, check that's OK */ if (result->opretset) - check_srf_call_placement(pstate, location); + { + check_srf_call_placement(pstate, last_srf, location); + /* ... and remember it for error checks at higher levels */ + pstate->p_last_srf = (Node *) result; + } ReleaseSysCache(tup); diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h index ae34f75a849..8b2ec21509a 100644 --- a/src/include/catalog/catversion.h +++ b/src/include/catalog/catversion.h @@ -53,6 +53,6 @@ */ /* yyyymmddN */ -#define CATALOG_VERSION_NO 201706131 +#define CATALOG_VERSION_NO 201706141 #endif diff --git a/src/include/parser/parse_func.h b/src/include/parser/parse_func.h index 5be1812242a..68572e9d2a9 100644 --- a/src/include/parser/parse_func.h +++ b/src/include/parser/parse_func.h @@ -31,7 +31,7 @@ typedef enum extern Node *ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, - FuncCall *fn, int location); + Node *last_srf, FuncCall *fn, int location); extern FuncDetailCode func_get_detail(List *funcname, List *fargs, List *fargnames, @@ -67,6 +67,7 @@ extern Oid LookupFuncWithArgs(ObjectWithArgs *func, extern Oid LookupAggWithArgs(ObjectWithArgs *agg, bool noError); -extern void check_srf_call_placement(ParseState *pstate, int location); +extern void check_srf_call_placement(ParseState *pstate, Node *last_srf, + int location); #endif /* PARSE_FUNC_H */ diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h index 0b54840e295..6a3507f3b14 100644 --- a/src/include/parser/parse_node.h +++ b/src/include/parser/parse_node.h @@ -157,6 +157,9 @@ typedef Node *(*CoerceParamHook) (ParseState *pstate, Param *param, * p_hasAggs, p_hasWindowFuncs, etc: true if we've found any of the indicated * constructs in the query. * + * p_last_srf: the set-returning FuncExpr or OpExpr most recently found in + * the query, or NULL if none. + * * p_pre_columnref_hook, etc: optional parser hook functions for modifying the * interpretation of ColumnRefs and ParamRefs. * @@ -199,6 +202,8 @@ struct ParseState bool p_hasSubLinks; bool p_hasModifyingCTE; + Node *p_last_srf; /* most recent set-returning func/op found */ + /* * Optional hook functions for parser callbacks. These are null unless * set up by the caller of make_parsestate. diff --git a/src/include/parser/parse_oper.h b/src/include/parser/parse_oper.h index d783b37f0f5..ab3c4aa62f6 100644 --- a/src/include/parser/parse_oper.h +++ b/src/include/parser/parse_oper.h @@ -59,7 +59,7 @@ extern Oid oprfuncid(Operator op); /* Build expression tree for an operator invocation */ extern Expr *make_op(ParseState *pstate, List *opname, - Node *ltree, Node *rtree, int location); + Node *ltree, Node *rtree, Node *last_srf, int location); extern Expr *make_scalar_array_op(ParseState *pstate, List *opname, bool useOr, Node *ltree, Node *rtree, int location); diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out index 5136506dff6..fb8745be04f 100644 --- a/src/test/regress/expected/create_table.out +++ b/src/test/regress/expected/create_table.out @@ -315,7 +315,7 @@ CREATE FUNCTION retset (a int) RETURNS SETOF int AS $$ SELECT 1; $$ LANGUAGE SQL CREATE TABLE partitioned ( a int ) PARTITION BY RANGE (retset(a)); -ERROR: set-returning functions are not allowed in partition key expression +ERROR: set-returning functions are not allowed in partition key expressions DROP FUNCTION retset(int); CREATE TABLE partitioned ( a int diff --git a/src/test/regress/expected/rangefuncs.out b/src/test/regress/expected/rangefuncs.out index 4b6170dea5a..5c82614e051 100644 --- a/src/test/regress/expected/rangefuncs.out +++ b/src/test/regress/expected/rangefuncs.out @@ -1969,18 +1969,6 @@ select * from foobar(); -- fail ERROR: function return row and query-specified return row do not match DETAIL: Returned row contains 3 attributes, but query expects 2. drop function foobar(); --- check behavior when a function's input sometimes returns a set (bug #8228) -SELECT *, - lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1] - ELSE str - END) -FROM - (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str); - id | str | lower -----+---------------+------- - 2 | 0000000049404 | 49404 -(1 row) - -- check whole-row-Var handling in nested lateral functions (bug #11703) create function extractq2(t int8_tbl) returns int8 as $$ select t.q2 diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out index c8ae361e756..ef02ba74655 100644 --- a/src/test/regress/expected/tsrf.out +++ b/src/test/regress/expected/tsrf.out @@ -41,6 +41,11 @@ SELECT generate_series(1, generate_series(1, 3)); 3 (6 rows) +-- but we've traditionally rejected the same in FROM +SELECT * FROM generate_series(1, generate_series(1, 3)); +ERROR: set-returning functions must appear at top level of FROM +LINE 1: SELECT * FROM generate_series(1, generate_series(1, 3)); + ^ -- srf, with two SRF arguments SELECT generate_series(generate_series(1,3), generate_series(2, 4)); generate_series @@ -190,16 +195,29 @@ SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa, unnest a | 4 (2 rows) +-- SRFs are not allowed if they'd need to be conditionally executed +SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl; +ERROR: set-returning functions are not allowed in CASE +LINE 1: SELECT q1, case when q1 > 0 then generate_series(1,3) else 0... + ^ +HINT: You might be able to move the set-returning function into a LATERAL FROM item. +SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl; +ERROR: set-returning functions are not allowed in COALESCE +LINE 1: SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl; + ^ +HINT: You might be able to move the set-returning function into a LATERAL FROM item. -- SRFs are not allowed in aggregate arguments SELECT min(generate_series(1, 3)) FROM few; -ERROR: set-valued function called in context that cannot accept a set +ERROR: aggregate function calls cannot contain set-returning function calls LINE 1: SELECT min(generate_series(1, 3)) FROM few; ^ +HINT: You might be able to move the set-returning function into a LATERAL FROM item. -- SRFs are not allowed in window function arguments, either SELECT min(generate_series(1, 3)) OVER() FROM few; -ERROR: set-valued function called in context that cannot accept a set +ERROR: window function calls cannot contain set-returning function calls LINE 1: SELECT min(generate_series(1, 3)) OVER() FROM few; ^ +HINT: You might be able to move the set-returning function into a LATERAL FROM item. -- SRFs are normally computed after window functions SELECT id,lag(id) OVER(), count(*) OVER(), generate_series(1,3) FROM few; id | lag | count | generate_series @@ -427,7 +445,7 @@ SELECT int4mul(generate_series(1,2), 10); -- but SRFs in function RTEs must be at top level (annoying restriction) SELECT * FROM int4mul(generate_series(1,2), 10); -ERROR: set-valued function called in context that cannot accept a set +ERROR: set-returning functions must appear at top level of FROM LINE 1: SELECT * FROM int4mul(generate_series(1,2), 10); ^ -- DISTINCT ON is evaluated before tSRF evaluation if SRF is not diff --git a/src/test/regress/sql/rangefuncs.sql b/src/test/regress/sql/rangefuncs.sql index 4ed84b1f2b3..442d397d4a6 100644 --- a/src/test/regress/sql/rangefuncs.sql +++ b/src/test/regress/sql/rangefuncs.sql @@ -600,15 +600,6 @@ select * from foobar(); -- fail drop function foobar(); --- check behavior when a function's input sometimes returns a set (bug #8228) - -SELECT *, - lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1] - ELSE str - END) -FROM - (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str); - -- check whole-row-Var handling in nested lateral functions (bug #11703) create function extractq2(t int8_tbl) returns int8 as $$ diff --git a/src/test/regress/sql/tsrf.sql b/src/test/regress/sql/tsrf.sql index 417e78c53dd..45de099e4dc 100644 --- a/src/test/regress/sql/tsrf.sql +++ b/src/test/regress/sql/tsrf.sql @@ -14,6 +14,9 @@ SELECT generate_series(1, 2), generate_series(1,4); -- srf, with SRF argument SELECT generate_series(1, generate_series(1, 3)); +-- but we've traditionally rejected the same in FROM +SELECT * FROM generate_series(1, generate_series(1, 3)); + -- srf, with two SRF arguments SELECT generate_series(generate_series(1,3), generate_series(2, 4)); @@ -51,6 +54,10 @@ SELECT dataa, generate_series(1,1), count(*) FROM few GROUP BY 1, 2 HAVING count SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa ORDER BY 2; SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa, unnest('{1,1,3}'::int[]) ORDER BY 2; +-- SRFs are not allowed if they'd need to be conditionally executed +SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl; +SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl; + -- SRFs are not allowed in aggregate arguments SELECT min(generate_series(1, 3)) FROM few;