
SanityChecker rejects certain valid UNION plans  #12446

@alamb

Describe the bug

There is a regression that appears in a very specific circumstance: for UNION queries over sorted data with constant predicates, the query now fails with a SanityCheckPlan error when it should complete.

To Reproduce

@wiedld found a reproducer as part of #12414

c2e652e

# Test: inputs into union with different orderings
query TT
explain select * from (select b, c, a, NULL::int as a0 from ordered_table order by a, c) t1
union all
select * from (select b, c, NULL::int as a, a0 from ordered_table order by a0, c) t2
order by d, c, a, a0, b
limit 2;
----
logical_plan
01)Projection: t1.b, t1.c, t1.a, t1.a0
02)--Sort: t1.d ASC NULLS LAST, t1.c ASC NULLS LAST, t1.a ASC NULLS LAST, t1.a0 ASC NULLS LAST, t1.b ASC NULLS LAST, fetch=2
03)----Union
04)------SubqueryAlias: t1
05)--------Projection: ordered_table.b, ordered_table.c, ordered_table.a, Int32(NULL) AS a0, ordered_table.d
06)----------TableScan: ordered_table projection=[a, b, c, d]
07)------SubqueryAlias: t2
08)--------Projection: ordered_table.b, ordered_table.c, Int32(NULL) AS a, ordered_table.a0, ordered_table.d
09)----------TableScan: ordered_table projection=[a0, b, c, d]

# Test: run the query from above
# TODO: query fails since the constant columns t1.a0 and t2.a are not in the ORDER BY subquery,
# and SanityCheckPlan does not allow this.
statement error DataFusion error: SanityCheckPlan
select * from (select b, c, a, NULL::int as a0 from ordered_table order by a, c) t1
union all
select * from (select b, c, NULL::int as a, a0 from ordered_table order by a0, c) t2
order by d, c, a, a0, b
limit 2;

statement ok
drop table ordered_table;

Expected behavior

Query should run
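
As a rough illustration of why such a plan should be accepted, here is a toy model in Python (not DataFusion code; `is_sorted_by` and the sample rows are hypothetical). It assumes the diagnosis in the TODO comment above: a column that holds the same value in every row, such as `NULL::int as a0`, cannot change the relative order of rows, so data sorted by `a, c` is also sorted by any ordering that additionally mentions the constant column.

```python
# Toy model only: rows are dicts and SQL NULL is modeled as None.
def is_sorted_by(rows, columns):
    """Return True if rows are in ascending order on the given columns."""
    keys = [tuple(row[col] for col in columns) for row in rows]
    return all(k1 <= k2 for k1, k2 in zip(keys, keys[1:]))


# Hypothetical stand-in for subquery t1: sorted by (a, c), with a0 constant.
t1 = [
    {"a": 0, "c": 1, "a0": None},
    {"a": 0, "c": 2, "a0": None},
    {"a": 1, "c": 0, "a0": None},
]

# The ordering the subquery declares.
print(is_sorted_by(t1, ["a", "c"]))
# Orderings that also mention the constant column are satisfied as well.
print(is_sorted_by(t1, ["a0", "a", "c"]))
print(is_sorted_by(t1, ["a", "a0", "c"]))
```

All three checks hold for this sample, which is the property a sort-order check would need to recognize for constant columns.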

Additional context

We believe this was uncovered by #11196. The error in the sort order calculation has existed for a while, but #11196 exposed it.

This was released in 40.0.0 https://github.com/apache/datafusion/blob/main/dev/changelog/40.0.0.md

Labels

bug (Something isn't working), regression (Something that used to work no longer does)
