I think there are a few follow-ups here. Let me know if I've missed anything.

1. Dataset Filter Pushdown

The Dataset API clearly states that all filters must be applied by the Scanner:

"Scan will return only the rows matching the filter. If possible the predicate will be pushed down to exploit the partition information or internal metadata found in the data source, e.g. Parquet statistics. Otherwise filters the loaded RecordBatches before yielding them."

The Vortex Dataset should indeed uphold that API. Any expression that cannot be pushed down into Vortex must still be evaluated successfully after the batch is read from Vortex and before it is returned to the caller.
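To make that contract concrete, here is a minimal pure-Python sketch of the "push down what you can, filter the rest after reading" pattern. It is not Vortex or PyArrow code: `Predicate`, `ToySource`, `PUSHABLE_OPS`, and `scan` are hypothetical names invented for illustration, batches are plain lists of dicts, and the filter is assumed to be a simple conjunction (AND) of predicates.

```python
import operator
from dataclasses import dataclass
from typing import Any, Iterator

# Comparison functions keyed by operator name (illustrative only).
OPS = {
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    "ends_with": lambda value, suffix: str(value).endswith(suffix),
}


@dataclass(frozen=True)
class Predicate:
    """One term of a conjunctive filter, e.g. id > 1."""
    column: str
    op: str
    value: Any

    def matches(self, row: dict) -> bool:
        return OPS[self.op](row[self.column], self.value)


class ToySource:
    """Hypothetical stand-in for a reader with limited push-down support."""

    # Assumption: only these operators can be evaluated inside the source.
    PUSHABLE_OPS = {"==", ">", "<"}

    def __init__(self, rows: list, batch_size: int = 2):
        self.rows = rows
        self.batch_size = batch_size

    def read(self, pushed: list) -> Iterator[list]:
        # The source applies only the predicates it was handed.
        kept = [r for r in self.rows if all(p.matches(r) for p in pushed)]
        for i in range(0, len(kept), self.batch_size):
            yield kept[i:i + self.batch_size]


def scan(source: ToySource, predicates: list) -> Iterator[list]:
    """Yield batches in which every row satisfies *all* predicates."""
    pushed = [p for p in predicates if p.op in source.PUSHABLE_OPS]
    residual = [p for p in predicates if p.op not in source.PUSHABLE_OPS]
    for batch in source.read(pushed):
        # Unsupported predicates are evaluated after reading the batch
        # and before anything is returned to the caller.
        filtered = [r for r in batch if all(p.matches(r) for p in residual)]
        if filtered:
            yield filtered


rows = [
    {"id": 1, "name": "a.txt"},
    {"id": 2, "name": "b.csv"},
    {"id": 3, "name": "c.txt"},
    {"id": 4, "name": "d.txt"},
]
filters = [Predicate("id", ">", 1), Predicate("name", "ends_with", ".txt")]
result = [row["id"] for batch in scan(ToySource(rows), filters) for row in batch]
# result == [3, 4]
```

The point of the split is that the caller never needs to know which predicates were pushed down: the result is the same whether the source supports zero, some, or all of them.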

The problem is that …

Answer selected by danking