Physical plan pushdown for volatile predicates #16545

Open
@theirix

Describe the bug

This is a follow-up to a discussion in #16325 (comment), which is not directly related to table sampling but could affect it.

I'd like to double-check whether pushing a volatile filter down to the Parquet executor is expected. I disabled pushdown of volatile filters at the logical plan level in #13268, but the physical optimiser still pushes this predicate down to the executor. Should we implement a similar mechanism that marks volatile predicates as unsupported filters? The current physical plan implementation already has a concept of "unsupported" filters, which could easily be reused for this.
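To make the proposal concrete, here is a minimal, self-contained sketch of the idea: walk each filter expression and report it as unsupported if any sub-expression is volatile. The types `Expr` and `Pushdown` and the functions `is_volatile` and `classify_filters` are hypothetical stand-ins invented for illustration; they are not DataFusion's actual types or API, and a real fix would have to plug into the existing physical filter pushdown code.

```rust
// Hypothetical, simplified stand-ins for illustration only.
// None of these are DataFusion types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(f64),
    // A scalar function call; `volatile` is true for functions such as random().
    Call { name: String, volatile: bool, args: Vec<Expr> },
    Less(Box<Expr>, Box<Expr>),
}

// Outcome of asking whether a filter may be pushed into the data source.
#[derive(Debug, Clone, PartialEq)]
enum Pushdown {
    Supported,
    Unsupported,
}

// A predicate is volatile if any node in its expression tree is volatile.
fn is_volatile(expr: &Expr) -> bool {
    match expr {
        Expr::Column(_) | Expr::Literal(_) => false,
        Expr::Call { volatile, args, .. } => *volatile || args.iter().any(is_volatile),
        Expr::Less(left, right) => is_volatile(left) || is_volatile(right),
    }
}

// Volatile filters are reported as unsupported, so they would stay in a
// FilterExec above the scan instead of being absorbed by it.
fn classify_filters(filters: &[Expr]) -> Vec<Pushdown> {
    filters
        .iter()
        .map(|f| if is_volatile(f) { Pushdown::Unsupported } else { Pushdown::Supported })
        .collect()
}

fn main() {
    // random() < 0.1
    let volatile_filter = Expr::Less(
        Box::new(Expr::Call { name: "random".to_string(), volatile: true, args: vec![] }),
        Box::new(Expr::Literal(0.1)),
    );
    // a < 0.1
    let stable_filter = Expr::Less(
        Box::new(Expr::Column("a".to_string())),
        Box::new(Expr::Literal(0.1)),
    );
    let result = classify_filters(&[volatile_filter, stable_filter]);
    println!("{:?}", result); // prints [Unsupported, Supported]
}
```

With a check like this in place, `random() < 0.1` would be kept out of the scan's predicate while ordinary column comparisons could still be pushed down.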

Current behaviour:

Before the FilterPushdown rule (the predicate is still in FilterExec):

[2025-06-18T18:20:07Z TRACE datafusion::physical_planner] Optimized physical plan by LimitedDistinctAggregation:
    OutputRequirementExec
      ProjectionExec: expr=[count(Int64(1))@0 as count(*)]
        AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))]
          AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))]
            FilterExec: random() < 0.1
              DataSourceExec: file_groups={1 group: [[sample.parquet]]}, file_type=parquet

After the FilterPushdown rule (the predicate has moved into DataSourceExec):

[2025-06-18T18:20:07Z TRACE datafusion::physical_planner] Optimized physical plan by FilterPushdown:
    OutputRequirementExec
      ProjectionExec: expr=[count(Int64(1))@0 as count(*)]
        AggregateExec: mode=Final, gby=[], aggr=[count(Int64(1))]
          AggregateExec: mode=Partial, gby=[], aggr=[count(Int64(1))]
            DataSourceExec: file_groups={1 group: [[sample.parquet]]}, file_type=parquet, predicate=random() < 0.1

To Reproduce

set datafusion.execution.parquet.pushdown_filters=true;
create external table data stored as parquet location 'sample.parquet';
SELECT count(*) FROM data WHERE random() < 0.1;

Expected behavior

I expect the physical plan optimiser not to push down volatile predicates.
