Feature: Support Sample

### Is your feature request related to a problem or challenge?

The TABLESAMPLE statement is used to sample the table.

Different DBs have different sample implementations.

Spark:

https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-sampling.html

example with replacement, poisson sample.
https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.sample.html

it implements
1. Sample the table down to the given number of rows.
2. Sample the table down to the given percentage.    
    a. poisson sample, (only in dataframe api)
    b. bernoulli sample.

Spark introduced a Sample logical plan, and many other dataframe apis are also based on this logical plan. e.g. 
https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.randomSplit.html

Hive:
https://cwiki.apache.org/confluence/display/hive/languagemanual+sampling

1. Sample the table down to the given number of rows.
2. Sample the table down to the given percentage. (bernoulli)
3. Sample  on column, it's useful for clustered table.

Clickhouse:

https://clickhouse.com/docs/sql-reference/statements/select/sample

1. Sample the table down to the given number of rows.
2. Sample the table down to the given percentage. 
3. Sample with offset.

Postgres:

https://wiki.postgresql.org/wiki/TABLESAMPLE_Implementation
1. BERNOULLI sample.
2. SYSTEM





### Describe the solution you'd like

Add a Sample logical plan.

### Describe alternatives you've considered

I have considered resusing current logical plan, e.g. Filter. But it seems that it's hard to implement poisson sample with current logical plan.
in spark, real_seed = input_seed + partition_id. then different partitions have different sample results. it makes sense to me. it's also hard to implement with current logical plan.

### Additional context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Feature: Support Sample #16533

Is your feature request related to a problem or challenge?

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Feature: Support Sample #16533

Description

Is your feature request related to a problem or challenge?

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions