[Feature] In a Paimon primary key table, using ORC offers significantly higher efficiency for point lookups based on the primary key compared to Parquet. #4586
Can you turn on these two parameters and try again? @Aiden-Dong
See #4231.
Also, you can turn off executeFilter, since turning it on slows reading down; in most cases ORC pushdown already ensures that only the required data is returned.
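For reference, the two options in question are the ones that show up in the ORC table schema later in this thread. A minimal plain-Java sketch of the option map (how you attach it to the table depends on your catalog setup, so that part is left out):

```java
import java.util.HashMap;
import java.util.Map;

public class OrcPushdownOptions {
    public static Map<String, String> orcPushdownOptions() {
        Map<String, String> options = new HashMap<>();
        // Convert the scan filter into an ORC SearchArgument so ORC can
        // evaluate it during the read, not only for stripe selection.
        options.put("orc.reader.sarg.to.filter", "true");
        // Have the ORC reader return only the rows selected by the filter.
        options.put("orc.reader.filter.use.selected", "true");
        return options;
    }

    public static void main(String[] args) {
        System.out.println(orcPushdownOptions());
    }
}
```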
Is there an issue with my statement? In my practical usage I've found that ORC's predicate pushdown capability is stronger: Parquet's predicate pushdown does not reach the column-page level. I've been working on a fix for this over the past couple of days, aiming to push the filter predicates of Parquet reads down to the page level to reduce unnecessary data scans.
I haven't researched the Parquet implementation in much depth; all our optimizations are centered around ORC. I have tested that the default Paimon table configuration does not actually push the filter conditions down to ORC for execution, so I raised PR #4231 to fix this problem.
Thank you for your suggestion. We will consider it carefully and try to implement it.
Writing it this way ensures that the data filtered by ORC is equivalent to the given filter condition. But it also includes the plan time, so the comparison with the numbers above is not accurate.

```java
Table table = TableUtil.getTable(); // PrimaryKeyFileStoreTable
PredicateBuilder builder = new PredicateBuilder(
        RowType.of(DataTypes.INT(),
                DataTypes.STRING(),
                DataTypes.STRING()));
int[] projection = new int[] {0, 1, 2};
ReadBuilder readBuilder = table.newReadBuilder()
        .withProjection(projection);

Random random = new Random();
long startTime = System.currentTimeMillis();
for (int i = 0; i < 30; i++) {
    InnerTableRead read = (InnerTableRead) readBuilder.newRead();
    int key = random.nextInt(4000000);
    Predicate keyFilter = builder.equal(0, key);

    InnerTableScan tableScan = (InnerTableScan) readBuilder
            .withFilter(keyFilter)
            .newScan();
    InnerTableScan innerTableScan = tableScan.withFilter(keyFilter);
    TableScan.Plan plan = innerTableScan.plan();
    List<Split> splits = plan.splits();

    read.withFilter(keyFilter); // .executeFilter();
    RecordReader<InternalRow> reader = read.createReader(splits);
    reader.forEachRemaining(internalRow -> {
        int f0 = internalRow.getInt(0);
        String f1 = internalRow.getString(1).toString();
        String f2 = internalRow.getString(2).toString();
        System.out.println(String.format("%d - {%d, %s, %s}", key, f0, f1, f2));
    });
}
long stopTime = System.currentTimeMillis();
System.out.println("time : " + (stopTime - startTime));
```
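Since the loop above times plan() and the read together, one way to make the format comparison fairer is to time the two phases separately. A minimal stdlib sketch of that pattern; the Paimon calls from the snippet above are elided as comments, and `PhaseTimer` is a hypothetical helper name:

```java
// Sketch: accumulate plan time and read time independently so that plan()
// overhead does not skew the ORC-vs-Parquet read comparison.
public class PhaseTimer {
    // Run one phase and return its wall-clock duration in nanoseconds.
    static long timeNanos(Runnable phase) {
        long t0 = System.nanoTime();
        phase.run();
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        long planNanos = timeNanos(() -> {
            // TableScan.Plan plan = innerTableScan.plan();
        });
        long readNanos = timeNanos(() -> {
            // reader.forEachRemaining(...);
        });
        System.out.println("plan = " + planNanos / 1_000_000 + " ms, "
                + "read = " + readNanos / 1_000_000 + " ms");
    }
}
```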
Could I get your contact information? I'd like to ask you a question.
My test result with the new code, including plan time, is that the ORC format took 1.77 s and the Parquet format took 12.95 s.

ORC table schema:

```json
{
  "version" : 2,
  "id" : 0,
  "fields" : [ {
    "id" : 0,
    "name" : "f0",
    "type" : "INT NOT NULL"
  }, {
    "id" : 1,
    "name" : "f1",
    "type" : "STRING"
  }, {
    "id" : 2,
    "name" : "f2",
    "type" : "STRING"
  } ],
  "highestFieldId" : 2,
  "partitionKeys" : [ ],
  "primaryKeys" : [ "f0" ],
  "options" : {
    "bucket" : "1",
    "file.format" : "orc",
    "manifest.compression" : "null",
    "orc.reader.filter.use.selected" : "true",
    "orc.reader.sarg.to.filter" : "true"
  },
  "timeMillis" : 1731654078602
}
```

Parquet table schema:

```json
{
  "version" : 2,
  "id" : 0,
  "fields" : [ {
    "id" : 0,
    "name" : "f0",
    "type" : "INT NOT NULL"
  }, {
    "id" : 1,
    "name" : "f1",
    "type" : "STRING"
  }, {
    "id" : 2,
    "name" : "f2",
    "type" : "STRING"
  } ],
  "highestFieldId" : 2,
  "partitionKeys" : [ ],
  "primaryKeys" : [ "f0" ],
  "options" : {
    "bucket" : "1",
    "file.format" : "parquet",
    "manifest.compression" : "null"
  },
  "timeMillis" : 1731654078602
}
```

After checking: once the filter is passed to ORC, each query returns only 1 or 0 rows, while under the Parquet format essentially all rows are returned.

Reader code:

```java
Table table = TableUtil.getTable(); // PrimaryKeyFileStoreTable
PredicateBuilder builder = new PredicateBuilder(
        RowType.of(DataTypes.INT(),
                DataTypes.STRING(),
                DataTypes.STRING()));
int[] projection = new int[] {0, 1, 2};
ReadBuilder readBuilder = table.newReadBuilder()
        .withProjection(projection);

Random random = new Random();
long startTime = System.currentTimeMillis();
for (int i = 0; i < 30; i++) {
    InnerTableRead read = (InnerTableRead) readBuilder.newRead();
    int key = random.nextInt(4000000);
    Predicate keyFilter = builder.equal(0, key);

    InnerTableScan tableScan = (InnerTableScan) readBuilder
            .withFilter(keyFilter)
            .newScan();
    InnerTableScan innerTableScan = tableScan.withFilter(keyFilter);
    TableScan.Plan plan = innerTableScan.plan();
    List<Split> splits = plan.splits();

    read.withFilter(keyFilter); // .executeFilter();
    RecordReader<InternalRow> reader = read.createReader(splits);
    reader.forEachRemaining(internalRow -> {
        int f0 = internalRow.getInt(0);
        String f1 = internalRow.getString(1).toString();
        String f2 = internalRow.getString(2).toString();
        System.out.println(String.format("%d - {%d, %s, %s}", key, f0, f1, f2));
    });
}
long stopTime = System.currentTimeMillis();
System.out.println("time : " + (stopTime - startTime));
```
@ranxianglei The current version of Parquet only pushes filter predicates down to the RowGroup level during filtering. I’m currently working on optimizing this area. |
[email protected] is my email. @Aiden-Dong
@Aiden-Dong Are there any recent test results? I'm wondering how effective it is.
I just ran a local test and added the results under the PR.
Search before asking
Motivation
Basic Information
table
400w
1
7
The data sample :
The Write Example
The Read Example
30 point lookups with random keys per read
Time Consumption in ORC/Parquet
PARQUET reader: 17982 ms
ORC reader: 1096 ms
Root Cause Analysis
Under the current query predicate pushdown, in ORC, it can be pushed down to the column index level, whereas in Parquet, it is only pushed down to the row group level.
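The effect of pruning granularity can be illustrated with a toy model: sorted rows stored in chunks that each carry min/max statistics, where a point lookup must decode every chunk whose range contains the key. The chunk sizes below (1,000,000 rows per Parquet row group, 10,000 rows per page/row-index entry) are illustrative assumptions, not the actual sizes in the test:

```java
// Toy model of min/max pruning. Rows 0..totalRows-1 are sorted and stored
// in fixed-size chunks; a point lookup for `key` must decode every chunk
// whose [min, max] range covers it. Coarser chunks => more rows decoded.
public class PruningSketch {
    static long rowsScanned(int totalRows, int chunkSize, int key) {
        long scanned = 0;
        for (int start = 0; start < totalRows; start += chunkSize) {
            int end = Math.min(start + chunkSize, totalRows) - 1;
            // Chunk statistics: min = start, max = end (data is sorted).
            if (key >= start && key <= end) {
                scanned += end - start + 1; // decode the whole chunk
            }
        }
        return scanned;
    }

    public static void main(String[] args) {
        int totalRows = 4_000_000; // ~400w rows, as in the issue
        System.out.println("row-group-level pruning scans "
                + rowsScanned(totalRows, 1_000_000, 123_456) + " rows");
        System.out.println("page-level pruning scans "
                + rowsScanned(totalRows, 10_000, 123_456) + " rows");
    }
}
```

Under these assumptions a single lookup decodes 1,000,000 rows with row-group-level pruning but only 10,000 with page-level pruning, which is the gap the proposed Parquet fix targets.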
Solution
No response
Anything else?
No response
Are you willing to submit a PR?