TiSpark Benchmark
shiyuhang0 edited this page Jul 6, 2022 · 4 revisions
Machine * 10
* CPU: 8 Intel Xeon Processor (Icelake)
* Memory: 32G
* Disk: 500G
TiDB 5.4.0: 3 TiDB + 3 TiKV + 1 PD (TiDB and PD are on the same machine)
Spark 3.0.3 Standalone: 1 master + 3 workers
The degree of parallelism depends on the total number of executor cores: 3 workers * 8 cores = 24
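As a rough illustration of the arithmetic above, the sketch below computes the total executor cores and how many scheduling "waves" a job needs when it has more tasks than cores. It assumes Spark's default of one core per task; the function names are hypothetical, and 226 is the task count reported for the largest TiSpark write run.

```python
import math


def total_executor_cores(workers: int, cores_per_worker: int) -> int:
    """Cores available to executors across the whole cluster."""
    return workers * cores_per_worker


def task_waves(num_tasks: int, total_cores: int, cpus_per_task: int = 1) -> int:
    """Number of rounds needed to run num_tasks when only
    total_cores // cpus_per_task tasks can run at the same time."""
    slots = total_cores // cpus_per_task
    return math.ceil(num_tasks / slots)


cores = total_executor_cores(workers=3, cores_per_worker=8)
print(cores)                   # 24
print(task_waves(226, cores))  # 10
```

A job with 24 tasks or fewer fits in a single wave on this cluster; anything larger is queued behind the first 24 tasks.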
Write data from HDFS to TiDB, using data generated by TPC-H (ORDERS table)
TiSpark write benchmark
Count(*) | Data size | Task number | Time(s) |
---|---|---|---|
1,500,000 | 164M | 9 | 62 |
15,000,000 | 1.7G | 23 | 396 |
150,000,000 | 17G | 226 | 4722 |
Spark JDBC write benchmark
Count(*) | Data size | Task number | Time(s) |
---|---|---|---|
1,500,000 | 164M | 24 | 23 |
15,000,000 | 1.7G | 24 | 244 |
150,000,000 | 17G | 133 | 2483 |
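To compare the two write paths on a common scale, the elapsed times can be converted to rows per second. The sketch below does only that arithmetic, using the row counts and times copied from the two write tables above; the variable and function names are illustrative.

```python
# (rows, seconds) pairs copied from the two write tables above.
TISPARK_WRITE = [(1_500_000, 62), (15_000_000, 396), (150_000_000, 4722)]
JDBC_WRITE = [(1_500_000, 23), (15_000_000, 244), (150_000_000, 2483)]


def rows_per_second(rows: int, seconds: float) -> float:
    """Average write throughput over the whole run."""
    return rows / seconds


for (rows, t_tispark), (_, t_jdbc) in zip(TISPARK_WRITE, JDBC_WRITE):
    print(
        f"{rows:>11,} rows: "
        f"TiSpark {rows_per_second(rows, t_tispark):,.0f} rows/s, "
        f"Spark JDBC {rows_per_second(rows, t_jdbc):,.0f} rows/s, "
        f"time ratio {t_tispark / t_jdbc:.2f}"
    )
```

The time ratio is TiSpark time divided by Spark JDBC time, so a value above 1 means the JDBC write finished sooner for that data size.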
Delete data from TiDB with TiSpark (ORDERS table)
Count(*) | Data size | Task number | Time(s) |
---|---|---|---|
1,500,000 | 164M | 3 | 31 |
15,000,000 | 1.7G | 5 | 269 |
150,000,000 | 17G | 33 | 3225 |
Select with the 22 TPC-H queries and a table scan
- Spark JDBC uses the default configuration, without `partitionColumn`, `lowerBound`, or `upperBound` to partition the table
- TiSpark partitions the table for us automatically
Query | Data size | TiSpark(s) | Spark JDBC(s) |
---|---|---|---|
TPC-H 22 queries | 1G | 131 | 157 |
TPC-H 22 queries | 10G | 424 | 1793 (Q21 OOM) |
select * from orders | 164M | 5 | 10 |
select * from orders | 1.7G | 14 | 89 |
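The Spark JDBC numbers above were measured without partitioning options, so each table is read through a single JDBC partition. For comparison, a partitioned JDBC read could be configured roughly as sketched below. This was not part of the benchmark: the URL, database name, bounds, and helper name are assumptions for illustration, while `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` are the standard Spark JDBC data source options.

```python
def jdbc_partition_options(url, table, column, lower, upper, num_partitions):
    """Build Spark JDBC options for a range-partitioned read.

    Spark requires partitionColumn, lowerBound, upperBound and numPartitions
    to be given together. The bounds only decide the partition stride;
    they do not filter rows out of the result.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    return {
        "url": url,
        "dbtable": table,
        "partitionColumn": column,
        "lowerBound": str(lower),
        "upperBound": str(upper),
        "numPartitions": str(num_partitions),
    }


opts = jdbc_partition_options(
    url="jdbc:mysql://127.0.0.1:4000/tpch",  # hypothetical TiDB address and database
    table="orders",
    column="o_orderkey",   # primary key of the TPC-H ORDERS table
    lower=1,
    upper=6_000_000,       # assumed key range, adjust to the loaded data
    num_partitions=24,     # one partition per executor core in this setup
)
print(opts)

# With a live SparkSession named `spark` and a MySQL JDBC driver available:
#   df = spark.read.format("jdbc").options(**opts).load()
```

Choosing a partition count equal to the 24 executor cores is only a starting point; a skewed key distribution would call for different bounds or a different column.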
If you want to benchmark TiSpark yourself, here is a reference (Chinese only for now).
Machine * 2
* CPU: 48c
* Memory: 187G
TiDB v6.0.0: 3 TiKV
Spark v3.1.3: Local Mode
The first machine runs 2 TiKV instances; the second machine runs 1 TiKV instance and Spark.
Load 50G of TPC-DS data into TiDB. See here for the details of the data load.
Some queries are not compatible with Spark SQL and need the following changes:
- change every `date_add(start_date, interval 30 day)` to `date_add(start_date, 30)`
- change aliases from 'name' to `name`
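The two rewrites above can be applied mechanically before the queries are submitted. The following is a minimal regex-based sketch, not the procedure actually used for the benchmark. It assumes the interval is always a whole number of days and that aliases are introduced with an explicit `as`; the function name `to_spark_sql` is hypothetical.

```python
import re

# date_add(<expr>, interval <n> day)  ->  date_add(<expr>, <n>)
_DATE_ADD = re.compile(
    r"date_add\(\s*([^,()]+?)\s*,\s*interval\s+(\d+)\s+day\s*\)",
    re.IGNORECASE,
)

# as 'name'  ->  as `name`  (only quoted names that directly follow AS)
_ALIAS = re.compile(r"\b(as)\s+'([^']+)'", re.IGNORECASE)


def to_spark_sql(query: str) -> str:
    """Apply both compatibility rewrites to one query string."""
    query = _DATE_ADD.sub(r"date_add(\1, \2)", query)
    query = _ALIAS.sub(r"\1 `\2`", query)
    return query


original = "select date_add(start_date, interval 30 day) as 'name' from t"
print(to_spark_sql(original))
# select date_add(start_date, 30) as `name` from t
```

Restricting the alias rule to names that follow `as` keeps ordinary string literals such as `where state = 'CA'` untouched; a query that writes aliases without `as` would still need manual editing.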
Execute the 99 TPC-DS queries on 50G of data
storage | total time(s) |
---|---|
TiSpark on TiKV | 7504 |
TiSpark on TiFlash | 2928 (Q5 failed) |
Data | TiFlash on TiDB | TiFlash on TiSpark | Env |
---|---|---|---|
1T | 1672.783 | 4673.186 | 80C 512G+2*960SSD * 9 |
3T | 1315.159 | 6162.302 | ARM 80C 512G+SSD * 6 |
5T | 1947.046 | 6162.302 | 80C 512G+2*960SSD * 10 |