TiSpark Benchmark
shiyuhang0 edited this page Apr 21, 2022 · 4 revisions
10 machines, each with:
* CPU: 8-core Intel Xeon Processor (Icelake)
* Memory: 32G
* Disk: 500G
TiDB 5.4.0: 3 TiDB + 3 TiKV + 1 PD (TiDB and PD share a machine)
Spark 3.0.3 Standalone: 1 master + 3 workers
Parallelism is bounded by the total number of executor cores: 3 workers × 8 cores = 24
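To illustrate what a parallelism of 24 means for the task numbers reported in the tables on this page, here is a minimal sketch that computes how many scheduling waves a job needs. It assumes one core per task (Spark's default `spark.task.cpus` of 1); the helper name `waves` is hypothetical and not part of Spark or TiSpark.

```python
import math

WORKERS = 3            # Spark workers in this setup
CORES_PER_WORKER = 8   # CPU cores per machine
TOTAL_CORES = WORKERS * CORES_PER_WORKER  # at most 24 tasks run at once


def waves(task_number: int, slots: int = TOTAL_CORES) -> int:
    """Scheduling rounds needed to run task_number tasks (assumes 1 core per task)."""
    return math.ceil(task_number / slots)


# Task numbers taken from the TiSpark write table
for tasks in (9, 23, 226):
    print(tasks, "tasks ->", waves(tasks), "wave(s)")
```

Under this assumption, jobs with 24 tasks or fewer fit in a single wave, while a job with more tasks than cores has to run them in several rounds.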
Write data from HDFS to TiDB, using data generated by TPC-H (ORDERS table)
TiSpark Write benchmark
Count(*) | Data size | Task number | Time (s) |
---|---|---|---|
1,500,000 | 164M | 9 | 62 |
15,000,000 | 1.7G | 23 | 396 |
150,000,000 | 17G | 226 | 4722 |
Spark JDBC Write benchmark
Count(*) | Data size | Task number | Time (s) |
---|---|---|---|
1,500,000 | 164M | 24 | 23 |
15,000,000 | 1.7G | 24 | 244 |
150,000,000 | 17G | 133 | 2483 |
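
The elapsed times in the two write tables are easier to compare as throughput. The following sketch derives rows per second from the figures reported above; it only restates the published numbers and makes no claim about why the two writers differ. The helper name `rows_per_second` is hypothetical.

```python
# rows -> elapsed seconds, copied from the two write tables above
tispark_write = {1_500_000: 62, 15_000_000: 396, 150_000_000: 4722}
jdbc_write = {1_500_000: 23, 15_000_000: 244, 150_000_000: 2483}


def rows_per_second(rows: int, seconds: float) -> float:
    """Average write throughput over the whole job."""
    return rows / seconds


for rows in tispark_write:
    t = rows_per_second(rows, tispark_write[rows])
    j = rows_per_second(rows, jdbc_write[rows])
    print(f"{rows:>11,} rows: TiSpark {t:,.0f} rows/s, Spark JDBC {j:,.0f} rows/s")
```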
Delete data from TiDB with TiSpark (ORDERS table)
Count(*) | Data size | Task number | Time (s) |
---|---|---|---|
1,500,000 | 164M | 3 | 31 |
15,000,000 | 1.7G | 5 | 269 |
150,000,000 | 17G | 33 | 3225 |
Select with the 22 TPC-H queries and a table scan
- Spark JDBC uses the default config, without `partitionColumn`, `lowerBound`, `upperBound` to partition the table
- TiSpark partitions the table for us automatically
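For context, Spark's JDBC source reads a table through a single partition unless `partitionColumn`, `lowerBound`, `upperBound` and `numPartitions` are all supplied. The sketch below builds both option sets so the difference is visible. The host `tidb-host`, the database `tpch`, the credentials, and the bounds are placeholder assumptions, not values from this benchmark; `jdbc_read_options` is a hypothetical helper.

```python
def jdbc_read_options(url, table, user, password,
                      partition_column=None, lower_bound=None,
                      upper_bound=None, num_partitions=None):
    """Build an option dict for spark.read.format("jdbc")."""
    opts = {"url": url, "dbtable": table, "user": user, "password": password}
    partitioning = (partition_column, lower_bound, upperBound := upper_bound, num_partitions)
    if any(p is not None for p in partitioning):
        # Spark requires these four options to be given together.
        if any(p is None for p in partitioning):
            raise ValueError("partitionColumn, lowerBound, upperBound and "
                             "numPartitions must be set together")
        opts.update({
            "partitionColumn": partition_column,
            "lowerBound": str(lower_bound),
            "upperBound": str(upperBound),
            "numPartitions": str(num_partitions),
        })
    return opts


# Placeholder connection details (assumptions); TiDB speaks the MySQL protocol.
URL = "jdbc:mysql://tidb-host:4000/tpch"

# What this benchmark used for Spark JDBC: no partitioning options.
default_opts = jdbc_read_options(URL, "orders", "root", "")

# A partitioned read for comparison; column and bounds are illustrative only.
partitioned_opts = jdbc_read_options(
    URL, "orders", "root", "",
    partition_column="o_orderkey", lower_bound=1,
    upper_bound=6_000_000, num_partitions=24,
)

# With a running SparkSession named `spark`, either dict could be used as:
# df = spark.read.format("jdbc").options(**partitioned_opts).load()
```

The bounds only decide how the key range is split into partitions; they do not filter rows, so a poorly chosen range gives skewed partitions rather than missing data.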
Query | Data size | TiSpark (s) | Spark JDBC (s) |
---|---|---|---|
TPC-H 22 queries | 1G | 131 | 157 |
TPC-H 22 queries | 10G | 424 | 1793 (q21 OOM) |
select * from orders | 164M | 5 | 10 |
select * from orders | 1.7G | 14 | 89 |
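
The gap between the two readers can be expressed as a ratio. This sketch computes it from the select table above. Note that the 10G Spark JDBC run hit an OOM on q21, so that ratio is indicative only. The helper name `speedup` is hypothetical.

```python
# (TiSpark seconds, Spark JDBC seconds), copied from the select table above
select_times = {
    "TPC-H 22 queries, 1G": (131, 157),
    "TPC-H 22 queries, 10G": (424, 1793),  # Spark JDBC run hit OOM on q21
    "select * from orders, 164M": (5, 10),
    "select * from orders, 1.7G": (14, 89),
}


def speedup(tispark_s: float, jdbc_s: float) -> float:
    """Ratio of Spark JDBC time to TiSpark time (higher means TiSpark was faster)."""
    return jdbc_s / tispark_s


for name, (tispark_s, jdbc_s) in select_times.items():
    print(f"{name}: {speedup(tispark_s, jdbc_s):.1f}x")
```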
If you want to benchmark TiSpark yourself, here is a reference (Chinese only for now).