Skip to content

TPCH Queries

Harish Butani edited this page Jan 6, 2022 · 2 revisions

generated Fri Jun 04 16:33:23 PDT 2021

Table of Contents

Plan Generation Config

  • generated for tpch scale 1000 instance.
spark.sql.oracle.enable.querysplitting = true
spark.sql.oracle.querysplit.target = 4MB
spark.sql.oracle.querysplit.maxfetch.rounds=0.5

Query q1

Spark SQL

select
	l_returnflag,
	l_linestatus,
	sum(l_quantity) as sum_qty,
	sum(l_extendedprice) as sum_base_price,
	sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
	sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
	avg(l_quantity) as avg_qty,
	avg(l_extendedprice) as avg_price,
	avg(l_discount) as avg_disc,
	count(*) as count_order
from
	lineitem
where
	l_shipdate <= date '1998-12-01' - interval '90' day
group by
	l_returnflag,
	l_linestatus
order by
	l_returnflag,
	l_linestatus

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_RETURNFLAG", "L_LINESTATUS", SUM("L_QUANTITY") AS "sum_qty", SUM("L_EXTENDEDPRICE") AS "sum_base_price", SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "sum_disc_price", SUM((cast(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) as NUMBER(38, 17)) * (1.00000000000000000 + cast("L_TAX" as NUMBER(38, 17))))) AS "sum_charge", AVG("L_QUANTITY") AS "avg_qty", AVG("L_EXTENDEDPRICE") AS "avg_price", AVG("L_DISCOUNT") AS "avg_disc", COUNT(1) AS "count_order"
from TPCH.LINEITEM 
where ("L_SHIPDATE" <= TRUNC(TIMESTAMP '1998-09-02 07:00:00.000000'))
group by "L_RETURNFLAG", "L_LINESTATUS"
order by "L_RETURNFLAG" ASC NULLS FIRST, "L_LINESTATUS" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=121
Sharding details:
  shard instances cost=26337266, total query cost=26337266
  shard instances time(secs)=1029, total query time(secs)=1029
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_RETURNFLAG", "L_LINESTATUS", SUM("L_QUANTITY") AS "sum_qty", SUM("L_EXTENDEDPRICE") AS "sum_base_price", SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "sum_disc_price", SUM((cast(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) as NUMBER(38, 17)) * (1.00000000000000000 + cast("L_TAX" as NUMBER(38, 17))))) AS "sum_charge", AVG("L_QUANTITY") AS "avg_qty", AVG("L_EXTENDEDPRICE") AS "avg_price", AVG("L_DISCOUNT") AS "avg_disc", COUNT(1) AS "count_order"
from TPCH.LINEITEM 
where ("L_SHIPDATE" <= TRUNC(TIMESTAMP '1998-09-02 07:00:00.000000'))
group by "L_RETURNFLAG", "L_LINESTATUS"
order by "L_RETURNFLAG" ASC NULLS FIRST, "L_LINESTATUS" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=121
Sharding details:
  shard instances cost=26337266, total query cost=26337266
  shard instances time(secs)=1029, total query time(secs)=1029
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q2

Spark SQL

select
	s_acctbal,
	s_name,
	n_name,
	p_partkey,
	p_mfgr,
	s_address,
	s_phone,
	s_comment
from
	part,
	supplier,
	partsupp,
	nation,
	region
where
	p_partkey = ps_partkey
	and s_suppkey = ps_suppkey
	and p_size = 15
	and p_type like '%BRASS'
	and s_nationkey = n_nationkey
	and n_regionkey = r_regionkey
	and r_name = 'EUROPE'
	and ps_supplycost = (
		select
			min(ps_supplycost)
		from
			partsupp,
			supplier,
			nation,
			region
		where
			p_partkey = ps_partkey
			and s_suppkey = ps_suppkey
			and s_nationkey = n_nationkey
			and n_regionkey = r_regionkey
			and r_name = 'EUROPE'
	)
order by
	s_acctbal desc,
	n_name,
	s_name,
	p_partkey
limit 100

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "S_ACCTBAL", "S_NAME", "N_NAME", "P_PARTKEY", "P_MFGR", "S_ADDRESS", "S_PHONE", "S_COMMENT"
from ( select "S_ACCTBAL", "S_NAME", "N_NAME", "P_PARTKEY", "P_MFGR", "S_ADDRESS", "S_PHONE", "S_COMMENT"
from TPCH.PART  join TPCH.PARTSUPP "sparkora_1" on ("P_PARTKEY" = "sparkora_1"."PS_PARTKEY") join ( select "min(ps_supplycost)", "PS_PARTKEY"
from ( select MIN("PS_SUPPLYCOST") AS "min(ps_supplycost)", "PS_PARTKEY"
from TPCH.PARTSUPP  join TPCH.SUPPLIER  on ("PS_SUPPKEY" = "S_SUPPKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY") join TPCH.REGION  on ("N_REGIONKEY" = "R_REGIONKEY")
where ("R_NAME" = 'EUROPE')
group by "PS_PARTKEY" ) 
where "min(ps_supplycost)" IS NOT NULL ) "sparkora_2" on (("PS_SUPPLYCOST" = "min(ps_supplycost)") AND ("P_PARTKEY" = "sparkora_2"."PS_PARTKEY")) join TPCH.SUPPLIER  on ("PS_SUPPKEY" = "S_SUPPKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY") join TPCH.REGION  on ("N_REGIONKEY" = "R_REGIONKEY")
where ((("P_SIZE" = 15.000000000000000000) AND ("P_TYPE" LIKE CONCAT('%' , 'BRASS'))) AND ("R_NAME" = 'EUROPE'))
order by "S_ACCTBAL" DESC NULLS LAST, "N_NAME" ASC NULLS FIRST, "S_NAME" ASC NULLS FIRST, "P_PARTKEY" ASC NULLS FIRST ) 
where rownum <= 100
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 100, bytes=15100
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Query is not split

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "S_ACCTBAL", "S_NAME", "N_NAME", "P_PARTKEY", "P_MFGR", "S_ADDRESS", "S_PHONE", "S_COMMENT"
from ( select "S_ACCTBAL", "S_NAME", "N_NAME", "P_PARTKEY", "P_MFGR", "S_ADDRESS", "S_PHONE", "S_COMMENT"
from TPCH.PART  join TPCH.PARTSUPP "sparkora_1" on ("P_PARTKEY" = "sparkora_1"."PS_PARTKEY") join ( select "min(ps_supplycost)", "PS_PARTKEY"
from ( select MIN("PS_SUPPLYCOST") AS "min(ps_supplycost)", "PS_PARTKEY"
from TPCH.PARTSUPP  join TPCH.SUPPLIER  on ("PS_SUPPKEY" = "S_SUPPKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY") join TPCH.REGION  on ("N_REGIONKEY" = "R_REGIONKEY")
where ("R_NAME" = 'EUROPE')
group by "PS_PARTKEY" ) 
where "min(ps_supplycost)" IS NOT NULL ) "sparkora_2" on (("PS_SUPPLYCOST" = "min(ps_supplycost)") AND ("P_PARTKEY" = "sparkora_2"."PS_PARTKEY")) join TPCH.SUPPLIER  on ("PS_SUPPKEY" = "S_SUPPKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY") join TPCH.REGION  on ("N_REGIONKEY" = "R_REGIONKEY")
where ((("P_SIZE" = 15.000000000000000000) AND ("P_TYPE" LIKE CONCAT('%' , 'BRASS'))) AND ("R_NAME" = 'EUROPE'))
order by "S_ACCTBAL" DESC NULLS LAST, "N_NAME" ASC NULLS FIRST, "S_NAME" ASC NULLS FIRST, "P_PARTKEY" ASC NULLS FIRST ) 
where rownum <= 100
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 100, bytes=15100
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Query is not split

Query q3

Spark SQL

select
	l_orderkey,
	sum(l_extendedprice * (1 - l_discount)) as revenue,
	o_orderdate,
	o_shippriority
from
	customer,
	orders,
	lineitem
where
	c_mktsegment = 'BUILDING'
	and c_custkey = o_custkey
	and l_orderkey = o_orderkey
	and o_orderdate < date '1995-03-15'
	and l_shipdate > date '1995-03-15'
group by
	l_orderkey,
	o_orderdate,
	o_shippriority
order by
	revenue desc,
	o_orderdate
limit 10

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_ORDERKEY", "revenue", "O_ORDERDATE", "O_SHIPPRIORITY"
from ( select "L_ORDERKEY", SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "revenue", "O_ORDERDATE", "O_SHIPPRIORITY"
from TPCH.CUSTOMER  join TPCH.ORDERS  on ("C_CUSTKEY" = "O_CUSTKEY") join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY")
where ((("C_MKTSEGMENT" = 'BUILDING') AND ("O_ORDERDATE" < TRUNC(TIMESTAMP '1995-03-15 08:00:00.000000'))) AND ("L_SHIPDATE" > TRUNC(TIMESTAMP '1995-03-15 08:00:00.000000')))
group by "L_ORDERKEY", "O_ORDERDATE", "O_SHIPPRIORITY"
order by "revenue" DESC NULLS LAST, "O_ORDERDATE" ASC NULLS FIRST ) 
where rownum <= 10
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=48
Sharding details:
  shard instances cost=35075645, total query cost=35075645
  shard instances time(secs)=1371, total query time(secs)=1371
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_ORDERKEY", "revenue", "O_ORDERDATE", "O_SHIPPRIORITY"
from ( select "L_ORDERKEY", SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "revenue", "O_ORDERDATE", "O_SHIPPRIORITY"
from TPCH.CUSTOMER  join TPCH.ORDERS  on ("C_CUSTKEY" = "O_CUSTKEY") join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY")
where ((("C_MKTSEGMENT" = 'BUILDING') AND ("O_ORDERDATE" < TRUNC(TIMESTAMP '1995-03-15 08:00:00.000000'))) AND ("L_SHIPDATE" > TRUNC(TIMESTAMP '1995-03-15 08:00:00.000000')))
group by "L_ORDERKEY", "O_ORDERDATE", "O_SHIPPRIORITY"
order by "revenue" DESC NULLS LAST, "O_ORDERDATE" ASC NULLS FIRST ) 
where rownum <= 10
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=48
Sharding details:
  shard instances cost=35075645, total query cost=35075645
  shard instances time(secs)=1371, total query time(secs)=1371
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q4

Spark SQL

select
	o_orderpriority,
	count(*) as order_count
from
	orders
where
	o_orderdate >= date '1993-07-01'
	and o_orderdate < date '1993-07-01' + interval '3' month
	and exists (
		select
			*
		from
			lineitem
		where
			l_orderkey = o_orderkey
			and l_commitdate < l_receiptdate
	)
group by
	o_orderpriority
order by
	o_orderpriority

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "O_ORDERPRIORITY", COUNT(1) AS "order_count"
from TPCH.ORDERS 
where ((("O_ORDERDATE" >= TRUNC(TIMESTAMP '1993-07-01 07:00:00.000000')) AND ("O_ORDERDATE" < TRUNC(TIMESTAMP '1993-10-01 07:00:00.000000'))) AND  "O_ORDERKEY" IN ( select "L_ORDERKEY"
from TPCH.LINEITEM 
where ("L_COMMITDATE" < "L_RECEIPTDATE") ))
group by "O_ORDERPRIORITY"
order by "O_ORDERPRIORITY" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=23
Sharding details:
  shard instances cost=6324138, total query cost=6324138
  shard instances time(secs)=248, total query time(secs)=248
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "O_ORDERPRIORITY", COUNT(1) AS "order_count"
from TPCH.ORDERS 
where ((("O_ORDERDATE" >= TRUNC(TIMESTAMP '1993-07-01 07:00:00.000000')) AND ("O_ORDERDATE" < TRUNC(TIMESTAMP '1993-10-01 07:00:00.000000'))) AND  "O_ORDERKEY" IN ( select "L_ORDERKEY"
from TPCH.LINEITEM 
where ("L_COMMITDATE" < "L_RECEIPTDATE") ))
group by "O_ORDERPRIORITY"
order by "O_ORDERPRIORITY" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=23
Sharding details:
  shard instances cost=6324138, total query cost=6324138
  shard instances time(secs)=248, total query time(secs)=248
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q5

Spark SQL

select
	n_name,
	sum(l_extendedprice * (1 - l_discount)) as revenue
from
	customer,
	orders,
	lineitem,
	supplier,
	nation,
	region
where
	c_custkey = o_custkey
	and l_orderkey = o_orderkey
	and l_suppkey = s_suppkey
	and c_nationkey = s_nationkey
	and s_nationkey = n_nationkey
	and n_regionkey = r_regionkey
	and r_name = 'ASIA'
	and o_orderdate >= date '1994-01-01'
	and o_orderdate < date '1994-01-01' + interval '1' year
group by
	n_name
order by
	revenue desc

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "N_NAME", SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "revenue"
from TPCH.CUSTOMER  join TPCH.ORDERS  on ("C_CUSTKEY" = "O_CUSTKEY") join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY") join TPCH.SUPPLIER  on (("L_SUPPKEY" = "S_SUPPKEY") AND ("C_NATIONKEY" = "S_NATIONKEY")) join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY") join TPCH.REGION  on ("N_REGIONKEY" = "R_REGIONKEY")
where ((("O_ORDERDATE" >= TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000')) AND ("O_ORDERDATE" < TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000'))) AND ("R_NAME" = 'ASIA'))
group by "N_NAME"
order by "revenue" DESC NULLS LAST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=22
Sharding details:
  shard instances cost=43517888, total query cost=43517888
  shard instances time(secs)=1700, total query time(secs)=1700
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "N_NAME", SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "revenue"
from TPCH.CUSTOMER  join TPCH.ORDERS  on ("C_CUSTKEY" = "O_CUSTKEY") join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY") join TPCH.SUPPLIER  on (("L_SUPPKEY" = "S_SUPPKEY") AND ("C_NATIONKEY" = "S_NATIONKEY")) join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY") join TPCH.REGION  on ("N_REGIONKEY" = "R_REGIONKEY")
where ((("O_ORDERDATE" >= TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000')) AND ("O_ORDERDATE" < TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000'))) AND ("R_NAME" = 'ASIA'))
group by "N_NAME"
order by "revenue" DESC NULLS LAST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=22
Sharding details:
  shard instances cost=43517888, total query cost=43517888
  shard instances time(secs)=1700, total query time(secs)=1700
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q6

Spark SQL

select
	sum(l_extendedprice * l_discount) as revenue
from
	lineitem
where
	l_shipdate >= date '1994-01-01'
	and l_shipdate < date '1994-01-01' + interval '1' year
	and l_discount between .06 - 0.01 and .06 + 0.01
	and l_quantity < 24

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select SUM(("L_EXTENDEDPRICE" * "L_DISCOUNT")) AS "revenue"
from TPCH.LINEITEM 
where ((((("L_SHIPDATE" >= TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000'))) AND ("L_DISCOUNT" >= 0.050000000000000000)) AND ("L_DISCOUNT" <= 0.070000000000000000)) AND ("L_QUANTITY" < 24.000000000000000000))
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=13
Sharding details:
  shard instances cost=26327610, total query cost=26327610
  shard instances time(secs)=1029, total query time(secs)=1029
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select SUM(("L_EXTENDEDPRICE" * "L_DISCOUNT")) AS "revenue"
from TPCH.LINEITEM 
where ((((("L_SHIPDATE" >= TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000'))) AND ("L_DISCOUNT" >= 0.050000000000000000)) AND ("L_DISCOUNT" <= 0.070000000000000000)) AND ("L_QUANTITY" < 24.000000000000000000))
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=13
Sharding details:
  shard instances cost=26327610, total query cost=26327610
  shard instances time(secs)=1029, total query time(secs)=1029
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q7

Spark SQL

select
	supp_nation,
	cust_nation,
	l_year,
	sum(volume) as revenue
from
	(
		select
			n1.n_name as supp_nation,
			n2.n_name as cust_nation,
			year(l_shipdate) as l_year,
			l_extendedprice * (1 - l_discount) as volume
		from
			supplier,
			lineitem,
			orders,
			customer,
			nation n1,
			nation n2
		where
			s_suppkey = l_suppkey
			and o_orderkey = l_orderkey
			and c_custkey = o_custkey
			and s_nationkey = n1.n_nationkey
			and c_nationkey = n2.n_nationkey
			and (
				(n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY')
				or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE')
			)
			and l_shipdate between date '1995-01-01' and date '1996-12-31'
	) as shipping
group by
	supp_nation,
	cust_nation,
	l_year
order by
	supp_nation,
	cust_nation,
	l_year

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "supp_nation", "cust_nation", "l_year", SUM("volume") AS "revenue"
from ( select "sparkora_4"."N_NAME" AS "supp_nation", "sparkora_5"."N_NAME" AS "cust_nation",  extract(YEAR from "L_SHIPDATE")  AS "l_year", ("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) AS "volume"
from TPCH.SUPPLIER  join TPCH.LINEITEM  on ("S_SUPPKEY" = "L_SUPPKEY") join TPCH.ORDERS  on ("L_ORDERKEY" = "O_ORDERKEY") join TPCH.CUSTOMER  on ("O_CUSTKEY" = "C_CUSTKEY") join TPCH.NATION "sparkora_4" on ("S_NATIONKEY" = "sparkora_4"."N_NATIONKEY") join ( select "N_NATIONKEY", "N_NAME"
from TPCH.NATION 
where (("N_NAME" = 'GERMANY') OR ("N_NAME" = 'FRANCE')) ) "sparkora_5" on (("C_NATIONKEY" = "sparkora_5"."N_NATIONKEY") AND ((("sparkora_4"."N_NAME" = 'FRANCE') AND ("sparkora_5"."N_NAME" = 'GERMANY')) OR (("sparkora_4"."N_NAME" = 'GERMANY') AND ("sparkora_5"."N_NAME" = 'FRANCE'))))
where ((("L_SHIPDATE" >= TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000')) AND ("L_SHIPDATE" <= TRUNC(TIMESTAMP '1996-12-31 08:00:00.000000'))) AND (("sparkora_4"."N_NAME" = 'FRANCE') OR ("sparkora_4"."N_NAME" = 'GERMANY'))) ) 
group by "supp_nation", "cust_nation", "l_year"
order by "supp_nation" ASC NULLS FIRST, "cust_nation" ASC NULLS FIRST, "l_year" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=39
Sharding details:
  shard instances cost=35823095, total query cost=35823095
  shard instances time(secs)=1400, total query time(secs)=1400
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "supp_nation", "cust_nation", "l_year", SUM("volume") AS "revenue"
from ( select "sparkora_4"."N_NAME" AS "supp_nation", "sparkora_5"."N_NAME" AS "cust_nation",  extract(YEAR from "L_SHIPDATE")  AS "l_year", ("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) AS "volume"
from TPCH.SUPPLIER  join TPCH.LINEITEM  on ("S_SUPPKEY" = "L_SUPPKEY") join TPCH.ORDERS  on ("L_ORDERKEY" = "O_ORDERKEY") join TPCH.CUSTOMER  on ("O_CUSTKEY" = "C_CUSTKEY") join TPCH.NATION "sparkora_4" on ("S_NATIONKEY" = "sparkora_4"."N_NATIONKEY") join ( select "N_NATIONKEY", "N_NAME"
from TPCH.NATION 
where (("N_NAME" = 'GERMANY') OR ("N_NAME" = 'FRANCE')) ) "sparkora_5" on (("C_NATIONKEY" = "sparkora_5"."N_NATIONKEY") AND ((("sparkora_4"."N_NAME" = 'FRANCE') AND ("sparkora_5"."N_NAME" = 'GERMANY')) OR (("sparkora_4"."N_NAME" = 'GERMANY') AND ("sparkora_5"."N_NAME" = 'FRANCE'))))
where ((("L_SHIPDATE" >= TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000')) AND ("L_SHIPDATE" <= TRUNC(TIMESTAMP '1996-12-31 08:00:00.000000'))) AND (("sparkora_4"."N_NAME" = 'FRANCE') OR ("sparkora_4"."N_NAME" = 'GERMANY'))) ) 
group by "supp_nation", "cust_nation", "l_year"
order by "supp_nation" ASC NULLS FIRST, "cust_nation" ASC NULLS FIRST, "l_year" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=39
Sharding details:
  shard instances cost=35823095, total query cost=35823095
  shard instances time(secs)=1400, total query time(secs)=1400
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q8

Spark SQL

select
	o_year,
	sum(case
		when nation = 'BRAZIL' then volume
		else 0
	end) / sum(volume) as mkt_share
from
	(
		select
			year(o_orderdate) as o_year,
			l_extendedprice * (1 - l_discount) as volume,
			n2.n_name as nation
		from
			part,
			supplier,
			lineitem,
			orders,
			customer,
			nation n1,
			nation n2,
			region
		where
			p_partkey = l_partkey
			and s_suppkey = l_suppkey
			and l_orderkey = o_orderkey
			and o_custkey = c_custkey
			and c_nationkey = n1.n_nationkey
			and n1.n_regionkey = r_regionkey
			and r_name = 'AMERICA'
			and s_nationkey = n2.n_nationkey
			and o_orderdate between date '1995-01-01' and date '1996-12-31'
			and p_type = 'ECONOMY ANODIZED STEEL'
	) as all_nations
group by
	o_year
order by
	o_year

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "o_year", (SUM(CASE WHEN ("nation" = 'BRAZIL') THEN "volume" ELSE 0.000000 END) / SUM("volume")) AS "mkt_share"
from ( select  extract(YEAR from "O_ORDERDATE")  AS "o_year", ("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) AS "volume", "sparkora_6"."N_NAME" AS "nation"
from TPCH.PART  join TPCH.LINEITEM  on ("P_PARTKEY" = "L_PARTKEY") join TPCH.SUPPLIER  on ("L_SUPPKEY" = "S_SUPPKEY") join TPCH.ORDERS  on ("L_ORDERKEY" = "O_ORDERKEY") join TPCH.CUSTOMER  on ("O_CUSTKEY" = "C_CUSTKEY") join TPCH.NATION "sparkora_5" on ("C_NATIONKEY" = "sparkora_5"."N_NATIONKEY") join TPCH.NATION "sparkora_6" on ("S_NATIONKEY" = "sparkora_6"."N_NATIONKEY") join TPCH.REGION  on ("sparkora_5"."N_REGIONKEY" = "R_REGIONKEY")
where ((("P_TYPE" = 'ECONOMY ANODIZED STEEL') AND (("O_ORDERDATE" >= TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000')) AND ("O_ORDERDATE" <= TRUNC(TIMESTAMP '1996-12-31 08:00:00.000000')))) AND ("R_NAME" = 'AMERICA')) ) 
group by "o_year"
order by "o_year" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=34
Sharding details:
  shard instances cost=45613820, total query cost=45613820
  shard instances time(secs)=1782, total query time(secs)=1782
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "o_year", (SUM(CASE WHEN ("nation" = 'BRAZIL') THEN "volume" ELSE 0.000000 END) / SUM("volume")) AS "mkt_share"
from ( select  extract(YEAR from "O_ORDERDATE")  AS "o_year", ("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) AS "volume", "sparkora_6"."N_NAME" AS "nation"
from TPCH.PART  join TPCH.LINEITEM  on ("P_PARTKEY" = "L_PARTKEY") join TPCH.SUPPLIER  on ("L_SUPPKEY" = "S_SUPPKEY") join TPCH.ORDERS  on ("L_ORDERKEY" = "O_ORDERKEY") join TPCH.CUSTOMER  on ("O_CUSTKEY" = "C_CUSTKEY") join TPCH.NATION "sparkora_5" on ("C_NATIONKEY" = "sparkora_5"."N_NATIONKEY") join TPCH.NATION "sparkora_6" on ("S_NATIONKEY" = "sparkora_6"."N_NATIONKEY") join TPCH.REGION  on ("sparkora_5"."N_REGIONKEY" = "R_REGIONKEY")
where ((("P_TYPE" = 'ECONOMY ANODIZED STEEL') AND (("O_ORDERDATE" >= TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000')) AND ("O_ORDERDATE" <= TRUNC(TIMESTAMP '1996-12-31 08:00:00.000000')))) AND ("R_NAME" = 'AMERICA')) ) 
group by "o_year"
order by "o_year" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=34
Sharding details:
  shard instances cost=45613820, total query cost=45613820
  shard instances time(secs)=1782, total query time(secs)=1782
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q9

Spark SQL

select
	nation,
	o_year,
	sum(amount) as sum_profit
from
	(
		select
			n_name as nation,
			year(o_orderdate) as o_year,
			l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
		from
			part,
			supplier,
			lineitem,
			partsupp,
			orders,
			nation
		where
			s_suppkey = l_suppkey
			and ps_suppkey = l_suppkey
			and ps_partkey = l_partkey
			and p_partkey = l_partkey
			and o_orderkey = l_orderkey
			and s_nationkey = n_nationkey
			and p_name like '%green%'
	) as profit
group by
	nation,
	o_year
order by
	nation,
	o_year desc

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "nation", "o_year", SUM("amount") AS "sum_profit"
from ( select "N_NAME" AS "nation",  extract(YEAR from "O_ORDERDATE")  AS "o_year", (("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) - ("PS_SUPPLYCOST" * "L_QUANTITY")) AS "amount"
from TPCH.PART  join TPCH.LINEITEM  on ("P_PARTKEY" = "L_PARTKEY") join TPCH.SUPPLIER  on ("L_SUPPKEY" = "S_SUPPKEY") join TPCH.PARTSUPP  on (("L_SUPPKEY" = "PS_SUPPKEY") AND ("L_PARTKEY" = "PS_PARTKEY")) join TPCH.ORDERS  on ("L_ORDERKEY" = "O_ORDERKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY")
where  ("P_NAME" LIKE '%green%')  ) 
group by "nation", "o_year"
order by "nation" ASC NULLS FIRST, "o_year" DESC NULLS LAST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=30
Sharding details:
  shard instances cost=61132960, total query cost=61132960
  shard instances time(secs)=2389, total query time(secs)=2389
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "nation", "o_year", SUM("amount") AS "sum_profit"
from ( select "N_NAME" AS "nation",  extract(YEAR from "O_ORDERDATE")  AS "o_year", (("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) - ("PS_SUPPLYCOST" * "L_QUANTITY")) AS "amount"
from TPCH.PART  join TPCH.LINEITEM  on ("P_PARTKEY" = "L_PARTKEY") join TPCH.SUPPLIER  on ("L_SUPPKEY" = "S_SUPPKEY") join TPCH.PARTSUPP  on (("L_SUPPKEY" = "PS_SUPPKEY") AND ("L_PARTKEY" = "PS_PARTKEY")) join TPCH.ORDERS  on ("L_ORDERKEY" = "O_ORDERKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY")
where  ("P_NAME" LIKE '%green%')  ) 
group by "nation", "o_year"
order by "nation" ASC NULLS FIRST, "o_year" DESC NULLS LAST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=30
Sharding details:
  shard instances cost=61132960, total query cost=61132960
  shard instances time(secs)=2389, total query time(secs)=2389
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q10

Spark SQL

select
	c_custkey,
	c_name,
	sum(l_extendedprice * (1 - l_discount)) as revenue,
	c_acctbal,
	n_name,
	c_address,
	c_phone,
	c_comment
from
	customer,
	orders,
	lineitem,
	nation
where
	c_custkey = o_custkey
	and l_orderkey = o_orderkey
	and o_orderdate >= date '1993-10-01'
	and o_orderdate < date '1993-10-01' + interval '3' month
	and l_returnflag = 'R'
	and c_nationkey = n_nationkey
group by
	c_custkey,
	c_name,
	c_acctbal,
	c_phone,
	n_name,
	c_address,
	c_comment
order by
	revenue desc
limit 20

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "C_CUSTKEY", "C_NAME", "revenue", "C_ACCTBAL", "N_NAME", "C_ADDRESS", "C_PHONE", "C_COMMENT"
from ( select "C_CUSTKEY", "C_NAME", SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "revenue", "C_ACCTBAL", "N_NAME", "C_ADDRESS", "C_PHONE", "C_COMMENT"
from TPCH.CUSTOMER  join TPCH.ORDERS  on ("C_CUSTKEY" = "O_CUSTKEY") join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY") join TPCH.NATION  on ("C_NATIONKEY" = "N_NATIONKEY")
where ((("O_ORDERDATE" >= TRUNC(TIMESTAMP '1993-10-01 07:00:00.000000')) AND ("O_ORDERDATE" < TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000'))) AND ("L_RETURNFLAG" = 'R'))
group by "C_CUSTKEY", "C_NAME", "C_ACCTBAL", "C_PHONE", "N_NAME", "C_ADDRESS", "C_COMMENT"
order by "revenue" DESC NULLS LAST ) 
where rownum <= 20
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=158
Sharding details:
  shard instances cost=35544298, total query cost=36478267
  shard instances time(secs)=1389, total query time(secs)=1425
  num of shard queries in plan=1, joins in coordinator=true, table scans in coordinator=true
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

TakeOrderedAndProject (1)
+- HashAggregate (2)
   +- HashAggregate (3)
      +- Project (4)
         +- BatchScan (5)

(5) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "C_CUSTKEY", "C_NAME", "C_ADDRESS", "C_PHONE", "C_ACCTBAL", "C_COMMENT", "L_EXTENDEDPRICE", "L_DISCOUNT", "N_NAME"
from TPCH.CUSTOMER  join TPCH.ORDERS  on ("C_CUSTKEY" = "O_CUSTKEY") join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY") join TPCH.NATION  on ("C_NATIONKEY" = "N_NATIONKEY")
where ((("O_ORDERDATE" >= TRUNC(TIMESTAMP '1993-10-01 07:00:00.000000')) AND ("O_ORDERDATE" < TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000'))) AND ("L_RETURNFLAG" = 'R'))
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 2071935, bytes=449609895
split target candidates:
name=ORDERS, row_count=1250119, bytes=36253451, partitions=(1, 90)
name=LINEITEM, row_count=2, bytes=50
Query split by partitions

Target table:
name=ORDERS, row_count=1250119, bytes=36253451, partitions=(1, 90)
Splits:
partitions = ORDERS_P1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 2091752, bytes=453910184
split target candidates:
name=ORDERS, row_count=1249879, bytes=36246491, partitions=(1, 90)
name=LINEITEM, row_count=2, bytes=50
Query split by partitions

Target table:
name=ORDERS, row_count=1249879, bytes=36246491, partitions=(1, 90)
Splits:
partitions = ORDERS_P1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 2043629, bytes=443467493
split target candidates:
name=ORDERS, row_count=1250002, bytes=36250058, partitions=(1, 90)
name=LINEITEM, row_count=2, bytes=50
Query split by partitions

Target table:
name=ORDERS, row_count=1250002, bytes=36250058, partitions=(1, 90)
Splits:
partitions = ORDERS_P1

Query q11

Spark SQL

select
	ps_partkey,
	sum(ps_supplycost * ps_availqty) as value
from
	partsupp,
	supplier,
	nation
where
	ps_suppkey = s_suppkey
	and s_nationkey = n_nationkey
	and n_name = 'GERMANY'
group by
	ps_partkey having
		sum(ps_supplycost * ps_availqty) > (
			select
				sum(ps_supplycost * ps_availqty) * 0.0001000000
			from
				partsupp,
				supplier,
				nation
			where
				ps_suppkey = s_suppkey
				and s_nationkey = n_nationkey
				and n_name = 'GERMANY'
		)
order by
	value desc

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "PS_PARTKEY", "value"
from ( select "PS_PARTKEY", SUM(("PS_SUPPLYCOST" * "PS_AVAILQTY")) AS "value", SUM(("PS_SUPPLYCOST" * "PS_AVAILQTY")) AS "1_sparkora"
from TPCH.PARTSUPP  join TPCH.SUPPLIER  on ("PS_SUPPKEY" = "S_SUPPKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY")
where ("N_NAME" = 'GERMANY')
group by "PS_PARTKEY" ) 
where ("1_sparkora" IS NOT NULL AND ("1_sparkora" >  ( select (cast(SUM(("PS_SUPPLYCOST" * "PS_AVAILQTY")) as NUMBER(38, 10)) * 0.0001000000) AS "1_sparkora"
from TPCH.PARTSUPP  join TPCH.SUPPLIER  on ("PS_SUPPKEY" = "S_SUPPKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY")
where ("N_NAME" = 'GERMANY') )))
order by "value" DESC NULLS LAST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1600000, bytes=68800000
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Query split by result rows
scn: 109978417

Splits:
offset = 0, numRows=400000
offset = 400000, numRows=400000
offset = 800000, numRows=400000
offset = 1200000, numRows=-1

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "PS_PARTKEY", "value"
from ( select "PS_PARTKEY", SUM(("PS_SUPPLYCOST" * "PS_AVAILQTY")) AS "value", SUM(("PS_SUPPLYCOST" * "PS_AVAILQTY")) AS "1_sparkora"
from TPCH.PARTSUPP  join TPCH.SUPPLIER  on ("PS_SUPPKEY" = "S_SUPPKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY")
where ("N_NAME" = 'GERMANY')
group by "PS_PARTKEY" ) 
where ("1_sparkora" IS NOT NULL AND ("1_sparkora" >  ( select (cast(SUM(("PS_SUPPLYCOST" * "PS_AVAILQTY")) as NUMBER(38, 10)) * 0.0001000000) AS "1_sparkora"
from TPCH.PARTSUPP  join TPCH.SUPPLIER  on ("PS_SUPPKEY" = "S_SUPPKEY") join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY")
where ("N_NAME" = 'GERMANY') )))
order by "value" DESC NULLS LAST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1600000, bytes=68800000
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Query split by result rows
scn: 109978420

Splits:
offset = 0, numRows=400000
offset = 400000, numRows=400000
offset = 800000, numRows=400000
offset = 1200000, numRows=-1

Query q12

Spark SQL

select
	l_shipmode,
	sum(case
		when o_orderpriority = '1-URGENT'
			or o_orderpriority = '2-HIGH'
			then 1
		else 0
	end) as high_line_count,
	sum(case
		when o_orderpriority <> '1-URGENT'
			and o_orderpriority <> '2-HIGH'
			then 1
		else 0
	end) as low_line_count
from
	orders,
	lineitem
where
	o_orderkey = l_orderkey
	and l_shipmode in ('MAIL', 'SHIP')
	and l_commitdate < l_receiptdate
	and l_shipdate < l_commitdate
	and l_receiptdate >= date '1994-01-01'
	and l_receiptdate < date '1994-01-01' + interval '1' year
group by
	l_shipmode
order by
	l_shipmode

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_SHIPMODE", SUM(CASE WHEN (("O_ORDERPRIORITY" = '1-URGENT') OR ("O_ORDERPRIORITY" = '2-HIGH')) THEN 1 ELSE 0 END) AS "high_line_count", SUM(CASE WHEN (NOT("O_ORDERPRIORITY" = '1-URGENT') AND NOT("O_ORDERPRIORITY" = '2-HIGH')) THEN 1 ELSE 0 END) AS "low_line_count"
from TPCH.ORDERS  join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY")
where ((((("L_COMMITDATE" < "L_RECEIPTDATE") AND ("L_SHIPDATE" < "L_COMMITDATE")) AND "L_SHIPMODE" IN ( 'MAIL', 'SHIP' )) AND ("L_RECEIPTDATE" >= TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000'))) AND ("L_RECEIPTDATE" < TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000')))
group by "L_SHIPMODE"
order by "L_SHIPMODE" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=32
Sharding details:
  shard instances cost=32667005, total query cost=32667005
  shard instances time(secs)=1277, total query time(secs)=1277
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_SHIPMODE", SUM(CASE WHEN (("O_ORDERPRIORITY" = '1-URGENT') OR ("O_ORDERPRIORITY" = '2-HIGH')) THEN 1 ELSE 0 END) AS "high_line_count", SUM(CASE WHEN (NOT("O_ORDERPRIORITY" = '1-URGENT') AND NOT("O_ORDERPRIORITY" = '2-HIGH')) THEN 1 ELSE 0 END) AS "low_line_count"
from TPCH.ORDERS  join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY")
where ((((("L_COMMITDATE" < "L_RECEIPTDATE") AND ("L_SHIPDATE" < "L_COMMITDATE")) AND "L_SHIPMODE" IN ( 'MAIL', 'SHIP' )) AND ("L_RECEIPTDATE" >= TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000'))) AND ("L_RECEIPTDATE" < TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000')))
group by "L_SHIPMODE"
order by "L_SHIPMODE" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=32
Sharding details:
  shard instances cost=32667005, total query cost=32667005
  shard instances time(secs)=1277, total query time(secs)=1277
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q13

Spark SQL

select
	c_count,
	count(*) as custdist
from
	(
		select
			c_custkey,
			count(o_orderkey) as c_count
		from
			customer left outer join orders on
				c_custkey = o_custkey
				and o_comment not like '%special%requests%'
		group by
			c_custkey
	) as c_orders
group by
	c_count
order by
	custdist desc,
	c_count desc

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "c_count", COUNT(1) AS "custdist"
from ( select COUNT("O_ORDERKEY") AS "c_count"
from TPCH.CUSTOMER  left outer join ( select "O_ORDERKEY", "O_CUSTKEY"
from TPCH.ORDERS 
where  ("O_COMMENT" NOT LIKE '%special%requests%' ESCAPE '\')  )  on ("C_CUSTKEY" = "O_CUSTKEY")
group by "C_CUSTKEY" ) 
group by "c_count"
order by "custdist" DESC NULLS LAST, "c_count" DESC NULLS LAST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 75000000, bytes=975000000
Sharding details:
  shard instances cost=6329073, total query cost=7089987
  shard instances time(secs)=248, total query time(secs)=277
  num of shard queries in plan=1, joins in coordinator=true, table scans in coordinator=true
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Sort (1)
+- HashAggregate (2)
   +- HashAggregate (3)
      +- HashAggregate (4)
         +- HashAggregate (5)
            +- Project (6)
               +- BatchScan (7)

(7) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "C_CUSTKEY", "O_ORDERKEY"
from TPCH.CUSTOMER  left outer join ( select "O_ORDERKEY", "O_CUSTKEY"
from TPCH.ORDERS 
where  ("O_COMMENT" NOT LIKE '%special%requests%' ESCAPE '\')  )  on ("C_CUSTKEY" = "O_CUSTKEY")
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 152438628, bytes=10823142588
Query split by result rows
scn: 109978429

Splits:
offset = 0, numRows=38109657
offset = 38109657, numRows=38109657
offset = 76219314, numRows=38109657
offset = 114328971, numRows=-1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 152426874, bytes=10822308054
Query split by result rows
scn: 109978220

Splits:
offset = 0, numRows=38106719
offset = 38106719, numRows=38106719
offset = 76213438, numRows=38106719
offset = 114320157, numRows=-1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 152431197, bytes=10822614987
Query split by result rows
scn: 109978611

Splits:
offset = 0, numRows=38107800
offset = 38107800, numRows=38107800
offset = 76215600, numRows=38107800
offset = 114323400, numRows=-1

Query q14

Spark SQL

select
	100.00 * sum(case
		when p_type like 'PROMO%'
			then l_extendedprice * (1 - l_discount)
		else 0
	end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
	lineitem,
	part
where
	l_partkey = p_partkey
	and l_shipdate >= date '1995-09-01'
	and l_shipdate < date '1995-09-01' + interval '1' month

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select ((100.000000 * SUM(CASE WHEN ("P_TYPE" LIKE CONCAT('PROMO' , '%')) THEN ("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) ELSE 0.000000 END)) / SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))))) AS "promo_revenue"
from TPCH.LINEITEM  join TPCH.PART  on ("L_PARTKEY" = "P_PARTKEY")
where (("L_SHIPDATE" >= TRUNC(TIMESTAMP '1995-09-01 07:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1995-10-01 07:00:00.000000')))
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=26
Sharding details:
  shard instances cost=27665041, total query cost=27665041
  shard instances time(secs)=1081, total query time(secs)=1081
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select ((100.000000 * SUM(CASE WHEN ("P_TYPE" LIKE CONCAT('PROMO' , '%')) THEN ("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))) ELSE 0.000000 END)) / SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18))))) AS "promo_revenue"
from TPCH.LINEITEM  join TPCH.PART  on ("L_PARTKEY" = "P_PARTKEY")
where (("L_SHIPDATE" >= TRUNC(TIMESTAMP '1995-09-01 07:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1995-10-01 07:00:00.000000')))
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=26
Sharding details:
  shard instances cost=27665041, total query cost=27665041
  shard instances time(secs)=1081, total query time(secs)=1081
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q15

Spark SQL

with revenue0 as
	(select
		l_suppkey as supplier_no,
		sum(l_extendedprice * (1 - l_discount)) as total_revenue
	from
		lineitem
	where
		l_shipdate >= date '1996-01-01'
		and l_shipdate < date '1996-01-01' + interval '3' month
	group by
		l_suppkey)
select
	s_suppkey,
	s_name,
	s_address,
	s_phone,
	total_revenue
from
	supplier,
	revenue0
where
	s_suppkey = supplier_no
	and total_revenue = (
		select
			max(total_revenue)
		from
			revenue0
	)
order by
	s_suppkey

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "S_SUPPKEY", "S_NAME", "S_ADDRESS", "S_PHONE", "total_revenue"
from TPCH.SUPPLIER  join ( select "supplier_no", "total_revenue"
from ( select "L_SUPPKEY" AS "supplier_no", SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "total_revenue"
from TPCH.LINEITEM 
where (("L_SHIPDATE" >= TRUNC(TIMESTAMP '1996-01-01 08:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1996-04-01 08:00:00.000000')))
group by "L_SUPPKEY" ) 
where ("total_revenue" IS NOT NULL AND ("total_revenue" =  ( select MAX("total_revenue") AS "max(total_revenue)"
from ( select SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "total_revenue"
from TPCH.LINEITEM 
where (("L_SHIPDATE" >= TRUNC(TIMESTAMP '1996-01-01 08:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1996-04-01 08:00:00.000000')))
group by "L_SUPPKEY" )  ))) )  on ("S_SUPPKEY" = "supplier_no")
order by "S_SUPPKEY" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 485272, bytes=42218664
Sharding details:
  shard instances cost=52655178, total query cost=26479359
  shard instances time(secs)=2058, total query time(secs)=1035
  num of shard queries in plan=2, joins in coordinator=true, table scans in coordinator=true
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Sort (1)
+- Project (2)
   +- SortMergeJoin Inner (3)
      :- Project (4)
      :  +- BatchScan (5)
      +- Filter (6)
         :  +- Aggregate (7)
         :     +- Aggregate (8)
         :        +- Project (9)
         :           +- Filter (10)
         :              +- DataSourceV2ScanRelation (11)
         +- HashAggregate (12)
            +- HashAggregate (13)
               +- Project (14)
                  +- BatchScan (15)

(5) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "S_SUPPKEY", "S_NAME", "S_ADDRESS", "S_PHONE"
from TPCH.SUPPLIER 
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 10000000, bytes=680000000
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Query split by result rows
scn: 109978642

Splits:
offset = 0, numRows=2500000
offset = 2500000, numRows=2500000
offset = 5000000, numRows=2500000
offset = 7500000, numRows=-1
(15) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_SUPPKEY", "L_EXTENDEDPRICE", "L_DISCOUNT"
from TPCH.LINEITEM 
where (("L_SHIPDATE" >= TRUNC(TIMESTAMP '1996-01-01 08:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1996-04-01 08:00:00.000000')))
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 5000662, bytes=115015226
split target candidates:
name=LINEITEM, row_count=5000662, bytes=115015226, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=5000662, bytes=115015226, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 4999396, bytes=114986108
split target candidates:
name=LINEITEM, row_count=4999396, bytes=114986108, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=4999396, bytes=114986108, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 4999916, bytes=114998068
split target candidates:
name=LINEITEM, row_count=4999916, bytes=114998068, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=4999916, bytes=114998068, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//

Query q16

Spark SQL

select
	p_brand,
	p_type,
	p_size,
	count(distinct ps_suppkey) as supplier_cnt
from
	partsupp,
	part
where
	p_partkey = ps_partkey
	and p_brand <> 'Brand#45'
	and p_type not like 'MEDIUM POLISHED%'
	and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
	and ps_suppkey not in (
		select
			s_suppkey
		from
			supplier
		where
			s_comment like '%Customer%Complaints%'
	)
group by
	p_brand,
	p_type,
	p_size
order by
	supplier_cnt desc,
	p_brand,
	p_type,
	p_size

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "P_BRAND", "P_TYPE", "P_SIZE", COUNT(DISTINCT "PS_SUPPKEY") AS "supplier_cnt"
from TPCH.PARTSUPP  join TPCH.PART  on ("PS_PARTKEY" = "P_PARTKEY")
where ( "PS_SUPPKEY" NOT IN ( select "S_SUPPKEY"
from TPCH.SUPPLIER 
where  ("S_COMMENT" LIKE '%Customer%Complaints%' ESCAPE '\')  ) AND ((NOT("P_BRAND" = 'Brand#45') AND NOT("P_TYPE" LIKE CONCAT('MEDIUM POLISHED' , '%'))) AND "P_SIZE" IN ( 49.000000000000000000, 14.000000000000000000, 23.000000000000000000, 45.000000000000000000, 19.000000000000000000, 3.000000000000000000, 36.000000000000000000, 9.000000000000000000 )))
group by "P_BRAND", "P_TYPE", "P_SIZE"
order by "supplier_cnt" DESC NULLS LAST, "P_BRAND" ASC NULLS FIRST, "P_TYPE" ASC NULLS FIRST, "P_SIZE" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 15000, bytes=705000
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Query is not split

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "P_BRAND", "P_TYPE", "P_SIZE", COUNT(DISTINCT "PS_SUPPKEY") AS "supplier_cnt"
from TPCH.PARTSUPP  join TPCH.PART  on ("PS_PARTKEY" = "P_PARTKEY")
where ( "PS_SUPPKEY" NOT IN ( select "S_SUPPKEY"
from TPCH.SUPPLIER 
where  ("S_COMMENT" LIKE '%Customer%Complaints%' ESCAPE '\')  ) AND ((NOT("P_BRAND" = 'Brand#45') AND NOT("P_TYPE" LIKE CONCAT('MEDIUM POLISHED' , '%'))) AND "P_SIZE" IN ( 49.000000000000000000, 14.000000000000000000, 23.000000000000000000, 45.000000000000000000, 19.000000000000000000, 3.000000000000000000, 36.000000000000000000, 9.000000000000000000 )))
group by "P_BRAND", "P_TYPE", "P_SIZE"
order by "supplier_cnt" DESC NULLS LAST, "P_BRAND" ASC NULLS FIRST, "P_TYPE" ASC NULLS FIRST, "P_SIZE" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 15000, bytes=705000
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Query is not split

Query q17

Spark SQL

select
	sum(l_extendedprice) / 7.0 as avg_yearly
from
	lineitem,
	part
where
	p_partkey = l_partkey
	and p_brand = 'Brand#23'
	and p_container = 'MED BOX'
	and l_quantity < (
		select
			0.2 * avg(l_quantity)
		from
			lineitem
		where
			l_partkey = p_partkey
	)

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select (SUM("L_EXTENDEDPRICE") / 7.000000000000000000) AS "avg_yearly"
from TPCH.LINEITEM "sparkora_0" join TPCH.PART  on ("sparkora_0"."L_PARTKEY" = "P_PARTKEY") join ( select "1_sparkora", "L_PARTKEY"
from ( select (0.2000000000000000000000 * AVG("L_QUANTITY")) AS "1_sparkora", "L_PARTKEY"
from TPCH.LINEITEM 
group by "L_PARTKEY" ) 
where "1_sparkora" IS NOT NULL ) "sparkora_2" on (("P_PARTKEY" = "sparkora_2"."L_PARTKEY") AND (cast("L_QUANTITY" as NUMBER(38, 21)) < "1_sparkora"))
where (("P_BRAND" = 'Brand#23') AND ("P_CONTAINER" = 'MED BOX'))
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=90
Sharding details:
  shard instances cost=52603576, total query cost=81036995
  shard instances time(secs)=2056, total query time(secs)=3166
  num of shard queries in plan=2, joins in coordinator=true, table scans in coordinator=true
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

HashAggregate (1)
+- HashAggregate (2)
   +- Project (3)
      +- SortMergeJoin Inner (4)
         :- Project (5)
         :  +- BatchScan (6)
         +- Filter (7)
            +- HashAggregate (8)
               +- HashAggregate (9)
                  +- Project (10)
                     +- BatchScan (11)

(6) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_QUANTITY", "L_EXTENDEDPRICE", "P_PARTKEY"
from TPCH.LINEITEM  join TPCH.PART  on ("L_PARTKEY" = "P_PARTKEY")
where (("P_BRAND" = 'Brand#23') AND ("P_CONTAINER" = 'MED BOX'))
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1967806, bytes=80680046
split target candidates:
name=LINEITEM, row_count=2000264870, bytes=32004237920, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=2000264870, bytes=32004237920, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1967308, bytes=80659628
split target candidates:
name=LINEITEM, row_count=1999758476, bytes=31996135616, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1999758476, bytes=31996135616, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1967512, bytes=80667992
split target candidates:
name=LINEITEM, row_count=1999966363, bytes=31999461808, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1999966363, bytes=31999461808, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//
(11) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_PARTKEY", "L_QUANTITY"
from TPCH.LINEITEM 
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 2000264870, bytes=20002648700
split target candidates:
name=LINEITEM, row_count=2000264870, bytes=20002648700, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=2000264870, bytes=20002648700, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1999758476, bytes=19997584760
split target candidates:
name=LINEITEM, row_count=1999758476, bytes=19997584760, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1999758476, bytes=19997584760, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1999966363, bytes=19999663630
split target candidates:
name=LINEITEM, row_count=1999966363, bytes=19999663630, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1999966363, bytes=19999663630, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//

Query q18

Spark SQL

select
	c_name,
	c_custkey,
	o_orderkey,
	o_orderdate,
	o_totalprice,
	sum(l_quantity)
from
	customer,
	orders,
	lineitem
where
	o_orderkey in (
		select
			l_orderkey
		from
			lineitem
		group by
			l_orderkey having
				sum(l_quantity) > 300
	)
	and c_custkey = o_custkey
	and o_orderkey = l_orderkey
group by
	c_name,
	c_custkey,
	o_orderkey,
	o_orderdate,
	o_totalprice
order by
	o_totalprice desc,
	o_orderdate
limit 100

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "C_NAME", "C_CUSTKEY", "O_ORDERKEY", "O_ORDERDATE", "O_TOTALPRICE", "sum(l_quantity)"
from ( select "C_NAME", "C_CUSTKEY", "O_ORDERKEY", "O_ORDERDATE", "O_TOTALPRICE", SUM("L_QUANTITY") AS "sum(l_quantity)"
from TPCH.CUSTOMER  join TPCH.ORDERS  on ("C_CUSTKEY" = "O_CUSTKEY") join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY")
where ( "O_ORDERKEY" IN ( select "L_ORDERKEY"
from ( select "L_ORDERKEY", SUM("L_QUANTITY") AS "1_sparkora"
from TPCH.LINEITEM 
group by "L_ORDERKEY" ) 
where ("1_sparkora" IS NOT NULL AND ("1_sparkora" > 300.000000000000000000)) ) AND  "L_ORDERKEY" IN ( select "L_ORDERKEY"
from ( select "L_ORDERKEY", SUM("L_QUANTITY") AS "1_sparkora"
from TPCH.LINEITEM 
group by "L_ORDERKEY" ) 
where ("1_sparkora" IS NOT NULL AND ("1_sparkora" > 300.000000000000000000)) ))
group by "C_NAME", "C_CUSTKEY", "O_ORDERKEY", "O_ORDERDATE", "O_TOTALPRICE"
order by "O_TOTALPRICE" DESC NULLS LAST, "O_ORDERDATE" ASC NULLS FIRST ) 
where rownum <= 100
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 100, bytes=7500
Sharding details:
  shard instances cost=95429585, total query cost=140543135
  shard instances time(secs)=3730, total query time(secs)=5490
  num of shard queries in plan=3, joins in coordinator=true, table scans in coordinator=true
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

TakeOrderedAndProject (1)
+- Project (2)
   +- BatchScan (3)

(3) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "C_NAME", "C_CUSTKEY", "O_ORDERKEY", "O_ORDERDATE", "O_TOTALPRICE", SUM("L_QUANTITY") AS "sum(l_quantity)"
from TPCH.CUSTOMER  join TPCH.ORDERS  on ("C_CUSTKEY" = "O_CUSTKEY") join TPCH.LINEITEM  on ("O_ORDERKEY" = "L_ORDERKEY")
where ( "O_ORDERKEY" IN ( select "L_ORDERKEY"
from ( select "L_ORDERKEY", SUM("L_QUANTITY") AS "1_sparkora"
from TPCH.LINEITEM 
group by "L_ORDERKEY" ) 
where ("1_sparkora" IS NOT NULL AND ("1_sparkora" > 300.000000000000000000)) ) AND  "L_ORDERKEY" IN ( select "L_ORDERKEY"
from ( select "L_ORDERKEY", SUM("L_QUANTITY") AS "1_sparkora"
from TPCH.LINEITEM 
group by "L_ORDERKEY" ) 
where ("1_sparkora" IS NOT NULL AND ("1_sparkora" > 300.000000000000000000)) ))
group by "C_NAME", "C_CUSTKEY", "O_ORDERKEY", "O_ORDERDATE", "O_TOTALPRICE"
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 5000663, bytes=460060996
Query split by result rows
scn: 109978569

Splits:
offset = 0, numRows=1250166
offset = 1250166, numRows=1250166
offset = 2500332, numRows=1250166
offset = 3750498, numRows=-1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 5059643, bytes=465487156
Query split by result rows
scn: 109978882

Splits:
offset = 0, numRows=1264911
offset = 1264911, numRows=1264911
offset = 2529822, numRows=1264911
offset = 3794733, numRows=-1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 4999916, bytes=459992272
Query split by result rows
scn: 109978766

Splits:
offset = 0, numRows=1249979
offset = 1249979, numRows=1249979
offset = 2499958, numRows=1249979
offset = 3749937, numRows=-1

Query q19

Spark SQL

select
	sum(l_extendedprice* (1 - l_discount)) as revenue
from
	lineitem,
	part
where
	(
		p_partkey = l_partkey
		and p_brand = 'Brand#12'
		and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
		and l_quantity >= 1 and l_quantity <= 1 + 10
		and p_size between 1 and 5
		and l_shipmode in ('AIR', 'AIR REG')
		and l_shipinstruct = 'DELIVER IN PERSON'
	)
	or
	(
		p_partkey = l_partkey
		and p_brand = 'Brand#23'
		and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
		and l_quantity >= 10 and l_quantity <= 10 + 10
		and p_size between 1 and 10
		and l_shipmode in ('AIR', 'AIR REG')
		and l_shipinstruct = 'DELIVER IN PERSON'
	)
	or
	(
		p_partkey = l_partkey
		and p_brand = 'Brand#34'
		and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
		and l_quantity >= 20 and l_quantity <= 20 + 10
		and p_size between 1 and 15
		and l_shipmode in ('AIR', 'AIR REG')
		and l_shipinstruct = 'DELIVER IN PERSON'
	)

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "revenue"
from TPCH.LINEITEM  join ( select "P_PARTKEY", "P_BRAND", "P_SIZE", "P_CONTAINER"
from TPCH.PART 
where (("P_SIZE" >= 1.000000000000000000) AND ((((("P_BRAND" = 'Brand#12') AND "P_CONTAINER" IN ( 'SM CASE', 'SM BOX', 'SM PACK', 'SM PKG' )) AND ("P_SIZE" <= 5.000000000000000000)) OR ((("P_BRAND" = 'Brand#23') AND "P_CONTAINER" IN ( 'MED BAG', 'MED BOX', 'MED PKG', 'MED PACK' )) AND ("P_SIZE" <= 10.000000000000000000))) OR ((("P_BRAND" = 'Brand#34') AND "P_CONTAINER" IN ( 'LG CASE', 'LG BOX', 'LG PACK', 'LG PKG' )) AND ("P_SIZE" <= 15.000000000000000000)))) )  on (("L_PARTKEY" = "P_PARTKEY") AND ((((((("P_BRAND" = 'Brand#12') AND "P_CONTAINER" IN ( 'SM CASE', 'SM BOX', 'SM PACK', 'SM PKG' )) AND ("L_QUANTITY" >= 1.000000000000000000)) AND ("L_QUANTITY" <= 11.000000000000000000)) AND ("P_SIZE" <= 5.000000000000000000)) OR ((((("P_BRAND" = 'Brand#23') AND "P_CONTAINER" IN ( 'MED BAG', 'MED BOX', 'MED PKG', 'MED PACK' )) AND ("L_QUANTITY" >= 10.000000000000000000)) AND ("L_QUANTITY" <= 20.000000000000000000)) AND ("P_SIZE" <= 10.000000000000000000))) OR ((((("P_BRAND" = 'Brand#34') AND "P_CONTAINER" IN ( 'LG CASE', 'LG BOX', 'LG PACK', 'LG PKG' )) AND ("L_QUANTITY" >= 20.000000000000000000)) AND ("L_QUANTITY" <= 30.000000000000000000)) AND ("P_SIZE" <= 15.000000000000000000))))
where (("L_SHIPMODE" IN ( 'AIR', 'AIR REG' ) AND ("L_SHIPINSTRUCT" = 'DELIVER IN PERSON')) AND (((("L_QUANTITY" >= 1.000000000000000000) AND ("L_QUANTITY" <= 11.000000000000000000)) OR (("L_QUANTITY" >= 10.000000000000000000) AND ("L_QUANTITY" <= 20.000000000000000000))) OR (("L_QUANTITY" >= 20.000000000000000000) AND ("L_QUANTITY" <= 30.000000000000000000))))
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=13
Sharding details:
  shard instances cost=27269124, total query cost=27269124
  shard instances time(secs)=1066, total query time(secs)=1066
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select SUM(("L_EXTENDEDPRICE" * cast((1.00000000000000000 - cast("L_DISCOUNT" as NUMBER(38, 17))) as NUMBER(38, 18)))) AS "revenue"
from TPCH.LINEITEM  join ( select "P_PARTKEY", "P_BRAND", "P_SIZE", "P_CONTAINER"
from TPCH.PART 
where (("P_SIZE" >= 1.000000000000000000) AND ((((("P_BRAND" = 'Brand#12') AND "P_CONTAINER" IN ( 'SM CASE', 'SM BOX', 'SM PACK', 'SM PKG' )) AND ("P_SIZE" <= 5.000000000000000000)) OR ((("P_BRAND" = 'Brand#23') AND "P_CONTAINER" IN ( 'MED BAG', 'MED BOX', 'MED PKG', 'MED PACK' )) AND ("P_SIZE" <= 10.000000000000000000))) OR ((("P_BRAND" = 'Brand#34') AND "P_CONTAINER" IN ( 'LG CASE', 'LG BOX', 'LG PACK', 'LG PKG' )) AND ("P_SIZE" <= 15.000000000000000000)))) )  on (("L_PARTKEY" = "P_PARTKEY") AND ((((((("P_BRAND" = 'Brand#12') AND "P_CONTAINER" IN ( 'SM CASE', 'SM BOX', 'SM PACK', 'SM PKG' )) AND ("L_QUANTITY" >= 1.000000000000000000)) AND ("L_QUANTITY" <= 11.000000000000000000)) AND ("P_SIZE" <= 5.000000000000000000)) OR ((((("P_BRAND" = 'Brand#23') AND "P_CONTAINER" IN ( 'MED BAG', 'MED BOX', 'MED PKG', 'MED PACK' )) AND ("L_QUANTITY" >= 10.000000000000000000)) AND ("L_QUANTITY" <= 20.000000000000000000)) AND ("P_SIZE" <= 10.000000000000000000))) OR ((((("P_BRAND" = 'Brand#34') AND "P_CONTAINER" IN ( 'LG CASE', 'LG BOX', 'LG PACK', 'LG PKG' )) AND ("L_QUANTITY" >= 20.000000000000000000)) AND ("L_QUANTITY" <= 30.000000000000000000)) AND ("P_SIZE" <= 15.000000000000000000))))
where (("L_SHIPMODE" IN ( 'AIR', 'AIR REG' ) AND ("L_SHIPINSTRUCT" = 'DELIVER IN PERSON')) AND (((("L_QUANTITY" >= 1.000000000000000000) AND ("L_QUANTITY" <= 11.000000000000000000)) OR (("L_QUANTITY" >= 10.000000000000000000) AND ("L_QUANTITY" <= 20.000000000000000000))) OR (("L_QUANTITY" >= 20.000000000000000000) AND ("L_QUANTITY" <= 30.000000000000000000))))
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=13
Sharding details:
  shard instances cost=27269124, total query cost=27269124
  shard instances time(secs)=1066, total query time(secs)=1066
  num of shard queries in plan=1, joins in coordinator=false, table scans in coordinator=false
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Query q20

Spark SQL

select
	s_name,
	s_address
from
	supplier,
	nation
where
	s_suppkey in (
		select
			ps_suppkey
		from
			partsupp
		where
			ps_partkey in (
				select
					p_partkey
				from
					part
				where
					p_name like 'forest%'
			)
			and ps_availqty > (
				select
					0.5 * sum(l_quantity)
				from
					lineitem
				where
					l_partkey = ps_partkey
					and l_suppkey = ps_suppkey
					and l_shipdate >= date '1994-01-01'
					and l_shipdate < date '1994-01-01' + interval '1' year
			)
	)
	and s_nationkey = n_nationkey
	and n_name = 'CANADA'
order by
	s_name

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "S_NAME", "S_ADDRESS"
from TPCH.SUPPLIER  join TPCH.NATION  on ("S_NATIONKEY" = "N_NATIONKEY")
where ( "S_SUPPKEY" IN ( select "PS_SUPPKEY"
from TPCH.PARTSUPP  join ( select "1_sparkora", "L_PARTKEY", "L_SUPPKEY"
from ( select (0.500000000000000000 * SUM("L_QUANTITY")) AS "1_sparkora", "L_PARTKEY", "L_SUPPKEY"
from TPCH.LINEITEM 
where ((("L_SHIPDATE" >= TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000'))) AND  "L_PARTKEY" IN ( select "P_PARTKEY"
from TPCH.PART 
where ("P_NAME" LIKE CONCAT('forest' , '%')) ))
group by "L_PARTKEY", "L_SUPPKEY" ) 
where "1_sparkora" IS NOT NULL )  on ((("PS_PARTKEY" = "L_PARTKEY") AND ("PS_SUPPKEY" = "L_SUPPKEY")) AND ("PS_AVAILQTY" > cast("1_sparkora" as NUMBER(38, 18))))
where  "PS_PARTKEY" IN ( select "P_PARTKEY"
from TPCH.PART 
where ("P_NAME" LIKE CONCAT('forest' , '%')) ) ) AND ("N_NAME" = 'CANADA'))
order by "S_NAME" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 4, bytes=292
Sharding details:
  shard instances cost=26327589, total query cost=34022376
  shard instances time(secs)=1029, total query time(secs)=1329
  num of shard queries in plan=1, joins in coordinator=true, table scans in coordinator=true
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Sort (1)
+- Project (2)
   +- SortMergeJoin Inner (3)
      :- Project (4)
      :  +- SortMergeJoin LeftSemi (5)
      :     :- Project (6)
      :     :  +- BatchScan (7)
      :     +- Project (8)
      :        +- SortMergeJoin Inner (9)
      :           :- Project (10)
      :           :  +- BatchScan (11)
      :           +- Filter (12)
      :              +- HashAggregate (13)
      :                 +- HashAggregate (14)
      :                    +- Project (15)
      :                       +- BatchScan (16)
      +- Project (17)
         +- BatchScan (18)

(7) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "S_SUPPKEY", "S_NAME", "S_ADDRESS", "S_NATIONKEY"
from TPCH.SUPPLIER 
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 10000000, bytes=550000000
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Query split by result rows
scn: 109978894

Splits:
offset = 0, numRows=2500000
offset = 2500000, numRows=2500000
offset = 5000000, numRows=2500000
offset = 7500000, numRows=-1
(11) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "PS_PARTKEY", "PS_SUPPKEY", "PS_AVAILQTY"
from TPCH.PARTSUPP 
where  "PS_PARTKEY" IN ( select "P_PARTKEY"
from TPCH.PART 
where ("P_NAME" LIKE CONCAT('forest' , '%')) )
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 40333822, bytes=2339361676
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Query split by result rows
scn: 109978578

Splits:
offset = 0, numRows=10083456
offset = 10083456, numRows=10083456
offset = 20166912, numRows=10083456
offset = 30250368, numRows=-1
(16) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_PARTKEY", "L_SUPPKEY", "L_QUANTITY"
from TPCH.LINEITEM 
where ((("L_SHIPDATE" >= TRUNC(TIMESTAMP '1994-01-01 08:00:00.000000')) AND ("L_SHIPDATE" < TRUNC(TIMESTAMP '1995-01-01 08:00:00.000000'))) AND  "L_PARTKEY" IN ( select "P_PARTKEY"
from TPCH.PART 
where ("P_NAME" LIKE CONCAT('forest' , '%')) ))
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 5000662, bytes=325043030
Query split by result rows
scn: 109978588

Splits:
offset = 0, numRows=1250166
offset = 1250166, numRows=1250166
offset = 2500332, numRows=1250166
offset = 3750498, numRows=-1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 4999396, bytes=324960740
Query split by result rows
scn: 109978900

Splits:
offset = 0, numRows=1249849
offset = 1249849, numRows=1249849
offset = 2499698, numRows=1249849
offset = 3749547, numRows=-1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 4999916, bytes=324994540
Query split by result rows
scn: 109978775

Splits:
offset = 0, numRows=1249979
offset = 1249979, numRows=1249979
offset = 2499958, numRows=1249979
offset = 3749937, numRows=-1
(18) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "N_NATIONKEY"
from TPCH.NATION 
where ("N_NAME" = 'CANADA')
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=12
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Query is not split

Query q21

Spark SQL

select
	s_name,
	count(*) as numwait
from
	supplier,
	lineitem l1,
	orders,
	nation
where
	s_suppkey = l1.l_suppkey
	and o_orderkey = l1.l_orderkey
	and o_orderstatus = 'F'
	and l1.l_receiptdate > l1.l_commitdate
	and exists (
		select
			*
		from
			lineitem l2
		where
			l2.l_orderkey = l1.l_orderkey
			and l2.l_suppkey <> l1.l_suppkey
	)
	and not exists (
		select
			*
		from
			lineitem l3
		where
			l3.l_orderkey = l1.l_orderkey
			and l3.l_suppkey <> l1.l_suppkey
			and l3.l_receiptdate > l3.l_commitdate
	)
	and s_nationkey = n_nationkey
	and n_name = 'SAUDI ARABIA'
group by
	s_name
order by
	numwait desc,
	s_name
limit 100

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

TakeOrderedAndProject (1)
+- HashAggregate (2)
   +- HashAggregate (3)
      +- Project (4)
         +- SortMergeJoin Inner (5)
            :- Project (6)
            :  +- SortMergeJoin Inner (7)
            :     :- Project (8)
            :     :  +- SortMergeJoin Inner (9)
            :     :     :- Project (10)
            :     :     :  +- BatchScan (11)
            :     :     +- SortMergeJoin LeftAnti (12)
            :     :        :- SortMergeJoin LeftSemi (13)
            :     :        :  :- Project (14)
            :     :        :  :  +- BatchScan (15)
            :     :        :  +- Project (16)
            :     :        :     +- BatchScan (17)
            :     :        +- Project (18)
            :     :           +- BatchScan (19)
            :     +- Project (20)
            :        +- BatchScan (21)
            +- Project (22)
               +- BatchScan (23)

(11) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "S_SUPPKEY", "S_NAME", "S_NATIONKEY"
from TPCH.SUPPLIER 
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 10000000, bytes=280000000
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Query split by result rows
scn: 109978973

Splits:
offset = 0, numRows=2500000
offset = 2500000, numRows=2500000
offset = 5000000, numRows=2500000
offset = 7500000, numRows=-1
(15) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_ORDERKEY", "L_SUPPKEY"
from TPCH.LINEITEM 
where ("L_RECEIPTDATE" > "L_COMMITDATE")
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013832879, bytes=29401153491
split target candidates:
name=LINEITEM, row_count=1013832879, bytes=29401153491, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013832879, bytes=29401153491, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013576214, bytes=29393710206
split target candidates:
name=LINEITEM, row_count=1013576214, bytes=29393710206, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013576214, bytes=29393710206, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013681581, bytes=29396765849
split target candidates:
name=LINEITEM, row_count=1013681581, bytes=29396765849, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013681581, bytes=29396765849, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//
(17) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_ORDERKEY", "L_SUPPKEY"
from TPCH.LINEITEM 
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 2000264870, bytes=26003443310
split target candidates:
name=LINEITEM, row_count=2000264870, bytes=26003443310, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=2000264870, bytes=26003443310, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1999758476, bytes=25996860188
split target candidates:
name=LINEITEM, row_count=1999758476, bytes=25996860188, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1999758476, bytes=25996860188, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1999966363, bytes=25999562719
split target candidates:
name=LINEITEM, row_count=1999966363, bytes=25999562719, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1999966363, bytes=25999562719, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//
(19) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_ORDERKEY", "L_SUPPKEY"
from TPCH.LINEITEM 
where ("L_RECEIPTDATE" > "L_COMMITDATE")
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013832879, bytes=29401153491
split target candidates:
name=LINEITEM, row_count=1013832879, bytes=29401153491, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013832879, bytes=29401153491, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013576214, bytes=29393710206
split target candidates:
name=LINEITEM, row_count=1013576214, bytes=29393710206, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013576214, bytes=29393710206, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013681581, bytes=29396765849
split target candidates:
name=LINEITEM, row_count=1013681581, bytes=29396765849, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013681581, bytes=29396765849, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//
(21) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "O_ORDERKEY"
from TPCH.ORDERS 
where ("O_ORDERSTATUS" = 'F')
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 166682513, bytes=1500142617
split target candidates:
name=ORDERS, row_count=166682513, bytes=1500142617, partitions=(1, 90)
Query split by partitions

Target table:
name=ORDERS, row_count=166682513, bytes=1500142617, partitions=(1, 90)
Splits:
partitions = ORDERS_P1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 166650560, bytes=1499855040
split target candidates:
name=ORDERS, row_count=166650560, bytes=1499855040, partitions=(1, 90)
Query split by partitions

Target table:
name=ORDERS, row_count=166650560, bytes=1499855040, partitions=(1, 90)
Splits:
partitions = ORDERS_P1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 166666927, bytes=1500002343
split target candidates:
name=ORDERS, row_count=166666927, bytes=1500002343, partitions=(1, 90)
Query split by partitions

Target table:
name=ORDERS, row_count=166666927, bytes=1500002343, partitions=(1, 90)
Splits:
partitions = ORDERS_P1
(23) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "N_NATIONKEY"
from TPCH.NATION 
where ("N_NAME" = 'SAUDI ARABIA')
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=12
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Query is not split

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

TakeOrderedAndProject (1)
+- HashAggregate (2)
   +- HashAggregate (3)
      +- Project (4)
         +- SortMergeJoin Inner (5)
            :- Project (6)
            :  +- SortMergeJoin Inner (7)
            :     :- Project (8)
            :     :  +- SortMergeJoin Inner (9)
            :     :     :- Project (10)
            :     :     :  +- BatchScan (11)
            :     :     +- SortMergeJoin LeftAnti (12)
            :     :        :- SortMergeJoin LeftSemi (13)
            :     :        :  :- Project (14)
            :     :        :  :  +- BatchScan (15)
            :     :        :  +- Project (16)
            :     :        :     +- BatchScan (17)
            :     :        +- Project (18)
            :     :           +- BatchScan (19)
            :     +- Project (20)
            :        +- BatchScan (21)
            +- Project (22)
               +- BatchScan (23)

(11) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "S_SUPPKEY", "S_NAME", "S_NATIONKEY"
from TPCH.SUPPLIER 
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 10000000, bytes=280000000
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Query split by result rows
scn: 109979307

Splits:
offset = 0, numRows=2500000
offset = 2500000, numRows=2500000
offset = 5000000, numRows=2500000
offset = 7500000, numRows=-1
(15) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_ORDERKEY", "L_SUPPKEY"
from TPCH.LINEITEM 
where ("L_RECEIPTDATE" > "L_COMMITDATE")
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013832879, bytes=29401153491
split target candidates:
name=LINEITEM, row_count=1013832879, bytes=29401153491, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013832879, bytes=29401153491, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013576214, bytes=29393710206
split target candidates:
name=LINEITEM, row_count=1013576214, bytes=29393710206, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013576214, bytes=29393710206, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013681581, bytes=29396765849
split target candidates:
name=LINEITEM, row_count=1013681581, bytes=29396765849, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013681581, bytes=29396765849, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//
(17) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_ORDERKEY", "L_SUPPKEY"
from TPCH.LINEITEM 
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 2000264870, bytes=26003443310
split target candidates:
name=LINEITEM, row_count=2000264870, bytes=26003443310, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=2000264870, bytes=26003443310, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1999758476, bytes=25996860188
split target candidates:
name=LINEITEM, row_count=1999758476, bytes=25996860188, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1999758476, bytes=25996860188, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1999966363, bytes=25999562719
split target candidates:
name=LINEITEM, row_count=1999966363, bytes=25999562719, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1999966363, bytes=25999562719, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//
(19) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "L_ORDERKEY", "L_SUPPKEY"
from TPCH.LINEITEM 
where ("L_RECEIPTDATE" > "L_COMMITDATE")
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013832879, bytes=29401153491
split target candidates:
name=LINEITEM, row_count=1013832879, bytes=29401153491, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013832879, bytes=29401153491, partitions=(1, 90)
Splits:
start-rowId = AAD0MQAAAAABOKAAAA, end-rowId=AAD0MmAAAAAC6sMH//
start-rowId = AAD0MmAAAAAC6sNAAA, end-rowId=AAD0MtAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013576214, bytes=29393710206
split target candidates:
name=LINEITEM, row_count=1013576214, bytes=29393710206, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013576214, bytes=29393710206, partitions=(1, 90)
Splits:
start-rowId = AAHoVuAAAAABOqAAAA, end-rowId=AAHoWEAAAAAC6qkH//
start-rowId = AAHoWEAAAAAC6qlAAA, end-rowId=AAHoWLAAAAAFX//H//

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1013681581, bytes=29396765849
split target candidates:
name=LINEITEM, row_count=1013681581, bytes=29396765849, partitions=(1, 90)
Query split by row ids
Target table:
name=LINEITEM, row_count=1013681581, bytes=29396765849, partitions=(1, 90)
Splits:
start-rowId = AALcfMAAAAABOqAAAA, end-rowId=AALcfiAAAAAC6sPH//
start-rowId = AALcfiAAAAAC6sQAAA, end-rowId=AALcfpAAAAAFX//H//
(21) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "O_ORDERKEY"
from TPCH.ORDERS 
where ("O_ORDERSTATUS" = 'F')
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 166682513, bytes=1500142617
split target candidates:
name=ORDERS, row_count=166682513, bytes=1500142617, partitions=(1, 90)
Query split by partitions

Target table:
name=ORDERS, row_count=166682513, bytes=1500142617, partitions=(1, 90)
Splits:
partitions = ORDERS_P1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 166650560, bytes=1499855040
split target candidates:
name=ORDERS, row_count=166650560, bytes=1499855040, partitions=(1, 90)
Query split by partitions

Target table:
name=ORDERS, row_count=166650560, bytes=1499855040, partitions=(1, 90)
Splits:
partitions = ORDERS_P1

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 166666927, bytes=1500002343
split target candidates:
name=ORDERS, row_count=166666927, bytes=1500002343, partitions=(1, 90)
Query split by partitions

Target table:
name=ORDERS, row_count=166666927, bytes=1500002343, partitions=(1, 90)
Splits:
partitions = ORDERS_P1
(23) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "N_NATIONKEY"
from TPCH.NATION 
where ("N_NAME" = 'SAUDI ARABIA')
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=12
Pushdown Oracle SQL, Query Splitting details:
Replicated Query is run on instance: DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch)
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Query is not split

Query q22

Spark SQL

select
	cntrycode,
	count(*) as numcust,
	sum(c_acctbal) as totacctbal
from
	(
		select
			substring(c_phone, 1, 2) as cntrycode,
			c_acctbal
		from
			customer
		where
			substring(c_phone, 1, 2) in
				('13', '31', '23', '29', '30', '18', '17')
			and c_acctbal > (
				select
					avg(c_acctbal)
				from
					customer
				where
					c_acctbal > 0.00
					and substring(c_phone, 1, 2) in
						('13', '31', '23', '29', '30', '18', '17')
			)
			and not exists (
				select
					*
				from
					orders
				where
					o_custkey = c_custkey
			)
	) as custsale
group by
	cntrycode
order by
	cntrycode

Spark Plan with Pushdown turned off

read execution

Spark Plan with Pushdown turned on, Sharding pushdown off

read execution

Project (1)
+- BatchScan (2)

(2) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select "cntrycode", COUNT(1) AS "numcust", SUM("C_ACCTBAL") AS "totacctbal"
from ( select SUBSTR("C_PHONE" , 1 , 2) AS "cntrycode", "C_ACCTBAL"
from TPCH.CUSTOMER "sparkora_0"
where ((SUBSTR("C_PHONE" , 1 , 2) IN ( '13', '31', '23', '29', '30', '18', '17' ) AND (cast("C_ACCTBAL" as NUMBER(38, 22)) >  ( select AVG("C_ACCTBAL") AS "avg(c_acctbal)"
from TPCH.CUSTOMER 
where (SUBSTR("C_PHONE" , 1 , 2) IN ( '13', '31', '23', '29', '30', '18', '17' ) AND ("C_ACCTBAL" > 0E-18)) ))) AND not exists ( select 1
from TPCH.ORDERS 
where ("sparkora_0"."C_CUSTKEY" = "O_CUSTKEY") )) ) 
group by "cntrycode"
order by "cntrycode" ASC NULLS FIRST
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 1, bytes=36
Sharding details:
  shard instances cost=6319547, total query cost=9546516
  shard instances time(secs)=247, total query time(secs)=373
  num of shard queries in plan=1, joins in coordinator=true, table scans in coordinator=true
Pushdown Oracle SQL, Query Splitting details:
Query executed on Shard Coordinator

Spark Plan with Pushdown turned on, Sharding pushdown on

read execution

Sort (1)
+- HashAggregate (2)
   +- HashAggregate (3)
      +- Project (4)
         +- BatchScan (5)

(5) BatchScan
Oracle Instance:
   DataSourceKey(jdbc:oracle:thin:@denad189:1521/cat.us.oracle.com,tpch)
Pushdown Oracle SQL:
select SUBSTR("C_PHONE" , 1 , 2) AS "cntrycode", "C_ACCTBAL"
from TPCH.CUSTOMER "sparkora_0"
where ((SUBSTR("C_PHONE" , 1 , 2) IN ( '13', '31', '23', '29', '30', '18', '17' ) AND (cast("C_ACCTBAL" as NUMBER(38, 22)) >  ( select AVG("C_ACCTBAL") AS "avg(c_acctbal)"
from TPCH.CUSTOMER 
where (SUBSTR("C_PHONE" , 1 , 2) IN ( '13', '31', '23', '29', '30', '18', '17' ) AND ("C_ACCTBAL" > 0E-18)) ))) AND not exists ( select 1
from TPCH.ORDERS 
where ("sparkora_0"."C_CUSTKEY" = "O_CUSTKEY") ))
Pushdown Oracle SQL, Query Splitting details:
Sharded Query is run on 3 instances
Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad190:1521/sh1.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 5095, bytes=183420
Query is not split

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad191:1521/sh2.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 5095, bytes=183420
Query is not split

Shard Instance DataSourceKey(jdbc:oracle:thin:@//denad192:1521/sh3.us.oracle.com,tpch) Split Strategy:
Pushdown Oracle SQL, oracle plan stats estimates:
rowCount = 5095, bytes=183420
Query is not split

Clone this wiki locally