Skip to content

Latest commit

 

History

History
46 lines (38 loc) · 2.7 KB

ExplainCostsRouter.md

File metadata and controls

46 lines (38 loc) · 2.7 KB

ExplainCostsRouter

This router executes an explain {query} EXPLAIN query for every incoming query. Trino will respond with an resource estimation the query will consume. Please note that this functional heavily depends on Table statistics being present for the access tables to get meaningful estimations.

For this to work trino-lb executes explain (format json) {query} and sums up all the resource estimations of the children stages. This is a very simplistic model and will likely be improved in the future - we are happy about any suggestions on how the estimates should be calculated the best in the tracking issue.

After trino-lb got the query estimation, it walks a list of resource buckets you can specify top to bottom and picks the first one that fulfills all resource requirements (CPU, memory, Network traffic etc.). If no bucket matches the router will not a make a decision and let the routers further down the chain decide.

Warning

Please keep in mind that trino-lb will determine the target cluster group for every incoming query instantly (otherwise it does not know if it should queued or hand over the query). In case a Trino client submits many queries at once this will result in the same number of explain queries on the Trino used for the query estimations. There is a issue to address this, however until this is resolved it is recommended to have the ExplainCostsRouter near the end of the chain and try to classify the queries with a different router. However, this is only important if you have more queries/s than a Trino cluster can run explain queries for (which hopefully is pretty much).

Configuration

With this words of caution, lets jump into the configuration.

You need to specify the cluster that executes the explain queries. Currently only password based authentication is supported.

routers:
  - explainCosts:
      trinoClusterToRunExplainQuery:
        endpoint: https://trino-coordinator-default.default.svc.cluster.local:8443
        ignoreCert: true # optional, defaults to false
        username: admin
        password: adminadmin
      targets:
        - cpuCost: 5E+9
          memoryCost: 5E9 # 5GB
          networkCost: 10E9 # 10GB
          outputRowCount: 1E6
          outputSizeInBytes: 1E9 # 1GB
          trinoClusterGroup: s
        - cpuCost: 5E+12
          memoryCost: 5E12 # 5TB
          networkCost: 5E12 # 5TB
          outputRowCount: 1E9
          outputSizeInBytes: 5E12 # 5TB
          trinoClusterGroup: m