
Optimizer does not successfully time out when query has a limit #1414

Open
jkosh44 opened this issue Dec 24, 2020 · 3 comments
Labels
bug Something isn't working (correctness). Mark issues with this.

Comments

@jkosh44
Contributor

jkosh44 commented Dec 24, 2020

Bug Report

Summary

When a query has a complicated join and a LIMIT clause, the optimizer fails to time out. Suppose that we have the tables from this DDL: https://github.com/oltpbenchmark/oltpbench/blob/master/src/com/oltpbenchmark/benchmarks/tpch/ddls/tpch-postgres-ddl.sql. If we run the following query:

select
        s_acctbal
from
        part,
        partsupp,
        supplier,
        nation,
        region,
        customer,
        orders,
        lineitem
where
        p_partkey = ps_partkey
        and ps_suppkey = s_suppkey 
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and n_nationkey = c_nationkey
        and c_custkey = o_custkey
        and o_orderkey = l_orderkey

Then we should expect there to be so many join orderings that the optimizer times out. In fact, if you run that query, it does time out, returns a result, and prints the following in the console: [2020-12-23 22:05:03.397] [optimizer_logger] [warning] Optimize Loop ended prematurely: Optimizer task execution timed out

However, if we run the following query (the only difference is the added LIMIT clause):

select
        s_acctbal
from
        part,
        partsupp,
        supplier,
        nation,
        region,
        customer,
        orders,
        lineitem
where
        p_partkey = ps_partkey
        and ps_suppkey = s_suppkey 
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and n_nationkey = c_nationkey
        and c_custkey = o_custkey
        and o_orderkey = l_orderkey
LIMIT 100;

Then the optimizer WILL NOT time out.

The offending line is right here:

if (elapsed_time >= task_execution_timeout_ && root_group->HasExpressions(required_props)) {
  throw OPTIMIZER_EXCEPTION("Optimizer task execution timed out");
}

Although elapsed_time grows much larger than task_execution_timeout_, root_group->HasExpressions(required_props) returns false. The code for HasExpressions is here:
bool Group::HasExpressions(PropertySet *properties) const {
  const auto &it = lowest_cost_expressions_.find(properties);
  return (it != lowest_cost_expressions_.end());
}

For this query, lowest_cost_expressions_ is empty, so HasExpressions always returns false. For the original query, lowest_cost_expressions_ is not empty and contains some result. For both queries required_props is empty, which is a little weird to me, but I don't really know what required_props is, so this might be fine.
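To illustrate, here is a minimal self-contained sketch of the guard's behavior (Group and PropertySet here are simplified stand-ins, not the real classes): with an empty lowest_cost_expressions_ map, the second half of the condition never becomes true, so the exception is never thrown no matter how large elapsed_time gets.

#include <chrono>
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>

struct PropertySet {};  // stand-in; the real class hashes/compares property sets

struct Group {
  // Maps a property set to the cheapest costed expression found so far
  // (simplified to a pointer-keyed map for illustration).
  std::map<const PropertySet *, std::string> lowest_cost_expressions_;

  bool HasExpressions(const PropertySet *properties) const {
    return lowest_cost_expressions_.find(properties) != lowest_cost_expressions_.end();
  }
};

int main() {
  PropertySet required_props;
  Group root_group;  // no costed expression has been produced for the root yet

  const auto task_execution_timeout = std::chrono::milliseconds(5000);
  const auto elapsed_time = std::chrono::milliseconds(60000);  // far past the budget

  // Mirrors the guard shown above: BOTH conditions must hold for the exception to be thrown.
  if (elapsed_time >= task_execution_timeout && root_group.HasExpressions(&required_props)) {
    throw std::runtime_error("Optimizer task execution timed out");
  }
  std::cout << "No timeout: the root group has no costed expressions yet\n";
  return 0;
}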

I don't know enough about the optimizer to know exactly why this is happening, but my guess, at a very high conceptual level, is that generating a physical plan for the LIMIT happens after we enumerate the join orderings. Since we never get a physical plan for the LIMIT, we can't time out during the join enumeration.

It's very likely that #1229 is suffering from this.

Environment

OS: Ubuntu (LTS) 20.04

Compiler: GCC 7.0+

CMake Profile: Debug

Jenkins/CI: N/A

Steps to Reproduce

  1. Compile with the following args: -DCMAKE_BUILD_TYPE=Debug -DTERRIER_USE_ASAN=ON
  2. Run terrier with parallel execution turned off: terrier -parallel_execution=false
  3. Load the tables from the following DDL: https://github.com/oltpbenchmark/oltpbench/blob/master/src/com/oltpbenchmark/benchmarks/tpch/ddls/tpch-postgres-ddl.sql
  4. Run:
select
        s_acctbal
from
        part,
        partsupp,
        supplier,
        nation,
        region,
        customer,
        orders,
        lineitem
where
        p_partkey = ps_partkey
        and ps_suppkey = s_suppkey 
        and s_nationkey = n_nationkey
        and n_regionkey = r_regionkey
        and n_nationkey = c_nationkey
        and c_custkey = o_custkey
        and o_orderkey = l_orderkey
LIMIT 100;

Expected Behavior

noisepage=# select
noisepage-#         s_acctbal
noisepage-# from
noisepage-#         part,
noisepage-#         partsupp,
noisepage-#         supplier,
noisepage-#         nation,
noisepage-#         region,
noisepage-#         customer,
noisepage-#         orders,
noisepage-#         lineitem
noisepage-# where
noisepage-#         p_partkey = ps_partkey
noisepage-#         and ps_suppkey = s_suppkey 
noisepage-#         and s_nationkey = n_nationkey
noisepage-#         and n_regionkey = r_regionkey
noisepage-#         and n_nationkey = c_nationkey
noisepage-#         and c_custkey = o_custkey
noisepage-#         and o_orderkey = l_orderkey
noisepage-# limit 100;
 s_acctbal 
-----------
(0 rows)

Actual Behavior

noisepage=# select
noisepage-#         s_acctbal
noisepage-# from
noisepage-#         part,
noisepage-#         partsupp,
noisepage-#         supplier,
noisepage-#         nation,
noisepage-#         region,
noisepage-#         customer,
noisepage-#         orders,
noisepage-#         lineitem
noisepage-# where
noisepage-#         p_partkey = ps_partkey
noisepage-#         and ps_suppkey = s_suppkey 
noisepage-#         and s_nationkey = n_nationkey
noisepage-#         and n_regionkey = r_regionkey
noisepage-#         and n_nationkey = c_nationkey
noisepage-#         and c_custkey = o_custkey
noisepage-#         and o_orderkey = l_orderkey
noisepage-# limit 100;

This keeps running for a long time without ever timing out. I've never stuck around long enough to confirm whether it ever finishes, but I assume it eventually does. As you remove more and more tables from the FROM clause, it starts to terminate and execute faster.

@jkosh44
Contributor Author

jkosh44 commented Dec 24, 2020

So it seems like my guess was probably somewhat right. In order to time out, we need some low-cost expression for the root group of our op tree. When we don't have the LIMIT clause, the root is a LogicalInnerJoin. When we do have the LIMIT clause, the root is a LogicalLimit. If we look at the task that actually adds costed expressions, it's right here:

//===--------------------------------------------------------------------===//
// OptimizeExpressionCostWithEnforcedProperty
//===--------------------------------------------------------------------===//
void OptimizeExpressionCostWithEnforcedProperty::Execute() {
  // Init logic: only run once per task
  OPTIMIZER_LOG_TRACE("OptimizeExpressionCostWithEnforcedProperty::Execute() ");
  if (cur_child_idx_ == -1) {
    // TODO(patrick):
    // 1. We can init input cost using non-zero value for pruning
    // 2. We can calculate the current operator cost if we have maintain
    //    logical properties in group (e.g. stats, schema, cardinality)
    cur_total_cost_ = 0;
    // Pruning
    if (cur_total_cost_ > context_->GetCostUpperBound()) return;
    // Derive output and input properties
    ChildPropertyDeriver prop_deriver;
    output_input_properties_ = prop_deriver.GetProperties(context_->GetOptimizerContext()->GetCatalogAccessor(),
                                                          &context_->GetOptimizerContext()->GetMemo(),
                                                          context_->GetRequiredProperties(), group_expr_);
    cur_child_idx_ = 0;
    // TODO(patrick/boweic): If later on we support properties that may not be enforced in some
    // cases, we can check whether it is the case here to do the pruning
  }
  // Loop over (output prop, input props) pair for the GroupExpression being optimized
  // (1) Cost children (if needed); or pick the best child expression (in terms of cost)
  // (2) Enforce any missing properties as required
  // (3) Update Group/Context metadata of expression + cost
  for (; cur_prop_pair_idx_ < static_cast<int>(output_input_properties_.size()); cur_prop_pair_idx_++) {
    auto &output_prop = output_input_properties_[cur_prop_pair_idx_].first;
    auto &input_props = output_input_properties_[cur_prop_pair_idx_].second;
    // Calculate local cost and update total cost
    if (cur_child_idx_ == 0) {
      // Compute the cost of the root operator
      // 1. Collect stats needed and cache them in the group
      // 2. Calculate cost based on children's stats
      cur_total_cost_ += context_->GetOptimizerContext()->GetCostModel()->CalculateCost(
          context_->GetOptimizerContext()->GetTxn(), context_->GetOptimizerContext()->GetCatalogAccessor(),
          &context_->GetOptimizerContext()->GetMemo(), group_expr_);
    }
    for (; cur_child_idx_ < static_cast<int>(group_expr_->GetChildrenGroupsSize()); cur_child_idx_++) {
      auto &i_prop = input_props[cur_child_idx_];
      auto child_group =
          context_->GetOptimizerContext()->GetMemo().GetGroupByID(group_expr_->GetChildGroupId(cur_child_idx_));
      // Check whether the child group is already optimized for the prop
      auto child_best_expr = child_group->GetBestExpression(i_prop);
      if (child_best_expr != nullptr) {  // Directly get back the best expr if the child group is optimized
        cur_total_cost_ += child_best_expr->GetCost(i_prop);
        if (cur_total_cost_ > context_->GetCostUpperBound()) break;
      } else if (prev_child_idx_ != cur_child_idx_) {  // We haven't optimized child group
        prev_child_idx_ = cur_child_idx_;
        PushTask(new OptimizeExpressionCostWithEnforcedProperty(this));
        auto cost_high = context_->GetCostUpperBound() - cur_total_cost_;
        auto ctx = new OptimizationContext(context_->GetOptimizerContext(), i_prop->Copy(), cost_high);
        PushTask(new OptimizeGroup(child_group, ctx));
        context_->GetOptimizerContext()->AddOptimizationContext(ctx);
        return;
      } else {  // If we return from OptimizeGroup, then there is no expr for the context
        break;
      }
    }
    // TODO(wz2): Can we reduce the amount of copying
    // Check whether we successfully optimize all child group
    if (cur_child_idx_ == static_cast<int>(group_expr_->GetChildrenGroupsSize())) {
      // Not need to do pruning here because it has been done when we get the
      // best expr from the child group
      // Add this group expression to group expression hash table
      std::vector<PropertySet *> input_props_copy;
      input_props_copy.reserve(input_props.size());
      for (auto i_prop : input_props) {
        input_props_copy.push_back(i_prop->Copy());
      }
      group_expr_->SetLocalHashTable(output_prop->Copy(), input_props_copy, cur_total_cost_);
      auto cur_group = GetMemo().GetGroupByID(group_expr_->GetGroupID());
      cur_group->SetExpressionCost(group_expr_, cur_total_cost_, output_prop->Copy());
      // Enforce property if the requirement does not meet
      PropertyEnforcer prop_enforcer;
      GroupExpression *memo_enforced_expr = nullptr;
      bool meet_requirement = true;
      // TODO(patrick/boweic): For now, we enforce the missing properties in the order of how we
      // find them. This may miss the opportunity to enforce them or may lead to
      // sub-optimal plan. This is fine now because we only have one physical
      // property (sort). If more properties are added, we should add some heuristics
      // to derive the optimal enforce order or perform a cost-based full enumeration.
      for (auto &prop : context_->GetRequiredProperties()->Properties()) {
        if (!output_prop->HasProperty(*prop)) {
          auto enforced_expr =
              prop_enforcer.EnforceProperty(group_expr_, prop, context_->GetOptimizerContext()->GetTxn());
          // Cannot enforce the missing property
          if (enforced_expr == nullptr) {
            meet_requirement = false;
            break;
          }
          memo_enforced_expr = GetMemo().InsertExpression(enforced_expr, group_expr_->GetGroupID(), true);
          // Extend the output properties after enforcement
          auto pre_output_prop_set = output_prop->Copy();
          // Cost the enforced expression
          auto extended_prop_set = output_prop->Copy();
          extended_prop_set->AddProperty(prop->Copy());
          cur_total_cost_ += context_->GetOptimizerContext()->GetCostModel()->CalculateCost(
              context_->GetOptimizerContext()->GetTxn(), context_->GetOptimizerContext()->GetCatalogAccessor(),
              &context_->GetOptimizerContext()->GetMemo(), memo_enforced_expr);
          // Update hash tables for group and group expression
          memo_enforced_expr->SetLocalHashTable(extended_prop_set, {pre_output_prop_set}, cur_total_cost_);
          cur_group->SetExpressionCost(memo_enforced_expr, cur_total_cost_, output_prop->Copy());
        }
      }
      // Can meet the requirement
      if (meet_requirement && cur_total_cost_ <= context_->GetCostUpperBound()) {
        // If the cost is smaller than the winner, update the context upper bound
        context_->SetCostUpperBound(context_->GetCostUpperBound() - cur_total_cost_);
        if (memo_enforced_expr != nullptr) {  // Enforcement takes place
          cur_group->SetExpressionCost(memo_enforced_expr, cur_total_cost_,
                                       context_->GetRequiredProperties()->Copy());
        } else if (output_prop->Properties().size() != context_->GetRequiredProperties()->Properties().size()) {
          // The original output property is a super set of the requirement
          cur_group->SetExpressionCost(group_expr_, cur_total_cost_, context_->GetRequiredProperties()->Copy());
        }
      }
    }
    // Reset child idx and total cost
    prev_child_idx_ = -1;
    cur_child_idx_ = 0;
    cur_total_cost_ = 0;
  }
}

Specifically, the calls to SetExpressionCost. However, according to this part

  // Check whether we successfully optimize all child group
  if (cur_child_idx_ == static_cast<int>(group_expr_->GetChildrenGroupsSize())) {

we don't generate any cost for the current node until all child groups are fully optimized and costed. So I think this means that the LogicalLimit node must wait for the LogicalInnerJoin node to be fully optimized before coming up with any cost for itself. Therefore we can't time out until the LogicalInnerJoin is fully costed, and as the number of joins goes up this can take an extremely long time.
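To make that ordering concrete, here is a toy, self-contained model of the dependency (the types are made up for illustration, not the optimizer's real classes): the parent group only records a cost after its child group has been fully costed, so while the join enumeration is still running the root LogicalLimit has nothing in its cost table and the timeout check can never fire.

#include <iostream>
#include <optional>
#include <string>
#include <utility>

struct ToyGroup {
  std::string name;
  ToyGroup *child;
  std::optional<double> best_cost;  // only set once costing has finished

  explicit ToyGroup(std::string n, ToyGroup *c = nullptr) : name(std::move(n)), child(c) {}
  bool HasExpressions() const { return best_cost.has_value(); }
};

// Stand-in for the join-order enumeration, which is where almost all of the
// time goes for the eight-table join in this issue.
void OptimizeJoinGroup(ToyGroup *join_group, bool finished) {
  if (finished) join_group->best_cost = 42.0;  // arbitrary cost once enumeration is done
}

// Mirrors the structure above: the parent only records a cost once every
// child group already has a best expression.
void CostParent(ToyGroup *parent) {
  if (parent->child != nullptr && !parent->child->HasExpressions()) return;
  parent->best_cost = (parent->child != nullptr ? *parent->child->best_cost : 0.0) + 1.0;
}

int main() {
  ToyGroup join("LogicalInnerJoin");
  ToyGroup limit("LogicalLimit", &join);

  OptimizeJoinGroup(&join, /*finished=*/false);  // still enumerating join orders
  CostParent(&limit);
  std::cout << limit.name << " costed? " << limit.HasExpressions() << "\n";  // 0: timeout check can't fire

  OptimizeJoinGroup(&join, /*finished=*/true);   // only now does the root ever get a cost
  CostParent(&limit);
  std::cout << limit.name << " costed? " << limit.HasExpressions() << "\n";  // 1
  return 0;
}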

@jkosh44
Contributor Author

jkosh44 commented Dec 24, 2020

One suggestion that @thepinetree had was to move the LIMIT into a property. That way there wouldn't actually be a LIMIT node; the join node would be the root and it could properly time out. Sorts are handled as properties, so the implementation would probably be similar. If we go with this approach we may want to wait until after #1031 #1422, because that PR is also making some changes to LIMITs.
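For what that might look like, here is a purely hypothetical sketch (none of these class or method names exist in the codebase; it only mirrors the general shape of a physical property such as sort): the LIMIT becomes part of the required property set instead of its own operator, so the join group stays the root.

#include <cstddef>
#include <memory>
#include <vector>

enum class PropertyType { SORT, LIMIT };

struct Property {
  virtual ~Property() = default;
  virtual PropertyType Type() const = 0;
  virtual std::unique_ptr<Property> Copy() const = 0;
};

// Hypothetical "at most `limit` rows, skipping `offset`" requirement. The planner
// would attach this to the root property set rather than creating a LogicalLimit
// operator, leaving the join as the root so the existing timeout check can fire.
struct PropertyLimit : Property {
  size_t limit;
  size_t offset;
  PropertyLimit(size_t l, size_t o) : limit(l), offset(o) {}
  PropertyType Type() const override { return PropertyType::LIMIT; }
  std::unique_ptr<Property> Copy() const override {
    return std::make_unique<PropertyLimit>(limit, offset);
  }
};

int main() {
  std::vector<std::unique_ptr<Property>> required_props;
  required_props.push_back(std::make_unique<PropertyLimit>(100, 0));  // LIMIT 100
  return 0;
}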

I don't know if it's possible to reproduce this issue without LIMIT, but I think a related issue is how the timeout works in general. The timeout in the optimizer is really just a timeout on the root node, not the entire op tree. So if some inner node takes a long time to optimize, we have no way of timing out. If this issue exists with other queries, we may want to consider adding functionality to time out inner nodes that take too long, and not just the root node, if that's even possible.

These are things we should probably discuss at the next meeting.

@jkosh44 jkosh44 self-assigned this Dec 29, 2020
@jkosh44 jkosh44 added the bug Something isn't working (correctness). Mark issues with this. label Dec 29, 2020
@jkosh44 jkosh44 removed their assignment Jan 11, 2021
@jkosh44
Contributor Author

jkosh44 commented Jan 12, 2021

I was thinking about this today and came up with one possible solution. Instead of waiting for the child to be completely costed, we could just use the best cost available for the child. So when we check whether the child is done with costing, we also check whether any cost is available for it. If there is some cost, we cost the parent using the child's best cost so far and then push the parent task back onto the task stack so it can be costed again when the child is finished or has more costs available. That way, when we reach the timeout, we are more likely to have some cost for the root node, even if it's not the best cost.
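A rough, self-contained sketch of that idea (toy types only, not the real task classes): if the child group has any cost so far, the parent takes it as a provisional cost and re-pushes itself onto the task stack, so the root ends up with some costed expression well before the child is fully optimized.

#include <functional>
#include <iostream>
#include <optional>
#include <stack>

struct ToyGroup {
  std::optional<double> best_cost_so_far;  // updated as cheaper plans are found
  bool fully_costed = false;
};

int main() {
  ToyGroup join_group;                      // child: the big join
  ToyGroup limit_group;                     // parent: the root
  std::stack<std::function<void()>> tasks;  // stand-in for the optimizer task stack

  std::function<void()> cost_limit = [&]() {
    if (!join_group.best_cost_so_far) return;                     // nothing to work with yet
    limit_group.best_cost_so_far = *join_group.best_cost_so_far;  // provisional root cost
    if (!join_group.fully_costed) tasks.push(cost_limit);         // revisit when the child has more costs
  };

  // Simulate the child producing an early (suboptimal) plan long before it finishes.
  join_group.best_cost_so_far = 1e6;
  tasks.push(cost_limit);

  while (!tasks.empty()) {
    auto task = tasks.top();
    tasks.pop();
    task();
    // With a provisional root cost in place, the timeout check could now fire.
    if (limit_group.best_cost_so_far) {
      std::cout << "Root has provisional cost " << *limit_group.best_cost_so_far << "\n";
      break;  // stop the toy loop; the real optimizer would keep refining or time out
    }
  }
  return 0;
}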

This is more about addressing the underlying problem of having to wait for all inner nodes to be fully costed before timing out. It may still make sense to convert LIMIT to a property as well.
