Skip to content

How to use MetricLogUtils

Toshiya Kobayashi edited this page Dec 24, 2020 · 11 revisions

This feature is available since Drools 7.41.0.Final by introducing a new module drools-metric

If you enable MetricLogUtils, it will log constraint evaluation count and elapsed time on Phreak Nodes. It will help you find a bottle-neck in your rules.

Example use case

https://github.com/tkobayas/MetricLogUtils/tree/master/use-cases/use-case-join

You may quickly notice that this rule is badly written. But imagine that you have thousands of rules and they are more complex. It would be difficult to find a suspicious rule just by reviewing.

rule "Collect expensive orders combination"
  when
    $c : Customer()
    $o1 : Order(customer == $c)
    $o2 : Order(customer == $c, id > $o1.id, price > $o1.price)
    $maxPrice : Integer() from accumulate (Order(customer == $c, $price : price), max($price))
    eval($o1.getPrice() > ($maxPrice - 50))
  then
    ...
end

Enable metric logging

Firstly, add drools-metric dependency to your project.

    <dependency>
      <groupId>org.drools</groupId>
      <artifactId>drools-metric</artifactId>
    </dependency>

NOTE: You can use drools-metric with kie-server by dropping drools-metric-xxx.jar (you can find it in public maven repo) into kie-server.war/WEB-INF/lib. In this case, please test with single thread (multi-threads will confuse your analysis). I recommend to test with standalone java like this example though.

Then you can enable MetricLogUtils by system property (-Ddrools.metric.logger.enabled=true). You can optionally configure logging threshold with microseconds (-Ddrools.metric.logger.threshold=100). Default is 500.

You should also enable trace level logging for org.drools.metric.util.MetricLogUtils.

  <logger name="org.drools.metric.util.MetricLogUtils" level="trace"/>

Note that the metric logging is not for production environment. You should use this in your test environment when you want to analyze/detect a bottle-neck.

ReteDumper

Firstly, this example dumps Rete nodes for later reference.

        ReteDumper reteDumper = new ReteDumper();
        reteDumper.dump(kbase);
[EntryPointNode(1) EntryPoint::DEFAULT ] on Partition(MAIN)
  [ObjectTypeNode(3)::EntryPoint::DEFAULT objectType=[ClassObjectType class=com.sample.Customer] expiration=-1ms ] on Partition(MAIN)
    [LeftInputAdapterNode(4)] on Partition(1) Ld 0 Li 0
      [JoinNode(6) - [ClassObjectType class=com.sample.Order]] <contraints=[customer == $c]>  on Partition(1) Ld 0 Li 0 Rd 22 Ri 22
        [JoinNode(7) - [ClassObjectType class=com.sample.Order]] <contraints=[customer == $c, id > $o1.id, price > $o1.price]>  on Partition(1) Ld -1 Li -1 Rd 22 Ri 22
          [ AccumulateNode(8) ] <contraints=[customer == $c], resultConstraints=[], resultBinder=[]>  on Partition(1) Ld -1 Li -1 Rd 18 Ri 18
            [EvalConditionNode(9)]: cond=com.sample.Rule_Collect_expensive_orders_combination930932360Eval1Invoker@ee2a6922] on Partition(1) Ld -1 Li -1
              [RuleTerminalNode(10): rule=Collect expensive orders combination] on Partition(1) d -1 i -1
  [ObjectTypeNode(5)::EntryPoint::DEFAULT objectType=[ClassObjectType class=com.sample.Order] expiration=-1ms ] on Partition(MAIN)
    [JoinNode(6) - [ClassObjectType class=com.sample.Order]] <contraints=[customer == $c]>  on Partition(1) Ld 0 Li 0 Rd 22 Ri 22
    [JoinNode(7) - [ClassObjectType class=com.sample.Order]] <contraints=[customer == $c, id > $o1.id, price > $o1.price]>  on Partition(1) Ld -1 Li -1 Rd 22 Ri 22
    [ AccumulateNode(8) ] <contraints=[customer == $c], resultConstraints=[], resultBinder=[]>  on Partition(1) Ld -1 Li -1 Rd 18 Ri 18
  [ObjectTypeNode(2)::EntryPoint::DEFAULT objectType=[ClassObjectType class=org.drools.core.reteoo.InitialFactImpl] expiration=-1ms ] on Partition(MAIN)

Also dumpAssociatedRules() dumps a list of nodes with associated rules. It will help you to find related rules from a problematic constraint/node.

        reteDumper.dumpAssociatedRules(kbase);
[LeftInputAdapterNode(4)] : [Collect expensive orders combination]
[ObjectTypeNode(3)::EntryPoint::DEFAULT objectType=[ClassObjectType class=com.sample.Customer] expiration=-1ms ] : [Collect expensive orders combination]
[JoinNode(7) - [ClassObjectType class=com.sample.Order]] : [Collect expensive orders combination]
[EntryPointNode(1) EntryPoint::DEFAULT ] : []
[ObjectTypeNode(2)::EntryPoint::DEFAULT objectType=[ClassObjectType class=org.drools.core.reteoo.InitialFactImpl] expiration=-1ms ] : []
[RuleTerminalNode(10): rule=Collect expensive orders combination] : [Collect expensive orders combination]
[EvalConditionNode(9)]: cond=com.sample.Rule_Collect_expensive_orders_combination930932360Eval1Invoker@ee2a6922] : [Collect expensive orders combination]
[ObjectTypeNode(5)::EntryPoint::DEFAULT objectType=[ClassObjectType class=com.sample.Order] expiration=-1ms ] : [Collect expensive orders combination]
[ AccumulateNode(8) ] : [Collect expensive orders combination]
[JoinNode(6) - [ClassObjectType class=com.sample.Order]] : [Collect expensive orders combination]

Analysis: Locate a suspicious rule

JoinTest inserts 10 Customer objects and 100 Order objects per Customer (So 1000 Order in total).

When you run JoinTest, you will see logs like this:

2020-08-03 12:27:47,592 [main] TRACE [JoinNode(6) - [ClassObjectType class=com.sample.Order]], evalCount:1000, elapsedMicro:7654
2020-08-03 12:27:47,799 [main] TRACE [JoinNode(7) - [ClassObjectType class=com.sample.Order]], evalCount:100000, elapsedMicro:205274
2020-08-03 12:27:49,077 [main] TRACE [ AccumulateNode(8) ], evalCount:4950000, elapsedMicro:1277578
2020-08-03 12:27:49,114 [main] TRACE [EvalConditionNode(9)]: cond=com.sample.Rule_Collect_expensive_orders_combination930932360Eval1Invoker@ee2a6922], evalCount:49500, elapsedMicro:36368
  -> elapsed time (ms) : 1596
result.size() = 100

You find that [ AccumulateNode(8) ], evalCount:4950000, elapsedMicro:1277578 is outstanding.

Let's go back to dumpAssociatedRules() output. [ AccumulateNode(8) ] is associated to a rule [Collect expensive orders combination]. So we should review the rule. You may quickly find what's wrong in the rule but the log will help further.

Analysis: Get more hints

If evalCount is unusually large, we should look at the previous node's log. It's [JoinNode(7) - [ClassObjectType class=com.sample.Order]], evalCount:100000, elapsedMicro:205274. Accumulate is evaluated for every $o1/$o2 combination which the JoinNode produced. Okay, that's wrong. This accumulate is just to calculate max price of orders of a customer. So you should evaluate it just once per customer. Let's move it right after $c : Customer().

  when
    $c : Customer()
    $maxPrice : Integer() from accumulate (Order(customer == $c, $price : price), max($price))
    $o1 : Order(customer == $c)
    $o2 : Order(customer == $c, id > $o1.id, price > $o1.price)
    eval($o1.getPrice() > ($maxPrice - 50))

Then, run again.

2020-08-03 12:27:17,551 [main] TRACE [ AccumulateNode(6) ], evalCount:1000, elapsedMicro:16533
2020-08-03 12:27:17,557 [main] TRACE [JoinNode(7) - [ClassObjectType class=com.sample.Order]], evalCount:1000, elapsedMicro:3954
2020-08-03 12:27:17,742 [main] TRACE [JoinNode(8) - [ClassObjectType class=com.sample.Order]], evalCount:100000, elapsedMicro:184526
2020-08-03 12:27:17,764 [main] TRACE [EvalConditionNode(9)]: cond=com.sample.Rule_Collect_expensive_orders_combination930932360Eval1Invoker@ee2a6922], evalCount:49500, elapsedMicro:21321
  -> elapsed time (ms) : 285
result.size() = 100

Okay, it's much faster (AccumulateNode's evalCount:1000 is right. It has to evaluate 100 orders per customer anyway). But don't you want further improvement? [JoinNode(8)] evalCount:100000 and [EvalConditionNode(9)] evalCount:49500 still look significant. This "100000" suggests that cross-product is happening (remember that 100 Order objects are inserted per Customer). Remind the golden rule "List the most restrictive rule conditions first". This constraint ($o1.getPrice() > ($maxPrice - 50)) is very restrictive so we should evaluate it earlier. Also note that field constraint is preferred over eval() because field constraint will have more chance to get optimized by engine.

So finally, the rule is:

  when
    $c : Customer()
    $maxPrice : Integer() from accumulate (Order(customer == $c, $price : price), max($price))
    $o1 : Order(customer == $c, price > ($maxPrice - 50))
    $o2 : Order(customer == $c, id > $o1.id, price > $o1.price)

Then, run again.

2020-08-03 12:25:47,304 [main] TRACE [ AccumulateNode(6) ], evalCount:1000, elapsedMicro:17837
2020-08-03 12:25:47,395 [main] TRACE [JoinNode(7) - [ClassObjectType class=com.sample.Order]], evalCount:1000, elapsedMicro:89392
2020-08-03 12:25:47,429 [main] TRACE [JoinNode(8) - [ClassObjectType class=com.sample.Order]], evalCount:5000, elapsedMicro:33830
  -> elapsed time (ms) : 195
result.size() = 100

Now we get additional 32% performance gain :)

NOTE: This example test doesn't do "warm-up" so the measured time (= first-run) is quite slower than long-run average time (Drools engine does lots of optimization during long-run). This procedure is still valid to improve ineffective rules but you can also analyze metric log of long-run test.