
Faster Expression parser / Tokenizer with StringScanner #1844

Open · wants to merge 25 commits into main from fast-expression-parse
Conversation

@ggmichaelgo (Contributor) commented on Nov 2, 2024

Faster Tokenizer & Expression Parser

Inspired by @tenderlove's Rails World 2024 Closing Keynote and @ianks's PR #1832 ❤️
This PR updates the Expression parser and Tokenizer to use the new StringScanner to improve performance.
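As context for the approach, here is a minimal illustrative sketch (not this PR's actual code) of the pattern the rewrite relies on: one StringScanner is retargeted at each new markup string instead of allocating a fresh scanner per expression.

```ruby
require "strscan"

# One scanner, reused: `string=` points it at new input without a new allocation.
ss = StringScanner.new("")

# Illustrative dispatch only; the real parser handles many more cases.
def scan_expression(ss, markup)
  ss.string = markup
  if (num = ss.scan(/-?\d+(?:\.\d+)?/))
    num.include?(".") ? num.to_f : num.to_i
  elsif (str = ss.scan(/"[^"]*"|'[^']*'/))
    str[1..-2] # strip the surrounding quotes
  else
    markup # variable lookups, ranges, etc. are handled elsewhere
  end
end

scan_expression(ss, "42")    # => 42
scan_expression(ss, "3.14")  # => 3.14
scan_expression(ss, '"hi"')  # => "hi"
```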

Faster Expression Parser!

Here are the performance improvements from using Expression2:

| Expression Type | Baseline (iter/s) | Optimized (iter/s) | Improvement |
| --- | --- | --- | --- |
| String | 1,642,330.4 | 1,618,752.0 | same-ish |
| Literal | 464,185.8 | 1,439,419.0 | 210% faster |
| Variable | 170,784.0 | 1,130,120.6 | 562% faster |
| Number | 142,954.2 | 385,356.9 | 170% faster |
| Range | 57,236.9 | 570,767.7 | 897% faster |
| All | 27,806.2 | 115,904.0 | 317% faster |

Benchmark Improvement

main branch (baseline)

ruby 3.4.0dev (2024-04-23T16:59:11Z v3.4.0-pshopify-pr.. 70b931820e) +YJIT [arm64-darwin23]
Warming up --------------------------------------
              parse:    14.000 i/100ms
             render:    70.000 i/100ms
     parse & render:    11.000 i/100ms
Calculating -------------------------------------
              parse:    142.958 (± 1.4%) i/s    (7.00 ms/i) -      2.870k in  20.081552s
             render:    704.996 (± 1.7%) i/s    (1.42 ms/i) -     14.140k in  20.064078s
     parse & render:    114.978 (± 0.9%) i/s    (8.70 ms/i) -      2.310k in  20.091266s

fast-expression-parse branch

ruby 3.4.0dev (2024-04-23T16:59:11Z v3.4.0-pshopify-pr.. 70b931820e) +YJIT [arm64-darwin23]
Warming up --------------------------------------
           tokenize:   356.000 i/100ms
              parse:    18.000 i/100ms
             render:    70.000 i/100ms
     parse & render:    14.000 i/100ms
Calculating -------------------------------------
           tokenize:      3.522k (± 1.6%) i/s  (283.89 μs/i) -     70.488k in  20.017506s
              parse:    188.257 (± 0.5%) i/s    (5.31 ms/i) -      3.780k in  20.080262s
             render:    708.969 (± 0.7%) i/s    (1.41 ms/i) -     14.210k in  20.044211s
     parse & render:    142.434 (± 2.1%) i/s    (7.02 ms/i) -      2.856k in  20.065080s

This PR improves parse time by ~24%, and parse & render time by ~19%.
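For reproducibility, a benchmark-ips harness along these lines produces the output above (a sketch; the fixture path and render assigns are illustrative):

```ruby
require "benchmark/ips"
require "liquid"

source   = File.read("theme_template.liquid") # illustrative fixture
template = Liquid::Template.parse(source)
assigns  = {}                                 # illustrative render data

Benchmark.ips do |x|
  x.config(warmup: 10, time: 20)
  x.report("parse")          { Liquid::Template.parse(source) }
  x.report("render")         { template.render!(assigns) }
  x.report("parse & render") { Liquid::Template.parse(source).render!(assigns) }
end
```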

More Benchmark Results!

Here is another benchmark with a larger, more complex theme:
[benchmark screenshot]

With LRU caching, Liquid creates ~20% fewer objects, parse time is ~18% faster, and render time is ~9% faster.

With production traffic, we are seeing ~14% parsing time improvement:

| Metric | Baseline (staging-liquid-2) | Experiment (staging-liquid-1) | Diff |
| --- | --- | --- | --- |
| Average | 19.9ms | 17.4ms | ~14% improvement |
| p99 | 82ms | 71ms | ~15% improvement |
| p95 | 52ms | 45ms | ~16% improvement |
| p90 | 42ms | 36ms | ~17% improvement |
| p75 | 28ms | 24ms | ~17% improvement |
| p50 | 17ms | 15ms | ~13% improvement |

@ggmichaelgo force-pushed the fast-expression-parse branch 4 times, most recently from 7538bcb to e912bb0, on November 5, 2024
@ggmichaelgo marked this pull request as ready for review on November 5, 2024
@ggmichaelgo force-pushed the fast-expression-parse branch 3 times, most recently from 10ef505 to 8759798, on November 8, 2024
@ggmichaelgo marked this pull request as draft on November 13, 2024
@ggmichaelgo changed the title from "Fast expression parse" to "Faster Expression parser / Tokenizer with StringScanner" on November 14, 2024
@ggmichaelgo marked this pull request as ready for review on November 15, 2024
@ggmichaelgo requested review from @ianks, a team, and @Youssef-Mak on November 15, 2024

def parse_number(markup)
  ss = string_scanner
  ss.string = markup

Reviewer:

It seems like it might be possible to make this method faster by skipping StringScanner and manually iterating over the bytes of the string. It looks like markup is a String, and we're just setting it on the scanner.

Just as a rough outline, something like this:

idx = 0
byte = markup.getbyte(idx)

return false if byte != DASH && byte != DOT && (byte < ZERO || byte > NINE)

idx += 1
while (byte = markup.getbyte(idx)) # get the current byte
  idx += 1                         # advance the cursor
  return false if byte != DOT && (byte < ZERO || byte > NINE)

  # ...
end

It's kind of a pain to manage the cursor position yourself, but we'll actually create more JIT friendly code this way. I'd recommend using StringScanner only when we need to scan through a large buffer and use complex matches (IOW, regular expressions). We might be able to be even faster with StringScanner in the future, but for now I think managing a cursor yourself will be faster (in this case).

@ggmichaelgo (author):

Great catch!
I agree that StringScanner doesn't fit for this use case 👍

Unfortunately, I didn't see as much of a performance improvement as I had hoped... (roughly a 2% diff from keeping the cursor in Ruby ☹️)

(10 seconds warmup / 20 seconds run)
Baseline

ruby 3.4.0dev (2024-04-23T16:59:11Z v3.4.0-pshopify-pr.. 70b931820e) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Liquid::Expression2#parse: number
                        39.927k i/100ms
Calculating -------------------------------------
Liquid::Expression2#parse: number
                        405.480k (± 0.4%) i/s    (2.47 μs/i) -      8.145M in  20.087950s

Without StringScanner

ruby 3.4.0dev (2024-04-23T16:59:11Z v3.4.0-pshopify-pr.. 70b931820e) +YJIT [arm64-darwin23]
Warming up --------------------------------------
Liquid::Expression2#parse: number
                        39.051k i/100ms
Calculating -------------------------------------
Liquid::Expression2#parse: number
                        395.787k (± 0.3%) i/s    (2.53 μs/i) -      7.927M in  20.029537s
Source code
def parse_number(markup)
  # the first byte must be a digit, a period, or a dash
  byte = markup.getbyte(0)
  return false if byte != DASH && byte != DOT && (byte < ZERO || byte > NINE)

  is_integer = true
  last_dot_pos = nil
  num_end_pos = nil
  idx = 1

  while byte = markup.getbyte(idx)
    return false if byte != DOT && (byte < ZERO || byte > NINE)

    idx += 1

    # we found our number and now we are just scanning the rest of the string
    next if num_end_pos

    if byte == DOT
      if is_integer == false
        num_end_pos = idx - 1
      else
        is_integer = false
        last_dot_pos = idx
      end
    end
  end

  num_end_pos = markup.length if idx == markup.length

  return markup.to_i if is_integer

  if num_end_pos
    # number ends with a number "123.123"
    markup.byteslice(0, num_end_pos).to_f
  elsif last_dot_pos
    markup.byteslice(0, last_dot_pos).to_f
  else
    # we should never reach this point
    false
  end
end

Reviewer:

We need to profile to find what's slow, but I think avoiding the mutation of StringScanner is a great improvement. Maybe try profiling with Vernier?
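For reference, a minimal Vernier invocation looks something like the sketch below (the workload and output path are illustrative, not from this PR):

```ruby
require "vernier"
require "liquid"

source = File.read("theme_template.liquid") # illustrative fixture

# Collects a sampling profile and writes it to a file viewable at vernier.prof.
Vernier.profile(out: "parse_profile.json") do
  1_000.times { Liquid::Template.parse(source) }
end
```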

If I'm reading this correctly, it looks like this function requires that the whole string contain only 0-9 or . characters. We seem to exit early when `byte != DOT && (byte < ZERO || byte > NINE)`.

If that's true, then we could add a fast path like `return markup.to_i if markup.match?(/\A-?[0-9]+\z/)`.

Something like this:

def parse_number(markup)
  # the first byte must be a digit, a period, or a dash
  byte = markup.getbyte(0)
  return false if byte != DASH && byte != DOT && (byte < ZERO || byte > NINE)

  # This handles all integer cases
  return markup.to_i if markup.match?(/\A-?[0-9]+\z/)

  # rest of method
end

It might be possible to do the float parsing without the loop, but I don't really understand what it's doing. It looks like it's trying to handle a number with multiple dots in it? I don't understand the case.
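One loop-free possibility for the float case (an editor's sketch, not the PR's code; it assumes the surrounding checks have already rejected invalid characters, and it does not handle a trailing dot like "12." the way the loop does):

```ruby
# An anchored regex picks off the leading float, mirroring String#to_f's
# "stop at the second dot" behavior for inputs like "123.456.789".
FLOAT_PREFIX = /\A-?\d+\.\d+/

def parse_float_prefix(markup)
  prefix = markup[FLOAT_PREFIX]
  prefix ? prefix.to_f : false
end

parse_float_prefix("123.456.789") # => 123.456
parse_float_prefix("12.34")       # => 12.34
```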

@ggmichaelgo (author):

> It looks like it's trying to handle a number with multiple dots in it? I don't understand the case.

Yup, since `"123.456.789".to_f` returns 123.456 in Ruby, we need to replicate that logic in Liquid.

The original Tokenizer actually parses the number with these regexes:

...
case markup
when INTEGERS_REGEX
  Regexp.last_match(1).to_i
when RANGES_REGEX
  RangeLookup.parse(Regexp.last_match(1), Regexp.last_match(2))
when FLOATS_REGEX
  Regexp.last_match(1).to_f
else
...

With this regex approach, we scan the string twice to parse a number (once with INTEGERS_REGEX and again with FLOATS_REGEX).
By scanning the string once, Tokenizer2 is 1.7x faster than Tokenizer1 at parsing numbers.

Reviewer:

> Yup, since "123.456.789".to_f returns 123.456 in Ruby, we need to replicate that logic in Liquid.

Do we really need to replicate all of Ruby's logic? For example, `"1e10".to_f` works in Ruby, but it's not supported here. Also, if this follows the same rules as the lexer, I don't think we would ever encounter a `:number` token containing multiple dots:

$ ruby -I lib -r liquid/lexer -e'p Liquid::Lexer.new("12.34.56").tokenize'
[[:number, "12.34"], [:dot, "."], [:number, "56"], [:end_of_string]]

Are these strings coming from the lexer or somewhere else?

@ggmichaelgo (author):

> Do we really need to replicate all logic from Ruby?

You are right, we don't always follow Ruby's number-parsing logic... 😅

So... Liquid has two parsing modes: lax and strict (lax is our legacy parser).

In strict mode, a template like {{ 123.123.123 }} gets tokenized by Lexer#tokenize first, which raises an error:

Liquid syntax error: Expected end_of_string but found dot in "{{ 123.123.123 }}"

However, in lax mode, the {{ 123.123.123 }} template is parsed by Expression#parse without using the Lexer.

Since the Expression class' FLOATS_REGEX is /\A(-?\d[\d\.]+)\z/, it allows multiple dots in the float :(
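To make the lax-mode behavior concrete, here is the regex quoted above in action:

```ruby
FLOATS_REGEX = /\A(-?\d[\d\.]+)\z/ # the Expression class' regex

"123.123.123" =~ FLOATS_REGEX # matches: multiple dots slip through
Regexp.last_match(1).to_f     # => 123.123, via String#to_f's truncation
```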


class << self
  def string_scanner
    @string_scanner ||= StringScanner.new("")

Reviewer:

Is there a problem creating a new StringScanner object each time we need one? It seems like we're just caching the object and then mutating it later. That could lead to thread safety problems down the road. Usually I like to create one StringScanner object per buffer I'm parsing, which should amortize the cost of the allocation as long as it's reused.

@ggmichaelgo (author):

Yeah... I am also worried about thread-safety problems in the future :(
In local testing, we saw around a ~5.6% performance improvement, but in production we were seeing a performance regression due to increased GC calls.

Also, allocating a new StringScanner object was more expensive than I thought 🤔
For `ParseContext#parse_expression`, allocating a new StringScanner took 6.7% of the total time:
[profiler screenshot]

For thread safety in the future, we could create a pool of StringScanners later 🤔

@ggmichaelgo (author):

For those who use web servers like Puma, recycling the StringScanner could become problematic... I just added a StringScannerPool to this PR 👍
Thank you for catching this!
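In outline, the pooling idea looks like this (a hypothetical sketch; the PR's actual StringScannerPool API may differ):

```ruby
require "strscan"

# Thread-safe checkout/checkin of scanners; Queue provides the locking.
module StringScannerPoolSketch
  POOL = Queue.new
  5.times { POOL << StringScanner.new("") }

  def self.with_scanner(string)
    ss = POOL.pop # blocks if every scanner is checked out
    ss.string = string
    yield ss
  ensure
    if ss
      ss.string = "" # drop the reference to the parsed string
      POOL << ss
    end
  end
end

StringScannerPoolSketch.with_scanner("(1..5)") do |ss|
  ss.scan(/\(/) # => "("
end
```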

Reviewer:

> For `ParseContext#parse_expression`, allocating a new StringScanner took 6.7% of the total time

This is a recursive descent parser though, right? Meaning parse_expression will recursively call parse_expression? In other words, does parse_expression get called many times per document?

Ideally we would only create one StringScanner object per document we parse.

@ggmichaelgo (author):

Yup - parse_expression is a recursive function. It always finishes using its StringScanner before calling itself, so it's safe to reuse the scanner.
The PR has been updated to create one StringScanner object per document, stored in the ParseContext 👍
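In outline, the per-document flow looks something like this (a simplified sketch; the real ParseContext carries far more state, and Expression.parse is the PR's two-argument form shown elsewhere in this diff):

```ruby
require "strscan"

class ParseContextSketch
  attr_reader :string_scanner

  def initialize
    # Allocated once per document, shared by every parse_expression call.
    @string_scanner = StringScanner.new("")
  end

  def parse_expression(markup)
    # Each call finishes with the scanner before recursing, so reuse is safe.
    Expression.parse(markup, @string_scanner)
  end
end
```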

INTEGER_REGEX = /\A(-?\d+)\z/
FLOAT_REGEX = /\A(-?\d+)\.\d+\z/

CACHE = LruRedux::Cache.new(10_000) # most themes have fewer than 2,000 unique expressions
Reviewer:

Could we get 90% of the benefit by hardcoding a few common expressions? I'm not sure how I feel about a global cache like this given the objects aren't frozen.

@ggmichaelgo (author):

Yeah... I can see how that could go wrong... 😓
I have updated the PR to create an Expression cache per document parse.
The caches do fill up quickly, and the cache hit rate stays relatively high (~60% for Dawn).

@ggmichaelgo (author):

Note: we will have to update Liquid-C to actually provide the Expression cache:
Shopify/liquid-c#227

@@ -176,7 +180,7 @@ def []=(key, value)
   # Example:
   #   products == empty #=> products.empty?
   def [](expression)
-    evaluate(Expression.parse(expression))
+    evaluate(Expression.parse(expression, @string_scanner))
Reviewer:

Side note: I think Context#[] should just be deprecated honestly, it's a very deceptive performance footgun.

Instead, we can add a new method: Context#parse_and_evaluate
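Something like the following (a sketch of the suggested method; parse_and_evaluate is hypothetical and not part of this PR):

```ruby
# Parses the markup once and evaluates it, making the parse cost explicit
# at the call site instead of hiding it behind the [] operator.
def parse_and_evaluate(markup)
  evaluate(Expression.parse(markup, @string_scanner))
end
```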

# constructing a new StringScanner in the Lexer, Tokenizer, etc. is expensive,
# so this one StringScanner will be shared by all of them
@string_scanner = StringScanner.new("")
@expression_cache = LruRedux::Cache.new(10_000)
Reviewer:

Let's make this an option on the parse context.

Suggested change:
-@expression_cache = LruRedux::Cache.new(10_000)
+@expression_cache = options[:expression_cache]

@ggmichaelgo (author):

Let's also have a default expression caching option:

@expression_cache = if options[:expression_cache] != false
  options[:expression_cache] || LruRedux::Cache.new(10_000)
end

This way, people can reuse their expression caches between templates or disable expression caching entirely.
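Usage would then look like this (a sketch assuming Template.parse forwards its options hash to ParseContext, as it does for other parse options):

```ruby
require "liquid"
require "lru_redux"

shared_cache = LruRedux::Cache.new(10_000)

# Reuse one expression cache across many templates...
Liquid::Template.parse("{{ product.title }}", expression_cache: shared_cache)
Liquid::Template.parse("{{ product.title | upcase }}", expression_cache: shared_cache)

# ...or opt out of expression caching entirely.
Liquid::Template.parse("{{ product.title }}", expression_cache: false)
```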

@ss.pos -= 1
@output << [:number, @ss.scan(NUMBER_LITERAL)]

class << self
  def tokenize(input, ss)
Reviewer:

I searched GitHub and didn't see any references to Liquid::Lexer.new, so this change should be fine.

-def initialize(input)
-  l = Lexer.new(input)
-  @tokens = l.tokenize
+def initialize(input, string_scanner)
Reviewer:

Suggested change:
-def initialize(input, string_scanner)
+def initialize(input)
+  @ss = input.is_a?(String) ? StringScanner.new(input) : input

def new_tokenizer(markup, start_line_number: nil, for_liquid_tag: false)
  Tokenizer.new(markup, line_number: start_line_number, for_liquid_tag: for_liquid_tag)

def new_parser(input)
  Parser.new(input, @string_scanner)
Reviewer:

Suggested change:
-  Parser.new(input, @string_scanner)
+  @string_scanner.string = input
+  Parser.new(@string_scanner)

@@ -176,7 +180,7 @@ def []=(key, value)
   # Example:
   #   products == empty #=> products.empty?
   def [](expression)
-    evaluate(Expression.parse(expression))
+    evaluate(Expression.parse(expression, @string_scanner))
Reviewer:

We should also cache here

'false' => false,
'blank' => '',
'empty' => '',
'-' => VariableLookup.parse("-", nil),
Reviewer:

Can you add a comment explaining why we need this?


# Cache only exists during parsing
if cache
  return cache[markup] if cache.key?(markup)
Reviewer:

Suggested change:
-return cache[markup] if cache.key?(markup)
+cache.fetch(markup) { inner_parse(markup, ss, cache) }

@ggmichaelgo (author):

Unfortunately, LruRedux::Cache#fetch does not set the value when the key is missing :|
I am going to keep the current version for compatibility:
https://github.com/SamSaffron/lru_redux/blob/037ee594aded59764487d4a1a5b4077fa710ae3b/lib/lru_redux/cache.rb#L40-L48
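To make the semantics concrete (per the linked source, fetch returns the block's value on a miss without storing it):

```ruby
require "lru_redux"

cache = LruRedux::Cache.new(10_000)

cache.fetch(:a) { 1 } # => 1, but the value is NOT stored
cache.key?(:a)        # => false

# hence the explicit store this PR keeps:
cache[:a] = 1 unless cache.key?(:a)
```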

@ianks (Contributor) left a review comment:

Good work so far, getting there!
