Faster Expression parser / Tokenizer with StringScanner #1844

ggmichaelgo · 2024-11-02T00:27:28Z

Faster Tokenizer & Expression Parser

Inspired by @tenderlove 's Rails World 2024 Closing Keynote and
@ianks 's #1832 PR ❤️
This PR updates the Expression Parser and Tokenizer to use the new StringScanner to improve the performance.

Faster Expression Parser!

Here are the performance improvements by using Expression2

Expression Type	Baseline (iter/s)	Optimized (iter/s)	Improvement (%)
String	1,642,330.4	1,618,752.0	same-ish
Literal	464,185.8	1,439,419.0	210% faster
Variable	170,784.0	1,130,120.6	562% faster
Number	142,954.2	385,356.9	170% faster
Range	57,236.9	570,767.7	897% faster
All	27,806.2	115,904.0	317% faster

Benchmark Improvement

main branch (baseline)

ruby 3.4.0dev (2024-04-23T16:59:11Z v3.4.0-pshopify-pr.. 70b931820e) +YJIT [arm64-darwin23]
Warming up --------------------------------------
              parse:    14.000 i/100ms
             render:    70.000 i/100ms
     parse & render:    11.000 i/100ms
Calculating -------------------------------------
              parse:    142.958 (± 1.4%) i/s    (7.00 ms/i) -      2.870k in  20.081552s
             render:    704.996 (± 1.7%) i/s    (1.42 ms/i) -     14.140k in  20.064078s
     parse & render:    114.978 (± 0.9%) i/s    (8.70 ms/i) -      2.310k in  20.091266s

fast-expression-parse branch

ruby 3.4.0dev (2024-04-23T16:59:11Z v3.4.0-pshopify-pr.. 70b931820e) +YJIT [arm64-darwin23]
Warming up --------------------------------------
           tokenize:   356.000 i/100ms
              parse:    18.000 i/100ms
             render:    70.000 i/100ms
     parse & render:    14.000 i/100ms
Calculating -------------------------------------
           tokenize:      3.522k (± 1.6%) i/s  (283.89 μs/i) -     70.488k in  20.017506s
              parse:    188.257 (± 0.5%) i/s    (5.31 ms/i) -      3.780k in  20.080262s
             render:    708.969 (± 0.7%) i/s    (1.41 ms/i) -     14.210k in  20.044211s
     parse & render:    142.434 (± 2.1%) i/s    (7.02 ms/i) -      2.856k in  20.065080s

This PR improves the parsing speed by ~24%, and parse & render speed by ~19%.

More Benchmark Results!

Here is another benchmarking with a larger and complex theme:

By using the LRU caching, Liquid is creating ~20% less objects, the parsing time is ~18% faster, and render time is ~9% faster.

With production traffic, we are seeing ~14% parsing time improvement:

Baseline (staging-liquid-2) Experiment (staging-liquid-1) Diff

Average 19.9ms 17.4ms ~14% improvement

p99 82ms 71ms ~15% improvement

p95 52ms 45ms ~16% improvement

p90 42ms 36ms ~17% improvement

p75 28ms 24ms ~17% improvement

p50 17ms 15ms ~13% improvement

tenderlove · 2024-11-17T18:20:39Z

lib/liquid/expression.rb

+
+      def parse_number(markup)
+        ss = string_scanner
+        ss.string = markup


It seems like it might be possible to make this method faster by not using string scanner and manually iterating over the bytes in the string. It looks like markup is a string and we're just setting it on the string scanner.

Just as a rough outline, something like this:

idx = 0 byte = markup.getbyte(idx) return false if return false if byte != DASH && byte != DOT && (byte < ZERO || byte > NINE) while (byte = markup.getbyte(idx) && idx += 1) # get the current byte, advance the cursor return false if byte != DOT && (byte < ZERO || byte > NINE) ... end

It's kind of a pain to manage the cursor position yourself, but we'll actually create more JIT friendly code this way. I'd recommend using StringScanner only when we need to scan through a large buffer and use complex matches (IOW, regular expressions). We might be able to be even faster with StringScanner in the future, but for now I think managing a cursor yourself will be faster (in this case).

Great catch!
I agree that StringScanner doesn't fit for this use case 👍

Unfortunately, I didn't get to see any performance improvement as much as I hoped to see... (roughly 2% diff by keeping the cursor in Ruby ☹️)

(10 seconds warmup / 20 seconds run)
Baseline

ruby 3.4.0dev (2024-04-23T16:59:11Z v3.4.0-pshopify-pr.. 70b931820e) +YJIT [arm64-darwin23] Warming up -------------------------------------- Liquid::Expression2#parse: number 39.927k i/100ms Calculating ------------------------------------- Liquid::Expression2#parse: number 405.480k (± 0.4%) i/s (2.47 μs/i) - 8.145M in 20.087950s

Without StringScanner

ruby 3.4.0dev (2024-04-23T16:59:11Z v3.4.0-pshopify-pr.. 70b931820e) +YJIT [arm64-darwin23] Warming up -------------------------------------- Liquid::Expression2#parse: number 39.051k i/100ms Calculating ------------------------------------- Liquid::Expression2#parse: number 395.787k (± 0.3%) i/s (2.53 μs/i) - 7.927M in 20.029537s

Source code

def parse_number(markup) # the first byte must be a digit, a period, or a dash byte = markup.getbyte(0) return false if byte != DASH && byte != DOT && (byte < ZERO || byte > NINE) is_integer = true last_dot_pos = nil num_end_pos = nil idx = 1 while byte = markup.getbyte(idx) return false if byte != DOT && (byte < ZERO || byte > NINE) idx += 1 # we found our number and now we are just scanning the rest of the string next if num_end_pos if byte == DOT if is_integer == false num_end_pos = idx - 1 else is_integer = false last_dot_pos = idx end end end num_end_pos = markup.length if idx == markup.length return markup.to_i if is_integer if num_end_pos # number ends with a number "123.123" markup.byteslice(0, num_end_pos).to_f elsif last_dot_pos markup.byteslice(0, last_dot_pos).to_f else # we should never reach this point false end end

We need to profile to find what's slow, but I think avoiding the mutation of StringScanner is a great improvement. Maybe try profiling with Vernier?

If I'm reading this correctly, it looks like this function requires that the whole string must contain 0-9 or a . character. We seem to exit early if return false if byte != DOT && (byte < ZERO || byte > NINE) happens.

If that's true, then we could do a fast path case that's like return markup.to_i if markup.match?(/\A[0-9]+\z/).

Something like this:

def parse_number(markup) # the first byte must be a digit, a period, or a dash byte = markup.getbyte(0) return false if byte != DASH && byte != DOT && (byte < ZERO || byte > NINE) # This handles all integer cases return markup.to_i if markup.match?(/\A[0-9]+\z/) # rest of method end

It might be possible to do the float parsing without the loop, but I don't really understand what it's doing. It looks like it's trying to handle a number with multiple dots in it? I don't understand the case.

It looks like it's trying to handle a number with multiple dots in it? I don't understand the case.
Yup, since "123.456.789".to_f returns 123.456 in Ruby, we need to replicate that logic in Liquid.

The original Tokenizer actually parses the number with the Regexs:

... case markup when INTEGERS_REGEX Regexp.last_match(1).to_i when RANGES_REGEX RangeLookup.parse(Regexp.last_match(1), Regexp.last_match(2)) when FLOATS_REGEX Regexp.last_match(1).to_f else ...

with this approach of using Regex, we are scanning the String twice for parsing a number. (Once with INTEGERS_REGEX and another one with FLOATS_REGEX)
By scanning the String once, Tokenizer2 is 1.7x faster than Tokenizer1 for parsing numbers.

Yup, since "123.456.789".to_f returns 123.456 in Ruby, we need to replicate that logic in Liquid.

Do we really need to replicate all logic from Ruby? For example "1e10".to_f works in Ruby, but it's not supported here. Also, if it's following the same rules that the lexer uses, I don't think we would encounter a :number token that contains multiple .:

$ ruby -I lib -r liquid/lexer -e'p Liquid::Lexer.new("12.34.56").tokenize' [[:number, "12.34"], [:dot, "."], [:number, "56"], [:end_of_string]]

Are these strings coming from the lexer or somewhere else?

Do we really need to replicate all logic from Ruby?

You are right, we don't always follow the Ruby's number parsing logic... 😅

So... Liquid has 2 parsing modes, lax and strict parsing mode. (lax is our legacy parser)

In strict mode, templates like {{ 123.123.123 }} gets tokenized by Lexer#tokenize first, and it will raise an error

Liquid syntax error: Expected end_of_string but found dot in "{{ 123.123.123 }}"

However, in lax mode, {{ 123.123.123 }} template gets parsed by Expression#parse without using the Lexer.

Since the Expression class' FLOATS_REGEX is /\A(-?\d[\d\.]+)\z/, it allows multiple dots in the float :(

tenderlove · 2024-11-17T18:23:53Z

lib/liquid/tokenizer.rb

+
+    class << self
+      def string_scanner
+        @string_scanner ||= StringScanner.new("")


Is there a problem creating a new StringScanner object each time we need one? It seems like we're just caching the object and then mutating it later. That could lead to thread safety problems down the road. Usually I like to create one StringScanner object per buffer I'm parsing, which should amortize the cost of the allocation as long as it's reused.

Yeah... I am also worried about thread safety problem in the future :(
In local testing, we see around ~5.6% performance improvement, but in production, we were seeing performance regression due to increased GC calls.

Also, allocating a new StringScanner object was more expensive than I have thought 🤔
For ParseContext#parse_expression`, allocating a new StringScanner took 6.7% of the total time:

For thread safety in the future, we could create a pool of StringScanner later 🤔

For those, who uses web servers like Puma, recycling the StringScanner would become problematic... I just added StringScannerPool to this PR 👍
Thank you for catching this!

For ParseContext#parse_expression`, allocating a new StringScanner took 6.7% of the total time

This is a recursive descent parser though right? Meaning parse_expression will recursively call parse_expression? In other words, does parse_expression get called many times per document?

Ideally we would only create one StringScanner object per document we parse.

Yup - parse_expression is a recursive function. The function always finish using its StringScanner before it calls itself, so we are safe to reuse the StringScanner.
The PR has been updated to create one StringScanner object per document which is stored in the ParseContext 👍

ianks · 2024-11-21T19:55:47Z

lib/liquid/expression.rb

+    INTEGER_REGEX = /\A(-?\d+)\z/
+    FLOAT_REGEX = /\A(-?\d+)\.\d+\z/
+
+    CACHE = LruRedux::Cache.new(10_000) # most themes would have less than 2,000 unique expression


Could we get 90% of the benefit by hardcoding a few common expressions? I'm not sure how I feel about a global cache like this given the objects aren't frozen.

Yeah... I could see how it could go wrong... 😓
I have updated the PR to create an Expression cache per document parsing.
The caches do fill up quickly, and the cache hit rate stays relatively high (~60% for Dawn)

note: We will have to update Liquid-C to actually provide Expression cache
Shopify/liquid-c#227

lib/liquid/lexer.rb

ianks · 2024-11-21T20:06:16Z

lib/liquid/context.rb

@@ -176,7 +180,7 @@ def []=(key, value)
    # Example:
    #   products == empty #=> products.empty?
    def [](expression)
-      evaluate(Expression.parse(expression))
+      evaluate(Expression.parse(expression, @string_scanner))


Side note: I think Context#[] should just be deprecated honestly, it's a very deceptive performance footgun.

Instead, we can add a new method: Context#parse_and_evaluate

ianks · 2024-11-22T19:34:53Z

lib/liquid/parse_context.rb

+      # constructing new StringScanner in Lexer, Tokenizer, etc is expensive
+      # This StringScanner will be shared by all of them
+      @string_scanner = StringScanner.new("")
+      @expression_cache = LruRedux::Cache.new(10_000)


Let's make this an option on the parse context

Suggested change

@expression_cache = LruRedux::Cache.new(10_000)

@expression_cache = options[:expression_cache]

Let's also have default expression caching option.

@expression_cache = if options[:expression_cache] != false options[:expression_cache] || LruRedux::Cache.new(10_000) end

With this way, people can re-use their expression caches between templates or disable the expression caching.

lib/liquid/expression.rb

ianks · 2024-11-22T19:44:36Z

lib/liquid/lexer.rb

-              @ss.pos -= 1
-              @output << [:number, @ss.scan(NUMBER_LITERAL)]
+    class << self
+      def tokenize(input, ss)


Had a search on github, didn't see any references to Liquid::Lexer.new so this change should be fine

ianks · 2024-11-22T19:50:06Z

lib/liquid/parser.rb

-    def initialize(input)
-      l       = Lexer.new(input)
-      @tokens = l.tokenize
+    def initialize(input, string_scanner)


Suggested change

def initialize(input, string_scanner)

def initialize(input)

@ss = input.is_a?(String) ? StringScanner.new(input) : input

lib/liquid/parse_context.rb

ianks · 2024-11-22T19:51:44Z

lib/liquid/parse_context.rb

-    def new_tokenizer(markup, start_line_number: nil, for_liquid_tag: false)
-      Tokenizer.new(markup, line_number: start_line_number, for_liquid_tag: for_liquid_tag)
+    def new_parser(input)
+      Parser.new(input, @string_scanner)


Suggested change

Parser.new(input, @string_scanner)

@string_scanner.string = input

Parser.new(@string_scanner)

ianks · 2024-11-22T19:54:02Z

lib/liquid/context.rb

@@ -176,7 +180,7 @@ def []=(key, value)
    # Example:
    #   products == empty #=> products.empty?
    def [](expression)
-      evaluate(Expression.parse(expression))
+      evaluate(Expression.parse(expression, @string_scanner))


We should also cache here

ianks · 2024-11-22T19:55:38Z

lib/liquid/expression.rb

+      'false' => false,
+      'blank' => '',
+      'empty' => '',
+      '-' => VariableLookup.parse("-", nil),


Can you comment why we need this??

ianks · 2024-11-22T19:56:12Z

lib/liquid/expression.rb

+
+        # Cache only exists during parsing
+        if cache
+          return cache[markup] if cache.key?(markup)


Suggested change

return cache[markup] if cache.key?(markup)

cache.fetch(markup) { inner_parse(markup, ss, cache) }

unfortunately, LruRedux::Cache#fetch does not set the value when the key is missing :|
I am going to keep the current version to be more compatible
https://github.com/SamSaffron/lru_redux/blob/037ee594aded59764487d4a1a5b4077fa710ae3b/lib/lru_redux/cache.rb#L40-L48

lib/liquid/expression.rb

ianks

Good work so far, getting there!

ggmichaelgo force-pushed the fast-expression-parse branch 4 times, most recently from 7538bcb to e912bb0 Compare November 5, 2024 19:49

ggmichaelgo marked this pull request as ready for review November 5, 2024 19:51

ggmichaelgo force-pushed the fast-expression-parse branch 3 times, most recently from 10ef505 to 8759798 Compare November 8, 2024 01:41

ggmichaelgo marked this pull request as draft November 13, 2024 15:50

ggmichaelgo force-pushed the fast-expression-parse branch from 8759798 to 5eeb4c4 Compare November 13, 2024 20:09

use StringScanner to improve Expression Parsing and Tokenizer

70de7ff

ggmichaelgo force-pushed the fast-expression-parse branch from 5eeb4c4 to 70de7ff Compare November 13, 2024 20:11

ggmichaelgo added 2 commits November 14, 2024 00:39

decrease the size of Expression2 LRU cache to store 5 themes

c6d502e

fix quirky negative sign expression markup parsing

2a8b75b

ggmichaelgo force-pushed the fast-expression-parse branch from 8836e31 to 2a8b75b Compare November 14, 2024 05:04

fix cycle tag not resetting

e0d54cc

ggmichaelgo force-pushed the fast-expression-parse branch from 49e55a6 to e0d54cc Compare November 14, 2024 19:33

ggmichaelgo changed the title ~~Fast expression parse~~ Faster Expression parser / Tokenizer with StringScanner Nov 14, 2024

ggmichaelgo force-pushed the fast-expression-parse branch from 17f49c3 to 7419a44 Compare November 14, 2024 21:36

don't cache string expressions to be more perfomant

c0e45c1

ggmichaelgo force-pushed the fast-expression-parse branch from 7419a44 to c0e45c1 Compare November 14, 2024 21:38

don't cache literals for Expression parser

fc432e2

ggmichaelgo marked this pull request as ready for review November 15, 2024 03:49

ggmichaelgo requested review from ianks, a team and Youssef-Mak November 15, 2024 03:49

ggmichaelgo added 2 commits November 15, 2024 17:24

don't force strscan 3.1

4441767

code clean up

e6786f6

ggmichaelgo force-pushed the fast-expression-parse branch from 1c8460c to e6786f6 Compare November 15, 2024 21:24

tenderlove reviewed Nov 17, 2024

View reviewed changes

ianks reviewed Nov 21, 2024

View reviewed changes

initialize StringScanner with input for default

a9d6e13

ggmichaelgo force-pushed the fast-expression-parse branch from acf92c0 to 0e4da50 Compare November 22, 2024 02:03

keep the expression parse caching per document parsing

aa7f926

ggmichaelgo force-pushed the fast-expression-parse branch from 0e4da50 to aa7f926 Compare November 22, 2024 02:03