Extend query support in leveled #433
Comments
#434 - extends performance testing to understand the impact of potential changes. For regex queries, within this test the time to apply the regular expression dominates:
The regex query test filters the scanned entries and returns roughly 1 in 170 of them. On an M1 Mac:
That's an average of 75ms per query to return about 300 results, having scanned 50K entries.
Comparison if re2 is plugged into leveled:
So re2 at 0.65 microseconds per call is 75% faster than PCRE regex in OTP 26.
That's an average of 48ms per query to return about 300 results, having scanned 50K entries.
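The figures quoted above can be cross-checked with some quick arithmetic. This is only a sketch in Python (not leveled's Erlang), and it assumes exactly one regex call per scanned entry, which the comments do not state explicitly:

```python
SCANNED_ENTRIES = 50_000   # entries scanned per query (from the comments)
SELECTIVITY = 170          # roughly 1 result per 170 scanned entries


def per_entry_us(query_ms, entries=SCANNED_ENTRIES):
    """Average time per scanned entry, in microseconds."""
    return query_ms * 1000 / entries


# About 294 results per query, consistent with "about 300 results".
expected_results = SCANNED_ENTRIES / SELECTIVITY

pcre_per_entry_us = per_entry_us(75)   # 1.5 microseconds per entry
re2_per_entry_us = per_entry_us(48)    # 0.96 microseconds per entry

# ASSUMPTION: one regex call per scanned entry, at 0.65 microseconds each.
# Under that assumption the regex accounts for about 32.5ms of the 48ms.
re2_regex_ms = 0.65 * SCANNED_ENTRIES / 1000
```

If the one-call-per-entry assumption holds, the regex would still be the largest part of the query time after the switch to re2, with the rest spent on the scan itself.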
#435 - switches to use a NIF of the re2 library, taken from https://github.com/dukesoferl/re2
Currently in Riak/leveled, query support is based on range queries, with regular expressions applied to filter on additional attributes that have been concatenated to the term.
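As a rough illustration of that mechanism, the sketch below runs a range scan over sorted terms and then applies a regular expression to each term in range. It is written in Python rather than Erlang, and the index layout, the `|` separator and the function name `range_regex_query` are all hypothetical, not leveled's actual API:

```python
import bisect
import re

# HYPOTHETICAL index: sorted (term, key) pairs. Each term is a surname with
# extra attributes (year of birth, status) concatenated after it.
INDEX = sorted([
    ("SMITH|1975|active", "key1"),
    ("SMITH|1990|closed", "key2"),
    ("SMYTHE|1982|active", "key3"),
    ("TAYLOR|1975|active", "key4"),
])


def range_regex_query(index, start_term, end_term, pattern):
    """Scan terms in [start_term, end_term], keeping those the regex matches."""
    compiled = re.compile(pattern)
    # Find the first entry whose term is >= start_term.
    lo = bisect.bisect_left(index, (start_term,))
    results = []
    for term, key in index[lo:]:
        if term > end_term:
            break  # past the end of the range, stop scanning
        if compiled.search(term):
            results.append((term, key))
    return results


# Range covers every term starting "SM"; the regex keeps only active entries.
matches = range_regex_query(INDEX, "SM", "SM~", r"\|active$")
```

Every entry in the range is passed through the regular expression, which is why the cost of the regex call matters so much for this kind of query.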
The following extensions are proposed:
- `capture_filter`, where conditions can be applied to a captured output, in particular >, >=, <, <= comparisons. These are very tricky to achieve in regular expressions, so instead capture in the regular expression and apply an additional capture filter to the captured output if it exists.
- `return_terms`, `return_count`, `return_count_unique` and `return_group_count`. `return_count` would return the count of matches (without de-duplication on key), whereas `return_count_unique` would return the count of unique key matches. `return_group_count` would return the count by unique items in a given set of captured outputs (e.g. the regular expression could capture YearOfBirth and Status, and return the count by Status and YearOfBirth).

The dependency on regular expressions raises potential performance issues. Regular expression queries should be added to perf_SUITE. There is a proposal to change the Erlang re library to Google RE2 in OTP 28, with some potentially significant performance benefits. It may be worth experimenting with an RE2 NIF in preparation.
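The proposed capture filter and count-style return options could behave along the following lines. This is a minimal Python sketch of the semantics described in the proposal, not the leveled implementation. The term layout, the function signatures and the choice to treat the captured value as an integer are assumptions made for illustration:

```python
import operator
import re
from collections import Counter

# HYPOTHETICAL terms: surname|year_of_birth|status, each paired with a key.
ENTRIES = [
    ("SMITH|1975|active", "key1"),
    ("SMITH|1990|closed", "key2"),
    ("SMITH|1990|active", "key2"),   # same key indexed under a second term
    ("SMYTHE|1982|active", "key3"),
]
PATTERN = re.compile(r"\|(?P<yob>[0-9]{4})\|(?P<status>[a-z]+)$")

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def capture_filter(entries, pattern, name, op, bound):
    """Keep entries whose named capture exists and satisfies `capture op bound`.

    ASSUMPTION: the captured text is compared as an integer.
    """
    kept = []
    for term, key in entries:
        m = pattern.search(term)
        if m is None or m.group(name) is None:
            continue  # no captured output, so the entry is filtered out
        if OPS[op](int(m.group(name)), bound):
            kept.append((term, key, m.groupdict()))
    return kept


def return_count(matches):
    """Count of matches, without de-duplication on key."""
    return len(matches)


def return_count_unique(matches):
    """Count of unique keys among the matches."""
    return len({key for _, key, _ in matches})


def return_group_count(matches, names):
    """Count of matches grouped by the values of the named captures."""
    return Counter(tuple(caps[n] for n in names) for _, _, caps in matches)


matches = capture_filter(ENTRIES, PATTERN, "yob", ">=", 1980)
by_status = return_group_count(matches, ["status"])
```

Doing the comparison outside the regular expression keeps the pattern simple: the expression only has to isolate the year, and the range check becomes an ordinary numeric comparison.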