bbfa2cf provides an object for prefix code generation, but it is "bare" and the heuristics it uses are implemented using inheritance and call-next-method instead of a compiler object making decisions.
one-more-re-nightmare splits a regular expression into a prefix string and the rest of the RE. The prefix is scanned for more efficiently using Boyer-Moore-Horspool, before entering the DFA body generated for the rest of the RE. (This technique is also used in cl-ppcre and GNU grep to my knowledge.)
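For readers unfamiliar with Boyer-Moore-Horspool, here is a minimal sketch of the prefix scan in Python. It is an illustration only, not the code o-m-r-n generates: the names `build_shift_table` and `bmh_find` are hypothetical, and the haystack is assumed to be a byte string.

```python
def build_shift_table(prefix: bytes) -> dict:
    """Horspool bad-character table: for each byte in the prefix (except the
    final position), the distance from its last occurrence to the prefix end."""
    last = len(prefix) - 1
    return {byte: last - i for i, byte in enumerate(prefix[:-1])}


def bmh_find(haystack: bytes, prefix: bytes, start: int = 0) -> int:
    """Return the index of the first occurrence of prefix at or after start,
    or -1 when there are no more matches."""
    m = len(prefix)
    if m == 0:
        return start if start <= len(haystack) else -1
    shift = build_shift_table(prefix)
    pos = start
    while pos + m <= len(haystack):
        if haystack[pos:pos + m] == prefix:
            return pos
        # Shift by the table entry for the byte aligned with the prefix's
        # last position; bytes absent from the prefix allow a full skip of m.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1
```

In a matcher structured as described above, the DFA for the rest of the RE would presumably be entered just after the position this scan returns, and the scan resumed from a later position if the DFA rejects.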
However, it is possible that BMH is not ideal for processing prefixes. Modern processors have vector arithmetic units which can perform many element comparisons at once, and it is not too hard to design a substring search which uses vectorised code. When combined with heuristics based on properties known about the vector to search, vectorised searching can be faster than clever serial searching algorithms.
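One common shape for such a vectorised search is a first/last-element filter: compare the first and last elements of the prefix against a whole block of candidate positions at once, then verify only the positions that pass. The sketch below is a scalar model of that lane-wise logic in Python, so it shows the control flow but is not itself vectorised. `LANES` and `lane_filter_find` are hypothetical names, and nothing here is taken from o-m-r-n.

```python
LANES = 16  # assumed vector width in elements; real hardware varies


def lane_filter_find(haystack: bytes, prefix: bytes, start: int = 0) -> int:
    """Return the index of the first occurrence of prefix at or after start,
    or -1. Models a SIMD first/last-element filter with plain Python loops."""
    m = len(prefix)
    if m == 0:
        return start if start <= len(haystack) else -1
    first, last = prefix[0], prefix[-1]
    limit = len(haystack) - m  # last position where the prefix could start
    block = start
    while block <= limit:
        end = min(block + LANES, limit + 1)
        # One "vector" step: a real implementation would do these comparisons
        # with two vector compares and an AND, yielding a bit mask.
        mask = [haystack[i] == first and haystack[i + m - 1] == last
                for i in range(block, end)]
        # Verify only the lanes that passed the filter, in order.
        for lane, hit in enumerate(mask):
            candidate = block + lane
            if hit and haystack[candidate:candidate + m] == prefix:
                return candidate
        block = end
    return -1
```

Whether this beats a skip-table search would depend on the prefix length and the contents of the vector being searched, which is where heuristics would come in.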
It should be possible to allow the client to generate their own prefix scanning code. The interface to the BMH code generator provides most of the required information to generate code, with the lambda list `(start length vector prefix aref-generator fail)`. The first three arguments are symbols naming internal variables o-m-r-n uses, then the prefix sequence, then the `:aref-generator` argument to `compile-regular-expression`, then some code to evaluate when there are no more matches.