forked from chapel-lang/chapel
Commit
Avoid string copies by implementing replace with match (chapel-lang#24757)

This PR optimizes the `replace` methods in Regex. It avoids the string copies that came from moving data to and from a `std::string` when calling RE2's Replace/GlobalReplace. Instead, the re2-interface.cc code now has its own implementation of Replace built on RE2's Match function, so it can construct a result buffer by repeated doubling and then hand that buffer over when constructing a Chapel string.

This also removes the need for doReplaceAndCountSlow: with our own implementation of the replace function, a maximum number of matches can be handled right there.

While here, I noticed a bug in the old implementation, which this change fixes. The bug:

```chapel
use Regex;
writeln("bb".replace(new regex(""), "a")); // ababa
writeln("bb".replace(new regex(""), "a", count=5)); // was aaaaabb but now ababa
```

Work on this PR also revealed another bug, described in chapel-lang#24788. Fixing that is left as future work.

How does it affect performance? These measurements were collected on my PC with `start_test -performance test/studies/shootout/submitted/regexredux*.chpl --numtrials 4`.

| Benchmark    | main   | this PR | speedup    |
| ------------ | ------ | ------- | ---------- |
| regexredux2  | 1.89 s | 1.55 s  | 22% faster |
| regexredux3  | 1.64 s | 1.29 s  | 25% faster |

Reviewed by @DanilaFe - thanks!

- [x] full comm=none testing
- [x] valgrind testing for test/regex
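The approach in the commit message (a match loop, a buffer grown by doubling, and an optional cap on the number of matches) can be sketched roughly as follows. This is not the code from re2-interface.cc: it swaps in `std::regex_search` for RE2's Match so that it builds without RE2, treats the replacement as literal text (no capture-group references), and the names `GrowBuf`, `replaceWithMatch`, and `toString` are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <regex>
#include <string>

// Hypothetical growable byte buffer; capacity doubles until the data fits.
struct GrowBuf {
  char* data = nullptr;
  std::size_t len = 0;
  std::size_t cap = 0;

  void append(const char* src, std::size_t n) {
    if (n == 0) return;
    if (len + n > cap) {
      std::size_t newCap = cap ? cap : 16;
      while (newCap < len + n) newCap *= 2;
      char* p = static_cast<char*>(std::realloc(data, newCap));
      if (p == nullptr) std::abort();
      data = p;
      cap = newCap;
    }
    std::memcpy(data + len, src, n);
    len += n;
  }
};

// Replace up to maxMatches matches of re in text with the literal repl.
// A negative maxMatches means "no limit". The caller owns the returned
// buffer (free it with std::free), which models handing the buffer to a
// string constructor without another copy.
GrowBuf replaceWithMatch(const std::string& text, const std::regex& re,
                         const std::string& repl, long maxMatches = -1) {
  GrowBuf out;
  const char* begin = text.data();
  const char* end = begin + text.size();
  const char* pos = begin;     // where the next search starts
  const char* copied = begin;  // input before this point is already emitted
  long count = 0;
  std::cmatch m;

  while (maxMatches < 0 || count < maxMatches) {
    // Tell the matcher that text precedes pos, so '^' only matches at begin.
    auto flags = (pos > begin) ? std::regex_constants::match_prev_avail
                               : std::regex_constants::match_default;
    if (!std::regex_search(pos, end, m, re, flags)) break;

    const char* mStart = m[0].first;
    const char* mEnd = m[0].second;
    out.append(copied, static_cast<std::size_t>(mStart - copied));
    out.append(repl.data(), repl.size());
    ++count;
    copied = mEnd;

    if (mEnd == mStart) {
      // Empty match: step one byte forward so the loop makes progress.
      if (mStart == end) break;
      pos = mStart + 1;
    } else {
      pos = mEnd;
    }
  }
  // Emit whatever input follows the last match.
  out.append(copied, static_cast<std::size_t>(end - copied));
  return out;
}

// Demonstration helper only: copies the buffer into a std::string and
// frees it. The copy here is exactly what the real change avoids.
std::string toString(GrowBuf buf) {
  std::string s = buf.data ? std::string(buf.data, buf.len) : std::string();
  std::free(buf.data);
  return s;
}
```

With this sketch, replacing the empty pattern in `"bb"` with `"a"` yields `ababa` both with and without a limit of 5, matching the fixed behavior shown in the Chapel example. Because the match limit is checked inside the loop, no separate slow path is needed for counted replaces.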
Showing 6 changed files with 250 additions and 112 deletions.