-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove Ascii class #89
remove Ascii class #89
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your implementation of containsIgnoreCase()
is indeed simpler, but it's slower than the current implementation.
To verify this, I benchmarked the current implementation, vs your implementation and vs s1.toLowerCase().contains(s2.toLowerCase())
.
The current implementation was the fastest, with your implementation being 3x slower (up to 5.3x) and the last option being 3.4x slower (up to 7.4x) on average.
Since this method is being called quite a lot during search, I don't think it would be a good tradeoff in my opinion, since it may slow down the search considerably, especially when there are a lot of settings.
I do however agree that we can remove the unused methods, since we don't really need them. We originally left the method in the Ascii
class intentionally because we were waiting for google/guava#3023 to be merged, and it would've made the refactoring a lot easier, since we only needed to remove the class itself from here and the imports, and wouldn't need to refactor the rest. But since it takes a long time for it to get merged and we don't have the Guava dependency anymore anyways I don't think that aspect is really important anymore.
My suggestion would be to revert the simplification to the original implementation and keep all of the other changes. This would mean the Ascii
class will still be gone, but the implementation of containsIgnoreCase
will be present in Strings
. What do you think?
Maybe the difference in performance is explainable by a difference in feature.
This causes the test to fail as the current implementation only supports case insensitive lookup for the 26 letters in the English alphabet. Other letters such as the German umlaut or French, Norwegian, Spanish special characters are not compared in a case insensitive manner. The new implementation uses the Java built in functionality for comparing strings in a case insensitive manner an thus also supports characters other than the A-Z. If you weight the performance higher than the support of non English characters than I wil restore the algorithm from the Ascii class. |
I see.. This changes the perspective of course. Let's keep this implementation then, and in case there are reports about slow performance, we would probably need to introduce some sort of caching to make the search quicker. I added an issue #94. |
Most of the code in the class
Ascii
was never used.