remove Ascii class #89

sclassen · 2019-08-22T21:15:55Z

Most of the code in the class Ascii was never used.

Remove unused code
Simplify implementation of only remaining method
Move all methods from Ascii -> Strings
Add support for case-insensitive comparison of characters outside of A-Z (for example äöü)

martinfrancois

Your implementation of containsIgnoreCase() is indeed simpler, but it's slower than the current implementation.
To verify this, I benchmarked the current implementation against your implementation and against s1.toLowerCase().contains(s2.toLowerCase()).
The current implementation was the fastest: on average, your implementation was 3x slower (up to 5.3x) and the last option was 3.4x slower (up to 7.4x).

Since this method is called quite a lot during search, I don't think it would be a good tradeoff, as it may slow down the search considerably, especially when there are a lot of settings.

I do, however, agree that we can remove the unused methods, since we don't really need them. We originally left the method in the Ascii class intentionally because we were waiting for google/guava#3023 to be merged. That would have made the refactoring a lot easier, since we would only have needed to remove the class itself and the imports here, without refactoring the rest. But since it is taking a long time to get merged and we don't have the Guava dependency anymore anyway, I don't think that aspect is really important anymore.

My suggestion would be to revert the simplification to the original implementation and keep all of the other changes. This would mean the Ascii class will still be gone, but the implementation of containsIgnoreCase will be present in Strings. What do you think?

sclassen · 2019-08-25T09:40:59Z

Maybe the difference in performance can be explained by a difference in features.
I added the following line to AsciiTest on the master branch:

assertTrue(Ascii.containsIgnoreCase("Übung", "übung"));

This causes the test to fail, as the current implementation only supports case-insensitive lookup for the 26 letters of the English alphabet. Other letters, such as German umlauts or French, Norwegian, and Spanish special characters, are not compared in a case-insensitive manner.

The new implementation uses Java's built-in functionality for comparing strings in a case-insensitive manner and thus also supports characters other than A-Z.

If you weigh performance higher than support for non-English characters, then I will restore the algorithm from the Ascii class.
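To illustrate the difference in behavior described above, here is a minimal sketch. It is not the project's actual code: `containsIgnoreCaseAscii` is a hypothetical stand-in for an A-Z-only comparison, and `containsIgnoreCaseUnicode` assumes the built-in route is `String.regionMatches` with `ignoreCase = true`. The class and method names are made up for this example.

```java
class ContainsIgnoreCaseDemo {

    // Hypothetical ASCII-only folding: only 'A'..'Z' are mapped to lower case.
    static char asciiLower(char c) {
        return (c >= 'A' && c <= 'Z') ? (char) (c + ('a' - 'A')) : c;
    }

    // Stand-in for an A-Z-only containsIgnoreCase (assumption, not the original code).
    static boolean containsIgnoreCaseAscii(String s, String sub) {
        int max = s.length() - sub.length();
        for (int i = 0; i <= max; i++) {
            int j = 0;
            while (j < sub.length()
                    && asciiLower(s.charAt(i + j)) == asciiLower(sub.charAt(j))) {
                j++;
            }
            if (j == sub.length()) {
                return true;
            }
        }
        return false;
    }

    // Variant built on String.regionMatches, which folds case per character
    // using Character.toUpperCase / Character.toLowerCase.
    static boolean containsIgnoreCaseUnicode(String s, String sub) {
        int max = s.length() - sub.length();
        for (int i = 0; i <= max; i++) {
            if (s.regionMatches(true, i, sub, 0, sub.length())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "\u00DCbung" is "Übung", "\u00FCbung" is "übung" (escaped to avoid encoding issues).
        System.out.println(containsIgnoreCaseAscii("\u00DCbung", "\u00FCbung"));
        System.out.println(containsIgnoreCaseUnicode("\u00DCbung", "\u00FCbung"));
    }
}
```

Both variants agree on plain ASCII input; they only diverge once a character outside A-Z differs in case.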

martinfrancois · 2019-08-25T11:17:31Z

I see. This changes the perspective, of course. Let's keep this implementation then. In case there are reports about slow performance, we would probably need to introduce some sort of caching to make the search quicker. I added issue #94 for this.
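One possible shape for such caching, purely as a sketch under assumptions: lower-case each searchable text once and reuse it, so that a search only has to lower-case the query. The class `LowerCaseCache` and its methods are hypothetical and not part of the project. Note that a `toLowerCase`-based comparison can differ from `regionMatches` for a few unusual characters.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: caches the lower-cased form of each searched text.
class LowerCaseCache {

    private final Map<String, String> lowered = new ConcurrentHashMap<>();

    // Lower-cases 'text' at most once; the query is lower-cased on every call.
    boolean containsIgnoreCase(String text, String query) {
        String loweredText =
                lowered.computeIfAbsent(text, t -> t.toLowerCase(Locale.ROOT));
        return loweredText.contains(query.toLowerCase(Locale.ROOT));
    }

    // Number of distinct texts currently cached.
    int size() {
        return lowered.size();
    }
}
```

Whether this pays off would depend on how often the same texts are searched and on the memory cost of keeping the lower-cased copies, so it would need to be benchmarked first.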

sclassen added 2 commits August 22, 2019 23:12

simplified Ascii class

3cafb39

move containsIgnoreCase from Ascii -> Strings

d2e4dac

sclassen mentioned this pull request Aug 22, 2019

fix javadoc errors #90

Merged

martinfrancois suggested changes Aug 24, 2019

sclassen changed the title from "simplified Ascii class" to "remove Ascii class" Aug 25, 2019

martinfrancois approved these changes Aug 25, 2019

martinfrancois merged commit f46dd57 into dlsc-software-consulting-gmbh:master Aug 25, 2019

sclassen deleted the simplifyAsciiClass branch August 25, 2019 18:23

martinfrancois added the cleanup and enhancement labels Sep 7, 2019
