make all_equal() faster #342

fyrchik · 2019-04-26T19:28:52Z

Hello!
This PR addresses issue #282. The variant with dedup does not short-circuit, but the one with all does.
I have also added some benchmarks and a test for an empty iterator.

test all_equal                                ... bench:     999,832 ns/iter (+/- 217,245)
test all_equal_default                        ... bench:   4,814,277 ns/iter (+/- 315,335)
test all_equal_for                            ... bench:   2,096,174 ns/iter (+/- 165,596)

Let me know what you think.

timvermeulen · 2019-04-28T14:39:05Z

Looks good!

axelf4 · 2019-06-23T20:22:57Z

An alternative implementation could be:

self.try_fold(None, |acc, x| acc.filter(|&prev| x != prev).xor(Some(Some(x))))
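For readers who want to try the try_fold idea outside the trait, here is a spelled-out sketch as a free function. It is one reading of the one-liner above, not the commenter's exact code: the name `all_equal_try_fold` is hypothetical, and the accumulator holds the previously seen item while `None` is used as the early-exit signal.

```rust
/// Hypothetical free-function sketch of a `try_fold`-based check.
/// The accumulator is the previous item (if any); returning `None`
/// from the closure stops the fold at the first mismatch.
fn all_equal_try_fold<I>(mut iter: I) -> bool
where
    I: Iterator,
    I::Item: PartialEq,
{
    iter.try_fold(None, |prev: Option<I::Item>, x| match prev {
        // A previous item exists and differs: break out of the fold.
        Some(p) if p != x => None,
        // First item, or equal to the previous one: keep going.
        _ => Some(Some(x)),
    })
    .is_some()
}
```

The fold returns `Some(..)` only if it ran to completion, so `is_some()` is the final answer; an empty iterator yields `Some(None)` and therefore `true`.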

timvermeulen · 2019-06-25T05:21:32Z

@axelf4 I haven't done any benchmarks, have you by any chance? I'd be worried that that would perform worse than this PR's implementation because of the extra branching in the closure body. But it's very hard to tell without benchmarks.

phimuemue

Looks good to me.

phimuemue · 2019-07-20T22:52:40Z

src/lib.rs

@@ -1310,9 +1310,13 @@ pub trait Itertools : Iterator {
    /// assert!(data.into_iter().all_equal());
    /// ```
    fn all_equal(&mut self) -> bool
-        where Self::Item: PartialEq,
+        where Self: Sized,


I think requiring Sized is ok. Can anyone with more expertise confirm this is unproblematic?

There are already lots of methods you can't call on an unsized iterator, so I don't think this is problematic.

I'm a little mystified. The documentation for Iterator::all indicates that the only bound is F: FnMut(Self::Item) -> bool, but the implementation of Iterator::all also requires Self: Sized. Why is this bound elided from the documentation?

I believe the documentation (sometimes?) leaves out Self: Sized bounds because they’re so common 😕

Per Reddit, I think this is a bug. I've filed an issue: rust-lang/rust#62899.
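To make the Self: Sized discussion concrete, here is a minimal sketch using a hypothetical extension trait `AllEqualExt` that stands in for the real trait (the names `AllEqualExt`, `all_equal_sketch` and `check_sized` are assumptions for illustration). It shows that a method bounded by `Self: Sized` stays usable through `&mut dyn Iterator` and `Box<dyn Iterator>`, because those wrapper types are themselves sized iterators.

```rust
/// Hypothetical stand-in for an extension trait over `Iterator`.
trait AllEqualExt: Iterator {
    /// The `Self: Sized` bound is what allows calling `Iterator::all` here.
    fn all_equal_sketch(&mut self) -> bool
    where
        Self: Sized,
        Self::Item: PartialEq,
    {
        match self.next() {
            None => true,
            Some(first) => self.all(|x| x == first),
        }
    }
}

// Blanket impl for every iterator, including unsized ones.
impl<I: Iterator + ?Sized> AllEqualExt for I {}

/// `I` is a type parameter and therefore `Sized`, so the bounded
/// method is callable even when `I` is `&mut dyn Iterator<..>`.
fn check_sized<I: Iterator<Item = i32>>(mut it: I) -> bool {
    it.all_equal_sketch()
}
```

For example, `check_sized(&mut some_iter as &mut dyn Iterator<Item = i32>)` compiles because the generic parameter is instantiated with the reference type rather than with the bare trait object.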

phimuemue · 2019-07-20T22:53:26Z

tests/test_std.rs

@@ -100,6 +100,7 @@ fn dedup() {

 #[test]
 fn all_equal() {
+    assert!("".chars().all_equal());


Good idea to include this corner case. Should we add the "one element" corner case, too?

Yes, that would be a nice addition. I have added it.

Adding a quickcheck test would be a great alternative to either of those (something for another PR)
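As a dependency-free illustration of what such a property could check, the sketch below compares a candidate implementation against a simple reference on every short sequence over a three-letter alphabet. It does not use quickcheck itself; the names `all_equal_sketch`, `reference` and `property_holds` are hypothetical, and with quickcheck the comparison inside the loop would become the property body.

```rust
/// Candidate under test: compare every item with the first one.
fn all_equal_sketch<I>(mut iter: I) -> bool
where
    I: Iterator,
    I::Item: PartialEq,
{
    match iter.next() {
        None => true,
        Some(first) => iter.all(|x| x == first),
    }
}

/// Reference: every adjacent pair is equal (vacuously true for 0 or 1 items).
fn reference(v: &[u8]) -> bool {
    v.windows(2).all(|w| w[0] == w[1])
}

/// Exhaustively checks all sequences over {0, 1, 2} of length 0..=max_len.
fn property_holds(max_len: u32) -> bool {
    for len in 0..=max_len {
        for code in 0..3u32.pow(len) {
            // Decode `code` as a base-3 number with `len` digits.
            let mut c = code;
            let v: Vec<u8> = (0..len)
                .map(|_| {
                    let digit = (c % 3) as u8;
                    c /= 3;
                    digit
                })
                .collect();
            if all_equal_sketch(v.iter()) != reference(&v) {
                return false;
            }
        }
    }
    true
}
```

The empty and one-element corner cases discussed above are covered automatically, since lengths 0 and 1 are part of the enumeration.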

bluss · 2019-08-20T07:55:42Z

This should not need to require Self: Sized, and I'd prefer to rewrite it so that it is not a breaking change (in my opinion).

I'm not sure I understand the short circuit argument - in what sense does this short circuit more than dedup does? They should both visit the same number of iterator elements.

timvermeulen · 2019-08-20T08:00:56Z

@bluss The dedup version does also short-circuit, but only after the second "group" has been iterated entirely, rather than right after the first element of the second group.
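The difference can be observed by counting how many items each approach pulls from the source. The sketch below is illustrative only: `all_equal_dedup_like` is a hand-written emulation of the behaviour described above (it can only report a second group once that group has ended), not the crate's code, and `counted` is a hypothetical helper.

```rust
use std::cell::Cell;

/// `all`-based approach: stops at the first item that differs from the first.
fn all_equal_all<I>(mut iter: I) -> bool
where
    I: Iterator,
    I::Item: PartialEq,
{
    match iter.next() {
        None => true,
        Some(first) => iter.all(|x| x == first),
    }
}

/// Emulation of a dedup-style check: a group counts as "yielded" only
/// when the next group starts or the input ends.
fn all_equal_dedup_like<I>(mut iter: I) -> bool
where
    I: Iterator,
    I::Item: PartialEq,
{
    let mut current = match iter.next() {
        None => return true,
        Some(x) => x,
    };
    let mut groups_yielded = 0;
    for x in iter {
        if x != current {
            groups_yielded += 1;
            if groups_yielded == 2 {
                return false; // a third group started
            }
            current = x;
        }
    }
    // At most one more group is pending; all equal iff none was yielded.
    groups_yielded == 0
}

/// Hypothetical helper: iterate `data` while counting pulled items.
fn counted<'a>(data: &'a [i32], pulled: &'a Cell<usize>) -> impl Iterator<Item = i32> + 'a {
    data.iter().copied().inspect(move |_| pulled.set(pulled.get() + 1))
}
```

On an input such as `[1, 2, 2, 2, 2, 2]`, the first function pulls two items before returning `false`, while the emulation has to read all six to learn where the second group ends.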

bluss · 2019-08-20T08:24:56Z

Oh. We are long overdue for this improvement then.

One major part of the benchmark speedup could be that all is explicitly unrolled in the slice iterator (this is specific to the slice iterator, in its try_fold). That's nice to use, but we should note that it is not a general improvement. In addition, we get the improvement of using internal iteration (all/try_fold etc.), which is a bit more general.
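A small sketch of the two iteration styles being contrasted here. The function names are hypothetical and the bodies are guesses at what a loop-based and an `all`-based variant look like, not the benchmarked code. The first drives the iterator from the outside with `next()`; the second hands the loop to `all`, so an adaptor or iterator with its own try_fold can run the loop in whatever way suits it.

```rust
/// External iteration: the `for` loop calls `next()` once per item.
fn all_equal_for<I>(iter: I) -> bool
where
    I: IntoIterator,
    I::Item: PartialEq,
{
    let mut iter = iter.into_iter();
    let first = match iter.next() {
        Some(x) => x,
        None => return true,
    };
    for x in iter {
        if x != first {
            return false;
        }
    }
    true
}

/// Internal iteration: `all` runs the loop inside the iterator.
fn all_equal_internal<I>(iter: I) -> bool
where
    I: IntoIterator,
    I::Item: PartialEq,
{
    let mut iter = iter.into_iter();
    match iter.next() {
        None => true,
        Some(first) => iter.all(|x| x == first),
    }
}
```

Both functions return the same answers; any speed difference depends on the concrete iterator type, which is the caveat raised above.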

Looking at those upsides, it seems ok to have a breaking change. I'm not sure in what context it would be a detectable breaking change; it should only change which implementation is picked when we call this method on a trait object.

... and it turns out the trait object question is moot, because trait object support has been broken in Itertools 0.8.0, dyn Itertools is not possible at the moment precisely because of some missing Self: Sized bounds.

That's something we could revisit, but let's only add back trait object support if we can find a good reason. Clearly there was no test for it.

With that, I don't think adding Self: Sized is a breaking change.

jswrenn · 2019-08-20T14:51:04Z

If the trait object question is moot, I'm happy to merge this!

bors r+

jswrenn · 2019-08-20T18:25:39Z

bors r+

342: make all_equal() faster r=jswrenn a=fyrchik

Hello!
This PR adresses #282 issue. Variant with `dedup` does not short circuit, but one with `all` does.
I have also added some benchmarks and test for an empty iterator.

```
test all_equal                                ... bench:     999,832 ns/iter (+/- 217,245)
test all_equal_default                        ... bench:   4,814,277 ns/iter (+/- 315,335)
test all_equal_for                            ... bench:   2,096,174 ns/iter (+/- 165,596)
```

Let me know, what do you think.

Co-authored-by: Evgenii <[email protected]>

bors · 2019-08-20T18:31:29Z

Build succeeded

continuous-integration/travis-ci/push

jswrenn self-assigned this Jul 18, 2019

jswrenn added the waiting-on-review label Jul 18, 2019

phimuemue reviewed Jul 20, 2019

fyrchik added 2 commits July 30, 2019 09:40

make all_equal() faster

bb268cd

Add test for one-character string

99002dd

jswrenn added the breaking-change label Aug 2, 2019

bors bot merged commit 99002dd into rust-itertools:master Aug 20, 2019

Philippe-Cholet mentioned this pull request Jan 10, 2024

performance of all_equal #282

Closed
