Refactor co-primes algorithm to be generic & avoid allocation. #36

saeta · 2020-05-23T23:19:27Z

Thank you @dabrahams for the suggestions!

@dabrahams

Thank you @dabrahams for the suggestions!

saeta · 2020-05-23T23:23:23Z

This is a follow-up to #30

saeta · 2020-05-23T23:23:52Z

This relates to umbrella issue: #33

dabrahams

Some work to do on this one…

dabrahams · 2020-05-24T20:48:26Z

Sources/PenguinStructures/NumberOperations.swift

+}
+
+/// A sequence of numbers that are co-prime with `n`, up to `n`.
+public struct PositiveCoprimes<Number: BinaryInteger>: Sequence {


It's a deterministic multi-pass sequence, so it should conform to Collection.

Suggested change

public struct PositiveCoprimes<Number: BinaryInteger>: Sequence {

public struct LesserPositiveCoprimes<Number: BinaryInteger>: Sequence {

I realize this is getting a little unwieldy. That said, I hadn't got it through my thick brain before that there are numbers coprime with and greater than N, so we should consider whether we can make the name more accurate. PositiveValuesCoprimeAndLess…

Surely we can come up with better names than Number and n?

Strawmen:

s/Number/Domain/ ?
s/n/limit/ ?

I'd rather instead of making it conform to collection instead make it a sequence. Concretely, we know there's infinitely many primes, and there's no reason why we should stop at the number itself, so I propose that this actually be an (infinite) sequence instead.

I also dislike the collection protocol here because the size of the collection is unknown until it's been computed, which reduces the value of it being a constant storage abstraction.

I like Domain, but I dislike limit, and think n (or perhaps b) is more appropriate appropriate.

Edit: I'm fine with target (as you suggested), and have made the corresponding edits, but I'm still not fully convinced this is the right choice...

Even if it was infinite, Collection is the right abstraction. Please see “But sequences can be infinite” at https://forums.swift.org/t/pitch-make-collection-super-convenient-and-retire-sequence/12103

The fundamental difference between Collection and Sequence is single vs multipass, and all that implies in terms of algorithms: the ability to subdivide it, to binary search, etc.

I don't know what you're saying about the count and the value of it being a constant storage abstraction. There are many examples of collections whose count is an O(n) computation. I lost the battle to make count be a method because of this long ago.

The one argument I would consider worthwhile against Collection conformance is that the cost of traversal from one element to the next is probably not O(1). But even that is IMO not a good enough reason to avoid collection conformance in Swift; what you do is note in docs that it doesn't provide expected performance, as lazy filter collection does. In general, it's more valuable to have the functionality available even if it's going to be slow.

I'm not at all in love with target either. It's weird that it plays two roles: both the limit and a coprime of the all the elements, which makes it hard to name. n or b would make sense for function parameter names, or if that was a standard name used for a coprime in math, like you use m and b for a linear equation y=mx+b. Otherwise, a stored property has a persistent meaning that is not available from the doc comment near its use site and is thus usually worth naming more explicitly.

From the first post in https://forums.swift.org/t/pitch-make-collection-super-convenient-and-retire-sequence/12103:

A true stream would require significant additional storage or computation to
support multiple passes over its elements. [...]
The initial state of the element generation procedure, which is modified as new elements are generated, is very costly to construct and to store, such as in the Mersenne Twister 4 pseudo-random number generator.
[...]

I would argue that this type falls into roughly the same category. This algorithm seems to me to fall into the same category. Concretely: while it's possible to store previous values (a-la Mersenne Twister) to facilitate indexing, or do an increasing amount of re-computation, these are both unsatisfactory.

You don't need to store previous values. The index just needs to store the same thing as the Iterator does. https://gist.github.com/dabrahams/6119b718aab45d8ef59e141eb37c711e

Hmmm, by switching to Collection, I've gotten rid of all mutable stored properties, and also got rid of the need to create a nested Iterator or Index type (by re-using Domain). While it demands a bit more code, and I'm pretty sure operations like prefix are probably somewhat inefficient, this does seem like an interesting design. It also makes certain functionality pretty easy, such as finding the next positive coprime greater than any arbitrary number.

operations like prefix are probably somewhat inefficient

You could always define a lazy SubSequence type to get around that.

It also makes certain functionality pretty easy, such as finding the next positive coprime greater than any arbitrary number.

Exactly. That's why, “if it can conform, it should conform.”

You could always define a lazy SubSequence type to get around that.

I wondered about that, but the implementation of prefix(_:) doesn't appear to admit an implementation where the subsequence is lazy, as it relies upon directly computing the end index from the call to index(_:offsetBy:limitedBy:) which can't be lazy, and IIUC, there's no .lazy.prefix(n) (although there is a LazyPrefixWhileSequence... haven't played in enough contexts to fully grok what would happen in all situations).

Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift

Sources/PenguinStructures/NumberOperations.swift

dabrahams · 2020-05-24T21:36:54Z

I wondered whether computing GCD for every value < N was the best way and landed on this: https://math.stackexchange.com/questions/621109/co-prime-numbers-less-than-n

I didn't analyze it, but something to think about… someday. ;-)

Sources/PenguinStructures/NumberOperations.swift

Tests/PenguinStructuresTests/NumberOperationsTests.swift

saeta · 2020-06-07T09:06:21Z

Switched from Sequence to Collection. It's an interesting point in design space. As alluded to, I'm somewhat skeptical that we're getting appropriately efficient operations by default. (e.g. Sequence.prefix avoids redundant computations (at the cost of memory consumption), whereas (as written for now), Collection.prefix doesn't. Probably something to refine further over time, such as defining a SubSequence instead of accepting the Collection's default, but not sure if that's enough; haven't thought enough about it yet.)

I'm sure some of the doc comments could be improved... feedback welcome!

saeta · 2020-06-09T19:48:56Z

@dabrahams aside from our discussion on Sequence vs Collection, do you have further changes you'd like to see?

Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift

dabrahams · 2020-06-18T15:13:30Z

Sources/PenguinStructures/NumberOperations.swift

+    var positiveSelf = self
+    if positiveSelf < 0 {
+      positiveSelf *= -1  // Workaround for lack of `Swift.abs` on `BinaryInteger`.
+    }


This is a weird definition of "smaller" that apparently(?) from reading the code implies absolute value.
Why not just have the array be empty when self is <= 0?
Why not return PositiveCoprimes.SubSequence?
How can this possibly even compile? .prefix(…) is supposed to return a SubSequence, i.e. slice.

Why not just have the array be empty when self is <= 0?

Yeah, that's reasonable. Fixed.

How can this possibly even compile? .prefix(…) is supposed to return a SubSequence, i.e. slice.

As it turns out, IIUC it's calling Sequence.prefix(while:)

As for why not return PositiveCoprimes.SubSequence, the concern I have is that this will cause us to traverse the collection at least twice: once to find the endIndex, and then a second time to access the elements. Of course, we could make this be a LazyPrefixWhileSequence<SubSequence>, but this forces closure allocation.

The final option would be to do something quite fancy with the SubSequence type. My preference is to avoid getting that fancy without more experience to understand what operations we'd like to support efficiently (as we only get one SubSequence type).

Sources/PenguinStructures/NumberOperations.swift

Co-authored-by: Dave Abrahams <[email protected]>

Sources/PenguinStructures/NumberOperations.swift

saeta

Thanks very much for the review!

saeta · 2020-06-19T20:25:01Z

Sources/PenguinStructures/NumberOperations.swift

+    var positiveSelf = self
+    if positiveSelf < 0 {
+      positiveSelf *= -1  // Workaround for lack of `Swift.abs` on `BinaryInteger`.
+    }


Why not just have the array be empty when self is <= 0?

Yeah, that's reasonable. Fixed.

How can this possibly even compile? .prefix(…) is supposed to return a SubSequence, i.e. slice.

As it turns out, IIUC it's calling Sequence.prefix(while:)

As for why not return PositiveCoprimes.SubSequence, the concern I have is that this will cause us to traverse the collection at least twice: once to find the endIndex, and then a second time to access the elements. Of course, we could make this be a LazyPrefixWhileSequence<SubSequence>, but this forces closure allocation.

The final option would be to do something quite fancy with the SubSequence type. My preference is to avoid getting that fancy without more experience to understand what operations we'd like to support efficiently (as we only get one SubSequence type).

Sources/PenguinStructures/NumberOperations.swift

saeta · 2020-06-19T20:37:56Z

Whoops, looks like your additional comments raced with me fixing & pushing the fixes! PTAL?

dabrahams · 2020-06-22T16:27:47Z

Of course, we could make this be a LazyPrefixWhileSequence, but this forces closure allocation.

Forces allocation? Only if there's a capture and only if you store the thing somewhere such that the closure can't be inlined.

Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift

Sources/PenguinStructures/NumberOperations.swift

dabrahams · 2020-06-22T16:54:43Z

Sources/PenguinStructures/NumberOperations.swift

+  /// Computes the next index after `index`.
+  public func index(after index: Index) -> Index {
+    var nextCandidate = index
+    while true {


You can do this without a raw loop; use a range and index(where:)

Sources/PenguinStructures/NumberOperations.swift

Co-authored-by: Dave Abrahams <[email protected]>

Sources/PenguinStructures/NumberOperations.swift

saeta · 2020-07-10T17:26:28Z

Sources/PenguinStructures/NumberOperations.swift

+
+  /// `Int.max`, as there are infinitely many prime numbers, and thus infinitely many coprimes
+  /// to a given target.
+  public var count: Int {


TODO: let the default implementation handle this, or do some fancy tricks to actually compute what the numbers of values are, given Domain. (Note: this is tricky, due to not being restricted to BinaryInteger and not FixedWidthInteger.)

saeta · 2020-07-10T17:44:19Z

Tests/PenguinStructuresTests/NumberOperationsTests.swift

+import PenguinStructures
+import XCTest
+
+final class NumberOperationsTests: XCTestCase {


TODO: Add collection tests.

Refactor co-primes algorithm to be generic & avoid allocation.

88513d7

Thank you @dabrahams for the suggestions!

saeta requested a review from dabrahams May 23, 2020 23:19

saeta mentioned this pull request May 23, 2020

Polish fast RNG & make available publicly #37

Closed

dabrahams requested changes May 24, 2020

View reviewed changes

dabrahams reviewed May 24, 2020

View reviewed changes

Sources/PenguinStructures/NumberOperations.swift Outdated Show resolved Hide resolved

dabrahams reviewed May 24, 2020

View reviewed changes

Tests/PenguinStructuresTests/NumberOperationsTests.swift Show resolved Hide resolved

saeta added 2 commits May 24, 2020 23:57

Respond to comments.

9dbb65c

Merge remote-tracking branch 'origin/master' into nicer-numbers

7324d36

saeta requested a review from dabrahams May 27, 2020 21:02

saeta added 2 commits June 7, 2020 01:04

Merge branch 'master' into nicer-numbers

f25bad1

Make PositiveCoprimes a Collection instead of a Sequence.

b64f173

saeta assigned dabrahams Jun 10, 2020

Merge branch 'master' into nicer-numbers

38ffa32

dabrahams reviewed Jun 18, 2020

View reviewed changes

Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift Outdated Show resolved Hide resolved

dabrahams reviewed Jun 18, 2020

View reviewed changes

Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift Outdated Show resolved Hide resolved

dabrahams reviewed Jun 18, 2020

View reviewed changes

Sources/PenguinStructures/NumberOperations.swift Outdated Show resolved Hide resolved

dabrahams reviewed Jun 18, 2020

View reviewed changes

Sources/PenguinStructures/NumberOperations.swift Outdated Show resolved Hide resolved

dabrahams reviewed Jun 18, 2020

View reviewed changes

Sources/PenguinStructures/NumberOperations.swift Outdated Show resolved Hide resolved

Apply suggestions from code review

b97c1c2

Co-authored-by: Dave Abrahams <[email protected]>

dabrahams reviewed Jun 19, 2020

View reviewed changes

Sources/PenguinStructures/NumberOperations.swift Show resolved Hide resolved

Sources/PenguinStructures/NumberOperations.swift Outdated Show resolved Hide resolved

Respond to review comments.

4a6e40c

saeta commented Jun 19, 2020

View reviewed changes

Make endIndex documentation match implementation.

db9eaeb

saeta requested a review from dabrahams June 20, 2020 21:22

dabrahams requested changes Jun 22, 2020

View reviewed changes

Apply suggestions from code review

cc7a4db

Co-authored-by: Dave Abrahams <[email protected]>

saeta commented Jul 10, 2020

View reviewed changes

Sources/PenguinStructures/NumberOperations.swift Show resolved Hide resolved

Update Sources/PenguinStructures/NumberOperations.swift

5bfda61

saeta commented Jul 10, 2020

View reviewed changes

texasmichelle changed the base branch from master to main December 8, 2020 01:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor co-primes algorithm to be generic & avoid allocation. #36

Refactor co-primes algorithm to be generic & avoid allocation. #36

saeta commented May 23, 2020

saeta commented May 23, 2020

saeta commented May 23, 2020

dabrahams left a comment

dabrahams May 24, 2020

saeta May 25, 2020

dabrahams May 25, 2020 •

edited

Loading

saeta May 27, 2020

dabrahams May 29, 2020

saeta Jun 7, 2020

dabrahams Jun 8, 2020

saeta Jun 9, 2020

dabrahams commented May 24, 2020

saeta commented Jun 7, 2020

saeta commented Jun 9, 2020

dabrahams Jun 18, 2020

saeta Jun 19, 2020

saeta left a comment

saeta Jun 19, 2020

saeta commented Jun 19, 2020

dabrahams commented Jun 22, 2020

dabrahams Jun 22, 2020

saeta Jul 10, 2020 •

edited

Loading

saeta Jul 10, 2020

	public struct PositiveCoprimes<Number: BinaryInteger>: Sequence {
	public struct LesserPositiveCoprimes<Number: BinaryInteger>: Sequence {

Refactor co-primes algorithm to be generic & avoid allocation. #36

Are you sure you want to change the base?

Refactor co-primes algorithm to be generic & avoid allocation. #36

Conversation

saeta commented May 23, 2020

saeta commented May 23, 2020

saeta commented May 23, 2020

dabrahams left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dabrahams May 25, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dabrahams commented May 24, 2020

saeta commented Jun 7, 2020

saeta commented Jun 9, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

saeta left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

saeta commented Jun 19, 2020

dabrahams commented Jun 22, 2020

Choose a reason for hiding this comment

saeta Jul 10, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dabrahams May 25, 2020 •

edited

Loading

saeta Jul 10, 2020 •

edited

Loading