-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor co-primes algorithm to be generic & avoid allocation. #36
base: main
Are you sure you want to change the base?
Conversation
Thank you @dabrahams for the suggestions!
This is a follow-up to #30 |
This relates to umbrella issue: #33 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some work to do on this one…
} | ||
|
||
/// A sequence of numbers that are co-prime with `n`, up to `n`. | ||
public struct PositiveCoprimes<Number: BinaryInteger>: Sequence { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a deterministic multi-pass sequence, so it should conform to Collection
.
public struct PositiveCoprimes<Number: BinaryInteger>: Sequence { | |
public struct LesserPositiveCoprimes<Number: BinaryInteger>: Sequence { |
I realize this is getting a little unwieldy. That said, I hadn't got it through my thick brain before that there are numbers coprime with and greater than N, so we should consider whether we can make the name more accurate. PositiveValuesCoprimeAndLess
…
Surely we can come up with better names than Number
and n
?
Strawmen:
s/Number/Domain/ ?
s/n/limit/ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather instead of making it conform to collection instead make it a sequence. Concretely, we know there's infinitely many primes, and there's no reason why we should stop at the number itself, so I propose that this actually be an (infinite) sequence instead.
I also dislike the collection protocol here because the size of the collection is unknown until it's been computed, which reduces the value of it being a constant storage abstraction.
I like Domain
, but I dislike limit
, and think n
(or perhaps b
) is more appropriate appropriate.
Edit: I'm fine with target
(as you suggested), and have made the corresponding edits, but I'm still not fully convinced this is the right choice...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if it was infinite, Collection is the right abstraction. Please see “But sequences can be infinite” at https://forums.swift.org/t/pitch-make-collection-super-convenient-and-retire-sequence/12103
The fundamental difference between Collection and Sequence is single vs multipass, and all that implies in terms of algorithms: the ability to subdivide it, to binary search, etc.
I don't know what you're saying about the count
and the value of it being a constant storage abstraction. There are many examples of collections whose count
is an O(n) computation. I lost the battle to make count be a method because of this long ago.
The one argument I would consider worthwhile against Collection
conformance is that the cost of traversal from one element to the next is probably not O(1). But even that is IMO not a good enough reason to avoid collection conformance in Swift; what you do is note in docs that it doesn't provide expected performance, as lazy filter collection does. In general, it's more valuable to have the functionality available even if it's going to be slow.
I'm not at all in love with target
either. It's weird that it plays two roles: both the limit and a coprime of the all the elements, which makes it hard to name. n
or b
would make sense for function parameter names, or if that was a standard name used for a coprime in math, like you use m and b for a linear equation y=mx+b. Otherwise, a stored property has a persistent meaning that is not available from the doc comment near its use site and is thus usually worth naming more explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the first post in https://forums.swift.org/t/pitch-make-collection-super-convenient-and-retire-sequence/12103:
A true stream would require significant additional storage or computation to
support multiple passes over its elements. [...]
The initial state of the element generation procedure, which is modified as new elements are generated, is very costly to construct and to store, such as in the Mersenne Twister 4 pseudo-random number generator.
[...]
I would argue that this type falls into roughly the same category. This algorithm seems to me to fall into the same category. Concretely: while it's possible to store previous values (a-la Mersenne Twister) to facilitate indexing, or do an increasing amount of re-computation, these are both unsatisfactory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to store previous values. The index just needs to store the same thing as the Iterator does. https://gist.github.com/dabrahams/6119b718aab45d8ef59e141eb37c711e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, by switching to Collection
, I've gotten rid of all mutable stored properties, and also got rid of the need to create a nested Iterator
or Index
type (by re-using Domain
). While it demands a bit more code, and I'm pretty sure operations like prefix
are probably somewhat inefficient, this does seem like an interesting design. It also makes certain functionality pretty easy, such as finding the next positive coprime greater than any arbitrary number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
operations like prefix are probably somewhat inefficient
You could always define a lazy SubSequence
type to get around that.
It also makes certain functionality pretty easy, such as finding the next positive coprime greater than any arbitrary number.
Exactly. That's why, “if it can conform, it should conform.”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could always define a lazy
SubSequence
type to get around that.
I wondered about that, but the implementation of prefix(_:)
doesn't appear to admit an implementation where the subsequence is lazy, as it relies upon directly computing the end
index from the call to index(_:offsetBy:limitedBy:)
which can't be lazy, and IIUC, there's no .lazy.prefix(n)
(although there is a LazyPrefixWhileSequence
... haven't played in enough contexts to fully grok what would happen in all situations).
Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift
Outdated
Show resolved
Hide resolved
I wondered whether computing GCD for every value < N was the best way and landed on this: https://math.stackexchange.com/questions/621109/co-prime-numbers-less-than-n I didn't analyze it, but something to think about… someday. ;-) |
Switched from I'm sure some of the doc comments could be improved... feedback welcome! |
@dabrahams aside from our discussion on |
Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift
Outdated
Show resolved
Hide resolved
Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift
Outdated
Show resolved
Hide resolved
var positiveSelf = self | ||
if positiveSelf < 0 { | ||
positiveSelf *= -1 // Workaround for lack of `Swift.abs` on `BinaryInteger`. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a weird definition of "smaller" that apparently(?) from reading the code implies absolute value.
Why not just have the array be empty when self is <= 0?
Why not return PositiveCoprimes.SubSequence?
How can this possibly even compile? .prefix(…)
is supposed to return a SubSequence, i.e. slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just have the array be empty when self is <= 0?
Yeah, that's reasonable. Fixed.
How can this possibly even compile?
.prefix(…)
is supposed to return a SubSequence, i.e. slice.
As it turns out, IIUC it's calling Sequence.prefix(while:)
As for why not return PositiveCoprimes.SubSequence
, the concern I have is that this will cause us to traverse the collection at least twice: once to find the endIndex
, and then a second time to access the elements. Of course, we could make this be a LazyPrefixWhileSequence<SubSequence>
, but this forces closure allocation.
The final option would be to do something quite fancy with the SubSequence
type. My preference is to avoid getting that fancy without more experience to understand what operations we'd like to support efficiently (as we only get one SubSequence
type).
Co-authored-by: Dave Abrahams <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for the review!
var positiveSelf = self | ||
if positiveSelf < 0 { | ||
positiveSelf *= -1 // Workaround for lack of `Swift.abs` on `BinaryInteger`. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just have the array be empty when self is <= 0?
Yeah, that's reasonable. Fixed.
How can this possibly even compile?
.prefix(…)
is supposed to return a SubSequence, i.e. slice.
As it turns out, IIUC it's calling Sequence.prefix(while:)
As for why not return PositiveCoprimes.SubSequence
, the concern I have is that this will cause us to traverse the collection at least twice: once to find the endIndex
, and then a second time to access the elements. Of course, we could make this be a LazyPrefixWhileSequence<SubSequence>
, but this forces closure allocation.
The final option would be to do something quite fancy with the SubSequence
type. My preference is to avoid getting that fancy without more experience to understand what operations we'd like to support efficiently (as we only get one SubSequence
type).
Whoops, looks like your additional comments raced with me fixing & pushing the fixes! PTAL? |
Forces allocation? Only if there's a capture and only if you store the thing somewhere such that the closure can't be inlined. |
Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift
Outdated
Show resolved
Hide resolved
Sources/PenguinParallel/NonblockingThreadPool/NonBlockingThreadPool.swift
Outdated
Show resolved
Hide resolved
/// Computes the next index after `index`. | ||
public func index(after index: Index) -> Index { | ||
var nextCandidate = index | ||
while true { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do this without a raw loop; use a range and index(where:)
Co-authored-by: Dave Abrahams <[email protected]>
|
||
/// `Int.max`, as there are infinitely many prime numbers, and thus infinitely many coprimes | ||
/// to a given target. | ||
public var count: Int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: let the default implementation handle this, or do some fancy tricks to actually compute what the numbers of values are, given Domain
. (Note: this is tricky, due to not being restricted to BinaryInteger
and not FixedWidthInteger
.)
import PenguinStructures | ||
import XCTest | ||
|
||
final class NumberOperationsTests: XCTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Add collection tests.
Thank you @dabrahams for the suggestions!