Methods in Normal and StudentsT should either use sigma or not, they shouldn't conditionally use sigma depending on if it has been set. The current code races with setSigma(). Fixing it with a mutex is not strictly a data race, but it does mean the behavior of the program depends on execution order. Remove this behavior entirely by just computing the relevant entry of the covariance matrix.
Fixes #163
PTAL @kortschak. This fixes the race while also removing complexity in the implementation.
I'm not convinced (I am opposed); I don't believe there is a race if the type is used correctly, i.e. with a lock provided by the user. What is the performance penalty here in the case with
The performance penalty is

If we want to keep that code path behavior, I strongly believe we should do it interior to the struct and not force the user to do the synchronization. It's difficult to know which methods need the covariance matrix and which do not. For example, it's likely somewhat surprising (to the average user) that you can find the marginal at all without needing to compute the covariance matrix. Having the synchronization within the struct means all methods can be called in parallel; having the synchronization outside the struct means there is a complicated graph of which methods are okay to call in parallel with one another.
Yes, but in practice, what does that measure as? The issue is whether the O(n) is in practice worse than locking or other checks. I just did an experiment on removing the
My position on the race behaviour is that it isn't a race unless you try hard for it to be one - I don't yet see an argument that changes my mind on this. If there were a significant benefit from being able to call all the methods in parallel, then maybe, but that does seem rather unlikely to me. Can you explain the magnitude of the benefit from being able to do that? At the moment, it is simple and fast. With locking it may be simple, but slower for most cases with (from a quick survey) a reasonably non-trivial locking scheme. The scheme would use a
You can see here that most paths hit the slower lock path (ignoring contention, which I would guess is likely to be rare). We could put in a call to
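For concreteness, the kind of lazy, once-guarded scheme being weighed here might look roughly like the following sketch (all field and method names are hypothetical; this is not the actual implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// lazyNormal sketches the locking scheme under discussion: sigma is
// reconstructed from the Cholesky factor at most once, on first use,
// guarded by a sync.Once. Names here are hypothetical.
type lazyNormal struct {
	chol [][]float64 // upper-triangular factor U, with Sigma = Uᵀ U

	once  sync.Once
	sigma [][]float64 // lazily reconstructed covariance
}

// Sigma returns the full covariance, computing it on first call.
// Every caller, even after initialization, pays the sync.Once
// fast-path check — this is the "slower lock path" referred to above.
func (n *lazyNormal) Sigma() [][]float64 {
	n.once.Do(func() {
		d := len(n.chol)
		n.sigma = make([][]float64, d)
		for i := range n.sigma {
			n.sigma[i] = make([]float64, d)
			for j := range n.sigma[i] {
				m := i
				if j < m {
					m = j
				}
				for k := 0; k <= m; k++ {
					n.sigma[i][j] += n.chol[k][i] * n.chol[k][j]
				}
			}
		}
	})
	return n.sigma
}

func main() {
	n := &lazyNormal{chol: [][]float64{{2, 1}, {0, 1}}}
	// Concurrent callers are safe: sync.Once serializes the single write.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); _ = n.Sigma() }()
	}
	wg.Wait()
	fmt.Println(n.Sigma())
}
```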
So, first of all, I discovered this race because it came up in a real program of mine. I didn't try "hard" to see it; I was just calling the methods in parallel because we had designed the methods to be callable in parallel (or so I thought). The actual code I have is complicated, but in short,
I then today changed my code to do each dimension in parallel. Each dimension is independent, so this should be a linear speedup (in particular, note that I'm not generating random numbers, so there's no use of

Assuming the sync.Once is removed, I have two options:
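The per-dimension pattern described above might be sketched as follows (the names are illustrative stand-ins, not the actual code). The point is that this loop is only safe if every method it calls is read-only or internally synchronized:

```go
package main

import (
	"fmt"
	"sync"
)

// parallelApply sketches the per-dimension pattern: each output
// element is computed independently in its own goroutine, with f
// doing only reads of shared state. No two goroutines write to the
// same element, so no locking is needed — provided f has no hidden
// writes (such as lazily reconstructing a covariance matrix).
func parallelApply(f func(float64) float64, in []float64) []float64 {
	out := make([]float64, len(in))
	var wg sync.WaitGroup
	for i, v := range in {
		wg.Add(1)
		go func(i int, v float64) {
			defer wg.Done()
			out[i] = f(v) // independent per dimension
		}(i, v)
	}
	wg.Wait()
	return out
}

func main() {
	// Hypothetical stand-in for a per-dimension marginal computation.
	double := func(m float64) float64 { return 2 * m }
	fmt.Println(parallelApply(double, []float64{1, 2, 3, 4}))
}
```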
If we really want the synchronization to go away, then I think we have two options
The third option for your use case is to cause

I think there is a third option: keep the behaviour we have now (but with the
Why don't we just keep

If it happens to be in the (quite specific) case where 1) you have the Cholesky and 2) you can update it without reconstructing the covariance, then you can just implement those particular cases themselves. I suspect the most common operation in that case is
What do you mean by keep

I don't have an example, but this seems pretty reasonable: https://github.com/gonum/stat/tree/setsigma. What do you think?
Yes, I mean retaining the original

The main advantages are implementation simplification (i.e. this discussion), no "gotchas" with concurrency (you don't have to call SetSigma first), saved computational cost (you don't have to recompute sigma), and no loss of precision from reconstructing sigma (for example, if the condition number of the original matrix was poor). Exposing SetSigma is okay, but I don't like that it adds caveats to writing correct concurrent code as opposed to correct serial code. My past exposure to methods like that has not been positive, because I've found they tend to compose poorly with interfaces. At that point, you start calling SetSigma just in case, at which point we may as well just store sigma in the first place.
I can provide an example of it composing poorly with interfaces if you'd like |
Yes please. While there are fewer gotchas with concurrency in one direction, there is the caveat that must be adhered to that the
Sorry, my mistake. I don't mean retaining the original
And I additionally mean constructing a new one in the case of
The main issue with interfaces is that concurrency is built from the bottom up. If any piece of the puzzle cannot be called concurrently, then the whole routine cannot be called concurrently. In contrast, if one is careful to make sure that each piece works well in a concurrent environment, then it's easy to throw a loop around the last part and gain big parallel speedups. The lower-level the break in concurrency, the larger the gymnastics necessary to work around it. All of Gonum's components work well with concurrency, and so it's really easy to make Gonum code concurrent. As an extreme contrast, a Python package I interact with creates temporary files, so it's not even clear to me that I can launch multiple executions of it with different parameters. In my particular case, my code consists of a number of packages which I call in different ways. The fact that there are these multiple packages is important for me to manage the complexity of everything. Two of the fundamental interfaces are
There are shell types which easily wrap

When writing the wrapper type for

I don't think this is specific to my code. I want to enable concurrency, and so concurrency must be kept in mind at each level. It seems, thus, that any time someone wants to generate a

At that point, we should just call SetSigma ourselves to reduce future errors, or we should find a workaround (such as
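The composition problem can be illustrated with a hypothetical interface of the kind being described, whose concurrency contract lives only in documentation (everything below is invented for the sketch, not taken from the actual packages):

```go
package main

import (
	"fmt"
	"sync"
)

// Distribution is a hypothetical interface. Its contract, carried
// only by documentation, is that implementations must be safe for
// concurrent use.
type Distribution interface {
	// Prob returns the probability density at x. Implementations
	// must be safe for concurrent use.
	Prob(x []float64) float64
}

// caller is a higher-level routine written against that contract:
// it fans Prob calls out across goroutines. If any implementation
// lazily mutates internal state on first use (e.g. reconstructing
// sigma), this loop races even though caller itself is correct —
// concurrency breaks from the bottom up.
func caller(d Distribution, xs [][]float64) []float64 {
	out := make([]float64, len(xs))
	var wg sync.WaitGroup
	for i, x := range xs {
		wg.Add(1)
		go func(i int, x []float64) {
			defer wg.Done()
			out[i] = d.Prob(x)
		}(i, x)
	}
	wg.Wait()
	return out
}

// uniform is a trivial state-free implementation that honours the
// contract: Prob performs no writes, so concurrent calls are safe.
type uniform struct{ volume float64 }

func (u uniform) Prob(x []float64) float64 { return 1 / u.volume }

func main() {
	fmt.Println(caller(uniform{volume: 4}, [][]float64{{0, 0}, {1, 1}}))
}
```

Nothing in the type system distinguishes uniform from an implementation with a hidden lazy write, which is exactly why the caveat has to be remembered at every level of wrapping.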
Interfaces are defined not just by the methods that they implement, but by the documentation that goes with them. The implied documentation that you have for the interfaces above is that the types being used are safe for concurrent use. This is not true for
Only if they want to perform the operations above concurrently. The only requirement is that people think about the work that is being done and that we provide adequate documentation about what should be done in particular circumstances.
The synchronisation approaches are fraught or baroque; I would like them to go away here. I remain convinced that this is less of a problem than you make out, but I think you care about it more than I do, so I will relent and ask that we go with copying the passed-in sigma in the case that it is provided, and generating it at construction if given a Cholesky decomposition. The arguments that convince me are less the issues with concurrency and more the issues relating to numerical stability in the case that an exact sigma has been provided but may be degraded on reconstruction.
As one last thing: I agree with you that interfaces are defined by documentation and methods. I think the real issue here is not about concurrency at all, but about side effects. In general, functions and methods should not have visible side effects unless it is necessary for them to have those side effects. Non-obvious side effects are hard to remember, and so easily cause bugs; they also increase the entanglement in the "dependency graph", which good code keeps as disentangled as possible. My argument isn't* so much that all methods on all types should be able to be called in parallel (the types in

Closing to open the pull request with the intended fix.

*isn't any more. It was, and I still think we should enable concurrency as much as practical, but I've come to a better understanding of the heart of the issue.
Fair enough. The side effects, though, are only visible in a concurrent access regime. The synchronisation types themselves are not baroque, but the complexity of the synchronisation actions required for this particular system would be. Synchronisation is just a thing; when you have multiple competing accesses, the collection of those things becomes difficult to reason about.
Agreed on all counts, with the addition that synchronization becomes much more difficult when there are many side effects, and effects that are only visible concurrently.