
New channels implementation for ORC #17305

Merged · 58 commits into nim-lang:devel · Mar 12, 2021

Conversation

ringabout
Member

@ringabout ringabout commented Mar 9, 2021

Thanks @mratsim !

The new implementation is mostly based on https://github.com/mratsim/weave/blob/5696d94e6358711e840f8c0b7c684fcc5cbd4472/unused/channels/channels_legacy.nim.

So I have kept the license and the link.

Related article:
nim-lang/website#274

@ringabout ringabout marked this pull request as ready for review March 9, 2021 08:06
@ringabout ringabout requested a review from timotheecour March 9, 2021 08:07
@ringabout ringabout requested a review from mratsim March 11, 2021 15:25
@ringabout ringabout assigned ringabout and unassigned ringabout Mar 11, 2021
@ringabout ringabout marked this pull request as draft March 11, 2021 15:26
@ringabout ringabout marked this pull request as ready for review March 11, 2021 16:20

func peek*[T](c: Channel[T]): int {.inline.} = peek(c.d)

proc newChannel*[T](elements = 30): Channel[T] =
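
For context, a hypothetical usage sketch of the API above (`send` and `recv` are assumed names, inferred from the `trySend` mentioned later in this thread; only `peek` and `newChannel` appear in the diff):

```nim
# Hypothetical usage sketch of the proposed std/channels API.
# `send`/`recv` are assumptions; only `peek` and `newChannel` are shown above.
import std/channels

var chan = newChannel[int](elements = 30)  # buffered channel, capacity 30

chan.send(1)
chan.send(2)
doAssert chan.peek == 2    # two messages currently queued
doAssert chan.recv() == 1  # FIFO order
doAssert chan.recv() == 2
```

Because the channel is buffered, the two sends above cannot block; an unbuffered channel would instead rendez-vous with a receiver.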
Member

@timotheecour timotheecour Mar 12, 2021


  • document that elements = 1 will cause the channel to be unbuffered
  • the magic number 30 sounds arbitrary; how about making the default unbuffered instead (and using an impossible value, say -1, to denote that the channel is unbuffered)?
  • isn't a typedesc param more common nowadays? [T] is mostly useful for implicit generic instantiation
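
For illustration, the last two suggestions could look like the sketch below; this is not the merged API, and the `Channel` object here is a minimal stand-in, not the PR's actual type:

```nim
# Sketch of the review suggestions (sentinel default + typedesc parameter).
# Channel is a minimal stand-in so the signature can be demonstrated.
type Channel[T] = object
  capacity: int   # Unbuffered (-1) denotes a rendez-vous channel

const Unbuffered = -1

proc newChannel(T: typedesc, elements = Unbuffered): Channel[T] =
  ## elements = Unbuffered creates an unbuffered channel;
  ## a positive value creates a buffered channel of that capacity.
  Channel[T](capacity: elements)

let a = newChannel(int)                    # unbuffered by default
let b = newChannel(string, elements = 64)  # buffered, capacity 64
doAssert a.capacity == Unbuffered
doAssert b.capacity == 64
```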

Member Author


Makes sense.

@timotheecour
Copy link
Member

/cc @mratsim
why are we using something like a BoundedQueue (refs https://github.com/mratsim/weave/blob/2e528bd2a6a04306fd029f04a0af87fa201e5d89/weave/datatypes/bounded_queues.nim#L42) instead of a std/deques?

with the current API, you'd need to know the capacity ahead of time, which is really hard in practice: we have an impossible dilemma to solve:

  • either use a capacity that's too large (over-provisions, wastes memory)
  • or use a capacity that's too small (under-provisions, causes waits when the buffer fills up, hurting performance)

with std/deques, you avoid this impossible dilemma and let the buffer grow as needed. We could improve on this by adding a maximum cap on the deque's underlying buffer, but that's a minor addition to an existing module.

Furthermore, even if a BoundedQueue somehow were better, IMO that part could be factored out and reused; it'd be useful beyond the context of std/channels

Member

@timotheecour timotheecour left a comment


LGTM to make progress, with no expectation of API stability at this point.

I'm really curious whether we should instead use std/deques, and about a few other points, but those can be addressed in follow-up work

@Araq
Member

Araq commented Mar 12, 2021

with std/queues, you avoid this impossible dilemma and let the buffer grow as needed. We could improve this by adding a maximum cap on the deques underlying buffer but that's a minor addition to an existing module.

In my experience this is a far better design than unbounded queues, which tend to consume way too much memory and lose the flow-control aspect of bounded queues. Golang also uses bounded queues, as far as I know, and has never revisited this design decision.

@mratsim
Collaborator

mratsim commented Mar 12, 2021

/cc @mratsim
why are we using something like a BoundedQueue (refs https://github.com/mratsim/weave/blob/2e528bd2a6a04306fd029f04a0af87fa201e5d89/weave/datatypes/bounded_queues.nim#L42) instead of a std/deques?

with the current API, you'd need to know the capacity ahead of time, which is really hard in practice: we have an impossible dilemma to solve:

* either use a capacity that's too large (over-provisions, wastes memory)

* or use a capacity that's too small (under-provisions, causes waits when the buffer fills up, hurting performance)

with std/deques, you avoid this impossible dilemma and let the buffer grow as needed. We could improve on this by adding a maximum cap on the deque's underlying buffer, but that's a minor addition to an existing module.

Furthermore, even if a BoundedQueue somehow were better, IMO that part could be factored out and reused; it'd be useful beyond the context of std/channels

There are multiple reasons:

  1. Very often, either the producer is much faster than the consumer, or the consumer is much faster than the producer.
    There are 3 ways to handle the first case at the channel level: buffering, blocking, or dropping.

    • Buffering is the unbounded-queue solution you propose; remember that ultimately it is still bounded by your memory
    • Dropping would be a ring buffer, with the oldest message being overwritten
    • Blocking is the bounded queue

    Additionally, the producer can throttle itself and adapt its rate of messages, which assumes it has some feedback about the consumer's consumption speed.

    I think we can all agree that dropping is too niche for a standard library solution which leaves discussing buffering or blocking.

    Buffering is an out-of-memory error waiting to happen, with no recourse for the producer to detect that it's too fast. People need to think about their buffer size in a multithreading scenario, and bounded queues are needed for backpressure management. This allows producers to adapt and potentially throttle themselves when the consumer is overloaded. Usually, messages come from IO (say, networking) and are passed via channels to a compute thread; blocking channels are the way for the compute thread to communicate that it's overloaded and that messages cannot be handled, so an alternative should be found (maybe throttle the source, maybe deactivate non-essential services/computations, maybe ban the sender because it's a DoS, ...).

    On the other hand, you can easily emulate an unbounded channel from a bounded channel by using trySend and, if the channel is full, putting the data in your own buffer in front of the channel. It's also quite possible that the buffering shouldn't happen in the channel, or just in front of it, but further upstream in the "offending" producer: multiple producers can enqueue messages into the channel and only one may misbehave, or some should have higher priority than others in case of congestion.

  2. You can't use the two-lock solution with unbounded channels because, on reaching the bound, the producer must lock the consumer out while resizing. Hence you get higher lock contention and worse performance.

  3. When you need unbounded channels, you usually use list-based channels.
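
Point 1's composition idea (emulating an unbounded channel by buffering in front of a bounded one) can be sketched as follows. To keep the example self-contained, the bounded channel is modeled by a capped Deque, with a hypothetical `tryPush` standing in for the channel's `trySend`:

```nim
import std/deques

# Stand-in for a bounded channel; `tryPush` plays the role of `trySend`.
type Bounded[T] = object
  buf: Deque[T]
  cap: int

proc tryPush[T](c: var Bounded[T], msg: T): bool =
  if c.buf.len >= c.cap:
    return false   # channel full: the caller must deal with the message
  c.buf.addLast(msg)
  result = true

# Producer-side overflow buffer that turns the bounded channel into an
# effectively unbounded one (bounded only by memory, as noted above).
type OverflowSender[T] = object
  pending: Deque[T]

proc send[T](s: var OverflowSender[T], c: var Bounded[T], msg: T) =
  # Drain previously buffered messages first, to preserve FIFO order.
  while s.pending.len > 0 and c.tryPush(s.pending.peekFirst()):
    discard s.pending.popFirst()
  # Buffer the new message locally instead of blocking when full.
  if s.pending.len > 0 or not c.tryPush(msg):
    s.pending.addLast(msg)

var chan = Bounded[int](buf: initDeque[int](), cap: 2)
var sender = OverflowSender[int](pending: initDeque[int]())
for i in 1..5:
  sender.send(chan, i)
doAssert chan.buf.len == 2        # messages 1 and 2 sit in the "channel"
doAssert sender.pending.len == 3  # 3, 4 and 5 wait in the local buffer
```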

References on backpressure:

@timotheecour
Member

timotheecour commented Mar 12, 2021

you're comparing a BoundedQueue vs an unbounded std/deques, but I was referring to std/deques with a max capacity, beyond which you'd fall back to essentially the same behavior as a BoundedQueue.

  • the initial size parameter becomes irrelevant and doesn't even need to be a user parameter; it could be set to a fixed small value (say 0) as the initial queue capacity
  • the capacity of the deque keeps doubling (as std/deques already does) as needed, until the max capacity is reached
  • then it falls back to the BoundedQueue behavior

This gives the best of both worlds:

  • no risk of over- or under-provisioning
  • no unbounded memory use; in fact, the same memory use as a BoundedQueue in the worst case. The max capacity parameter (the only parameter) is not very sensitive, since if you over-provision it, you won't pay the memory cost unless you actually need it (e.g. the producer is faster than the consumer)
  • as a future refinement, it could also shrink using the standard algorithm (two hysteresis thresholds), which can be useful for adapting dynamically while keeping resource usage proportional to current needs in long-running scenarios

@Araq
Member

Araq commented Mar 12, 2021

We don't have an implementation of your "best of both worlds", though, and it's not clear how it would perform.

@mratsim
Collaborator

mratsim commented Mar 12, 2021

std/deques does over-provision because it requires a power-of-2 size, so if you use 33 as your capacity, you will allocate 64 slots and pay for that cost. The channels proposed here allocate exactly what the user asks for and provide excellent performance and overhead characteristics.

Now, the std/deques implementation could be improved to use this PR's backend implementation, but that's for a separate PR, I think.

For specialized needs, the producer can compose the base channel with a buffer in front of it that provides the extra leeway needed; they can choose a small channel and implement a buffer with the hysteresis strategy you mention, as an example.

In general, all the dynamism and complex logic should live outside the base channel.
Alternatively, they might need to write their own channels, but it's easier to swap in a simple channel and reuse their additional logic than to reimplement everything they built on top of some channel.

I've mentioned it in the channels API RFC nim-lang/RFCs#347 (comment).
There is no "2 worlds" to get the best of regarding channels: we should provide a good default that composes well, because it is impossible to write a channel that satisfies all the potential expectations of a channel:

  • single-producer or multi-producer
  • single-consumer or multi-consumer
  • list-based or array-based or hybrid
    • If list-based:
      • intrusive linked list or non-intrusive
  • bounded or unbounded
    • if bounded:
      • Errors on reaching bounds
      • Blocks on reaching bounds
      • Overwrite the oldest item on reaching bounds (ring-buffer)
  • lock-based or atomic-based
  • progress guarantees:
    • no guarantees (whether using locks or atomics)
    • obstruction-free
    • lockfree
    • waitfree
  • priorities support
  • item ordering guarantees
    • no guarantees
    • serializability
    • linearizability
  • Buffered or unbuffered (rendez-vous)

@Araq
Member

Araq commented Mar 12, 2021

is there anything preventing std/channels from being usable with --gc:refc (not necessarily now, but in future work)?
at least for a subset of the API, or for a subset of the types being sent/received

I think there is no good way to make these work with the old GCs. We will eventually make --gc:orc the default for a reason. ;-)

if so, then the APIs in system/channels could eventually be deprecated and removed

That's true.

@timotheecour
Member

timotheecour commented Mar 12, 2021

std/deques does over-provision because it requires a power-of-2 size, so if you use 33 as your capacity, you will allocate 64 slots and pay for that cost

it doesn't over-provision; you do as follows:
given a user-provided max capacity C (C = 33 in your case):
you double the capacity as needed (starting from 1) until it reaches the largest power of 2 <= C (i.e. 32 in your case), and when this fills up, you increase the length one final time to C (= 33) and switch to the BoundedQueue regime.

so yes, you do get the best of both worlds: no over-provisioning, and you pay only for what you actually use instead of for the max capacity.

Performance-wise, it's at least as good as a BoundedQueue.
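
The growth schedule just described can be made concrete with a small sketch (a hypothetical helper, not part of any module):

```nim
# Capacities a deque would pass through for a user-provided max capacity C:
# double from 1 up to the largest power of two <= C, then jump to C itself.
proc growthSchedule(C: int): seq[int] =
  var cap = 1
  while cap * 2 <= C:
    result.add cap
    cap *= 2
  result.add cap     # largest power of two <= C
  if cap != C:
    result.add C     # one final resize to the exact bound

doAssert growthSchedule(33) == @[1, 2, 4, 8, 16, 32, 33]
doAssert growthSchedule(32) == @[1, 2, 4, 8, 16, 32]
```

Once capacity C is reached, a push that finds the buffer full blocks, i.e. the BoundedQueue regime.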

@Araq
Member

Araq commented Mar 12, 2021

Merging for now, we can refine it later.

@Araq Araq merged commit a0b8a3d into nim-lang:devel Mar 12, 2021
ringabout added a commit to ringabout/Nim that referenced this pull request Mar 22, 2021
* Update lib/std/channels.nim
* Rename tchannel_pthread.nim to tchannels_pthread.nim
* Rename tchannel_simple.nim to tchannels_simple.nim

Co-authored-by: Mamy Ratsimbazafy <[email protected]>
@timotheecour timotheecour mentioned this pull request Mar 22, 2021
ardek66 pushed a commit to ardek66/Nim that referenced this pull request Mar 26, 2021
* Update lib/std/channels.nim
* Rename tchannel_pthread.nim to tchannels_pthread.nim
* Rename tchannel_simple.nim to tchannels_simple.nim

Co-authored-by: Mamy Ratsimbazafy <[email protected]>

5 participants