Channels, fibers, oh my! #241

Merged: 64 commits merged into main on Oct 24, 2022
Conversation

@MarcelGarus (Member) commented Sep 11, 2022:

This PR adds channels, fibers, try, and environments.

Here are all the new features:

  • core.try { ... } catches panics of the closure and instead returns a result, either [Ok, returnValue] or [Error, panicReason].

  • core.parallel { nursery -> ... } spawns a new parallel scope, in which multiple fibers may run concurrently, which brings us to the next feature:

  • core.async nursery { ... } spawns a new fiber and returns a future (which is just a receive port). core.await someFuture waits for its result. In a way, core.async and core.await are the opposite of each other (core.await (core.async nursery foo) == foo).
    Note that panics still propagate upwards – a panicking fiber that was spawned on a nursery also causes the surrounding core.parallel to panic. All other fibers spawned on that nursery get canceled (they are not further executed).

  • The candy run command now looks for an exported main function and calls it with an environment.

Here's everything in action:

main environment =
  print message = core.channel.send environment.stdout message

  core.parallel { nursery ->
    core.async nursery { print "Hello, world!" }
    core.async nursery { print "Hallo, Welt!" }
    core.async nursery { print "Hola, mundo!" }

    four = core.async nursery { 4 }
    core.await four
  }

# Executing this prints that the main function returned `4`.
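Candy's core.try isn't Rust, but its semantics map closely onto std::panic::catch_unwind. Here's a rough Rust analogue, a sketch only: try_run is a made-up name, and the payload-to-string conversion is one possible way to recover a panic reason:

```rust
use std::panic;

// Rough analogue of `core.try`: run a closure, catch a panic, and
// return either the value or the panic reason as a Result.
fn try_run<T>(f: impl FnOnce() -> T + panic::UnwindSafe) -> Result<T, String> {
    panic::catch_unwind(f).map_err(|payload| {
        payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "unknown panic".to_string())
    })
}

fn main() {
    // Silence the default panic hook so caught panics don't print.
    panic::set_hook(Box::new(|_| {}));

    assert_eq!(try_run(|| 4), Ok(4));
    assert_eq!(try_run::<i32>(|| panic!("boom")), Err("boom".to_string()));
}
```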

Implementation

Although we already spoke about my initial idea in person, I completely changed my approach since then.

Initially, I planned on implementing a fiber tree using an actual tree of Rust values. However, I didn't manage to implement that in a reasonable amount of time (I tried for ~20 hours total). Instead, I ended up with a very simple approach, inspired by the implementation of BEAM (the Erlang VM). In this approach, there's no actual tree of structs. Rather, the VM maintains a flat list of fibers as well as a list of channels.

The main advantage of this new approach is the simplicity of the implementation. In particular, there's less accidental complexity that arises from organizing channels and fibers in a tree. For example, no migrating of channels between different subtrees is necessary because all channels are global to the VM anyway. This especially reduces the complexity in Rust, where ownership can otherwise make things a lot more complicated.

Looking forward, this architecture also enables straightforward parallelization. We don't have to lock subtrees of fibers, but instead, we can have a pool of worker threads that try to work on the fibers, for example, by choosing a random fiber from the list, and locking it (if no one else is working on it already).
A more advanced approach (also implemented by BEAM) is to assign fibers to a particular worker thread (similar to the affinity of OS threads), because this improves the cache hit rate (L1 caches are local per core). This is often combined with work stealing. I'm sure we can implement something similar for our VM.

Performance

Entering parallel scopes and creating new fibers is often very slow. The reason is that if core is used inside a fiber, the whole core struct is copied to the new fiber. This is not an exponential runtime blowup like the value issue we had previously, but fibers are definitely not cheap.

In the future, several optimizations can improve the runtime:

  • Compile-time evaluation and inlining can reduce the amount of code cloned to fibers. For example, a core.int.add may be reduced to ✨.intAdd, so that only that builtin needs to be copied to the fiber.
  • Constant heaps can allow code to be shared. Compile-time evaluated values don't need to be manually constructed using instructions like createInt and createStruct. Rather, we could have a separate heap area that contains constant objects. Several other compilers also use this technique (e.g. Lua, Wren). The objects in that heap could have a special header that indicates they are not reference-counted, so dup and drop instructions on those objects are a no-op.
  • Share heaps? Sometimes? It may sometimes make sense to share heaps between multiple fibers. This makes every reference count operation a lot more expensive (they need to be atomic, synchronized between CPU caches, etc.), but if we access very few, random-like bits of a big data structure, this may make sense. For the common case of fibers that work on mostly independent tasks, this handling would only slow them down though. So it depends a lot on the access pattern of fibers.
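The constant-heap idea from the second bullet could look roughly like this. All names (HeapArea, Header, drop_ref) are hypothetical; the point is that objects marked as constant turn dup and drop into no-ops:

```rust
// Objects in the constant heap area are never reference-counted.
#[derive(Clone, Copy, PartialEq)]
enum HeapArea { Dynamic, Constant }

struct Header { area: HeapArea, ref_count: usize }

impl Header {
    fn dup(&mut self) {
        if self.area == HeapArea::Dynamic {
            self.ref_count += 1;
        }
        // Constant objects live for the whole program: no count to maintain.
    }

    // Returns true if the object should be freed.
    fn drop_ref(&mut self) -> bool {
        match self.area {
            HeapArea::Constant => false, // never freed
            HeapArea::Dynamic => {
                self.ref_count -= 1;
                self.ref_count == 0
            }
        }
    }
}

fn main() {
    let mut constant = Header { area: HeapArea::Constant, ref_count: 1 };
    constant.dup();
    assert!(!constant.drop_ref()); // dup/drop are no-ops on constants

    let mut dynamic = Header { area: HeapArea::Dynamic, ref_count: 1 };
    dynamic.dup();
    assert!(!dynamic.drop_ref()); // 2 -> 1: still alive
    assert!(dynamic.drop_ref());  // 1 -> 0: last reference, free it
}
```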

@MarcelGarus (Member, Author) commented:

Okay, I think my todos are done. The only unpolished part is the tracer. It's not worse than before, but it's far from finished: for now, the VM's tear_down method only returns the tracer of the outermost fiber, and all the inner ones are ignored/dropped.

The main reason is that its whole architecture relied on being owned by a fiber (for example, it only stores pointers because it doesn't have its own heap).

I also noticed that the way we use the tracer for determining whose fault a panic is isn't quite correct all the time. In particular, we need to handle cases where needs outlive their call sites:

tripleIntGenerator intGenerator =
  safeIntGenerator = {
    result = call intGenerator
    needs (isInt result) "tripleIntGenerator only accepts closures that always return ints"
    result
  }
  { multiply 3 (safeIntGenerator call) }

intGenerator = { "Bad return value" }
foo = tripleIntGenerator intGenerator
bar = foo call

Here, tripleIntGenerator takes an int-producing closure and returns another closure that always produces thrice as much. Because the needs is defined in an internal closure, if it fails, that's the fault of the caller of tripleIntGenerator, not the caller of the closure. As a consequence, the assignment of bar panics. However, we currently can't determine that the fault lies in the line where we create foo (because that's the part of the code that calls tripleIntGenerator with a non-int closure). Ideally, we would highlight that line in the editor.

I have ideas on how to solve this but I'd rather do that in a separate PR because it requires more work, and this PR is already quite large. In the next PR, I'll separate the tracer into two parts:

  • A fault analyzing component owned by the fibers. Essentially, fibers would have an additional fault stack (next to the data, call, and import stacks) which contains places of code that can be at fault (I assume the HIR IDs of calls). The top-most item is the one that's at fault if a need panics. Closures would also capture the top value of that stack and when executed, put that on the stack again and remove it once they're done. This would (finally) allow accurately and explicitly finding out who's at fault when code crashes.

  • The tracing part, which would be moved outside the fibers and only lent to running fibers. Semantically, that better fits the perspective that fibers report to some service as a side effect, simplifying fibers themselves to state machines containing only essential complexity. Because the VM would own the tracer and provide it to fibers, the VM can also report other events, such as fibers being entered and exited during scheduling, or channel operations.
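The fault stack from the first bullet could be sketched like this (all names hypothetical; HIR IDs are stand-in strings). Closures capture the top of the stack at creation and re-push it when entered, so the blame for a failing needs points at the closure's creation site, as in the tripleIntGenerator example above:

```rust
type HirId = &'static str;

struct Fiber { fault_stack: Vec<HirId> }

struct Closure { captured_fault: Option<HirId> }

impl Fiber {
    // Entering/leaving a call pushes/pops the call site that would be
    // at fault if a `needs` panics.
    fn call(&mut self, site: HirId) { self.fault_stack.push(site); }
    fn ret(&mut self) { self.fault_stack.pop(); }

    // Closures capture the top fault entry at creation time...
    fn create_closure(&self) -> Closure {
        Closure { captured_fault: self.fault_stack.last().copied() }
    }

    // ...and re-push it while they run.
    fn enter_closure(&mut self, closure: &Closure) {
        if let Some(site) = closure.captured_fault {
            self.fault_stack.push(site);
        }
    }

    // Who is at fault if a `needs` panics right now?
    fn blame(&self) -> Option<HirId> { self.fault_stack.last().copied() }
}

fn main() {
    let mut fiber = Fiber { fault_stack: vec![] };

    // `foo = tripleIntGenerator intGenerator` creates safeIntGenerator.
    fiber.call("foo = tripleIntGenerator intGenerator");
    let safe_int_generator = fiber.create_closure();
    fiber.ret();

    // Later, `bar = foo call` ends up invoking safeIntGenerator.
    fiber.call("bar = foo call");
    fiber.enter_closure(&safe_int_generator);

    // The `needs` fails here; blame points at the closure's creation site.
    assert_eq!(fiber.blame(), Some("foo = tripleIntGenerator intGenerator"));
}
```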

@MarcelGarus MarcelGarus marked this pull request as ready for review October 3, 2022 20:22
Review comments (now resolved) on: README.md, compiler/src/vm/channel.rs, compiler/src/main.rs, compiler/src/vm/tracer.rs
@MarcelGarus (Member, Author) commented:

Some comments are about how the performing_fiber relates to the outside world. To be honest, until now I didn't spend much time thinking about how the outside world can operate on internal channels. I believe it makes sense to support both pull-based and push-based use cases:

  • External channels are managed in the outside world and are pull-based. The outside world can lazily generate values once Receive operations happen on those channels.
  • Internal channels might reach the outside world over other channels.
    We probably want this. For example, to interact with external services (such as file systems) on a request-response basis, one send and one receive channel are not enough: The send channel might be given to multiple pieces of code that make requests concurrently. They only want to receive responses to their request. A shared response channel makes that impossible – you'd always be in danger of some other fiber receiving your response.
    For those cases, it instead makes sense to create a new one-off channel for each request and send the send port to the outside. This gives the outside world the opportunity to notify that piece of code specifically.

Essentially, this means that the outside world also needs a way of sending data to and receiving data from internal channels. One idea would be to change the performing_fiber that's used extensively in the code from an Option<FiberId> to a custom enum, something like this:

enum Performer {
  Fiber(FiberId),
  Extern(ExternalOperationId),
  Nursery,
}

This way, the outside world could call receive_from_internal_channel (or something similar) on a VM to get an ExternalOperationId. The VM could then also publish the results of those operations.

I'm not too fond of the name Performer by the way. (Maybe Issuer because those are the entities that issued the operations? Or OperationSource?) If you have suggestions, let me hear them.

@MarcelGarus (Member, Author) commented:

On third thought, maybe it would be even easier if we had no extra concept of external channels at all. Instead, the outside world would always use "internal" channels as well. If we want pull-based behavior, we can still do that using a channel with a capacity of 0.

@JonasWanke (Member) commented:

Regarding one send and one receive channel for communication with the outside world: That would work if we add an extra layer inside Candy that takes care of multiplexing and demultiplexing the messages. Basically the job that would otherwise fall to the Rust code.

Using the “internal” channels for communication with the outside world as well sounds good, since that should simplify the code while still allowing us to accomplish the same thing.

@JonasWanke JonasWanke added the T: Feature Type: New Features label Oct 17, 2022
@MarcelGarus MarcelGarus mentioned this pull request Oct 17, 2022
@MarcelGarus (Member, Author) commented Oct 17, 2022:

(Although I already requested a review before, now there's really nothing more I'd change.)
Edit: Except for fixing the Clippy lints, apparently.

Further review comments (now resolved) on: compiler/src/vm/mod.rs, compiler/src/main.rs, packages/Core/concurrency.candy
Co-Authored-By: Jonas Wanke <[email protected]>
@MarcelGarus MarcelGarus merged commit e8c7eb0 into main Oct 24, 2022
@MarcelGarus MarcelGarus deleted the channels branch October 24, 2022 08:59
Labels: P: Compiler: Frontend (Package: The compiler frontend), P: Core (Package: Candy's standard library), T: Feature (Type: New Features)
3 participants