Distributed Shell

Old: Project Goals

"Toil" for multi-cloud distributed builds: http://www.oilshell.org/blog/2020/11/fixes-and-updates.html#buildssrht-and-toil.
- Results: http://travis-ci.oilshell.org/
- Problems:
- Does too much work (not incremental), and doesn't do it fast enough (not parallel)
  - needs dependencies for both problems!
- In some cases, the framework has more overhead than the work done by the application. We want lightweight distributed processes.
- YAML is a really bad syntax for a shell script.
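On the "needs dependencies for both problems" point above: once each task declares its prerequisites, the same graph says what must be redone (incremental) and what can run at the same time (parallel). A minimal Python sketch, assuming a hypothetical build graph; `plan` and the file names are made up for illustration and are not how Toil works today.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan(deps, changed):
    """Return 'waves' of stale tasks; tasks in one wave can run in parallel.

    deps:    dict mapping each task to the set of tasks it depends on
    changed: set of tasks (e.g. source files) known to have changed
    """
    # Incremental: a task is stale if it changed or depends on something stale.
    stale = set(changed)
    grew = True
    while grew:
        grew = False
        for task, prereqs in deps.items():
            if task not in stale and prereqs & stale:
                stale.add(task)
                grew = True

    # Parallel: everything the sorter reports as ready at once is independent.
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorter.get_ready()
        wave = sorted(t for t in ready if t in stale)
        if wave:
            waves.append(wave)
        sorter.done(*ready)
    return waves


# Hypothetical graph: two objects built from two sources, linked into one binary.
DEPS = {
    "lex.c": set(),
    "parse.c": set(),
    "lex.o": {"lex.c"},
    "parse.o": {"parse.c"},
    "osh": {"lex.o", "parse.o"},
}
```

With only `lex.c` changed, `plan(DEPS, {"lex.c"})` skips `parse.c` and `parse.o` entirely; with both sources changed, the two compiles land in the same wave.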
Pash and Posh are related: https://github.com/oilshell/oil/issues/867
- Actually Posh is from many of the same authors as gg, but doesn't appear to be open source, and is technically unrelated?
- https://www.usenix.org/conference/atc20/presentation/raghavan. Section 6.2 describes the execution engine, which is not similar to gg's. I suppose it is a slightly different problem.

Notes on gg (Ad Hoc Multi-Cloud Distribution with Lambdas)

Great intro blog post, concentrating mostly on the C++ build use case, which indeed has some unique elements: https://buttondown.email/nelhage/archive/papers-i-love-gg/
- reaction: distcc pump is another solution to the preprocessor problem, although neither model substitution nor distcc pump is fully general
Great Usenix ATC '19 Video: https://www.youtube.com/watch?v=Cc_MVldSijA&ab_channel=USENIX
- I really like the framing: low latency (which is why I use shell in the first place), warm vs. cold clusters
- IR is extremely similar to Blaze/Forge (and described with a tiny set of protobufs!)
HN comments from July 2019: https://news.ycombinator.com/item?id=20433315
- Lambda still has some limitations for huge packages. Good experience report here (although it sounds like the commenter could benefit from "proper" declared dependencies)
- What about state in lambdas?
My initial reaction: https://lobste.rs/s/virbxa/papers_i_love_gg#c_nbmnod
Concepts
- Model Substitution
- Tail Calls
- Dynamic dependencies, not static (how does it relate to Shake?)
- Lambdas can talk to each other (via NAT traversal?) Solves a well known performance issue.
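To make the thunk, tail-call, and dynamic-dependency ideas concrete, here is a minimal Python sketch. It is not gg's actual IR or SDK: `Thunk`, `Store`, `force`, and the toy `scan` / `link` functions are names invented for illustration, everything runs in one process rather than on lambdas, and model substitution is not shown.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Thunk:
    """A deferred computation: a function name plus hashes of its inputs."""
    func: str
    inputs: Tuple[str, ...]


class Store:
    """Content-addressed blob store (a dict standing in for cloud storage)."""

    def __init__(self):
        self.blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        h = content_hash(data)
        self.blobs[h] = data
        return h

    def get(self, h: str) -> bytes:
        return self.blobs[h]


def force(thunk, store, funcs, cache):
    """Evaluate a thunk to the hash of its output.

    A function may return bytes (done) or another Thunk (a tail call),
    which is how dependencies discovered at run time get scheduled.
    """
    chain = []
    current = thunk
    while True:
        if current in cache:          # already computed: reuse the result
            out = cache[current]
            break
        chain.append(current)
        args = [store.get(h) for h in current.inputs]
        result = funcs[current.func](store, *args)
        if isinstance(result, Thunk):
            current = result          # tail call: keep going
        else:
            out = store.put(result)
            break
    for t in chain:                   # every thunk in the chain maps to the output
        cache[t] = out
    return out


def make_funcs(header_hashes):
    """Toy functions; header_hashes maps header name -> content hash."""

    def scan(store, source):
        # Dynamic dependency: the headers are only known after reading the source.
        names = [line.split()[1] for line in source.decode().splitlines()
                 if line.startswith("#include ")]
        deps = tuple(header_hashes[n] for n in names)
        return Thunk("link", deps + (content_hash(source),))

    def link(store, *parts):
        return b"\n".join(parts)

    return {"scan": scan, "link": link}
```

Because thunks are keyed by content hashes, forcing the same thunk twice hits the cache instead of recomputing, which is the same property that would make builds incremental.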
Citations
- UCop
- Ciel
My sense on limitations
- It's not a fully general shell parallelizer, because it's mainly about small data and big compute. Some problems are big data and small compute, like analytics (joins, etc.)
Their Notes on Limitations / Future Work
- Worker communication (didn't understand the NAT traversal bit)
- They want to schedule thunks onto GPUs
- A gg DSL! They have a C++ and Python SDK. They say they want "parallel map", "fold", etc. What does this look like?
Questions
- Where does the scheduler run? (on a lambda? Or does the client need to be connected the whole time)
- How does the worker-to-worker communication work?
- What would the DSL look like?

Project Ideas

Well first, try gg to see how well it works...
Really basic:
- Oil can create CLI descriptions for "model substitution"
Second: an Oil front end (on top of model substitution, "scripting", Python, C++). Does that make sense? (That's in their future work: a DSL.)
Run Toil on gg, for better continuous builds!
Does it make sense to augment gg with streams? For shell pipelines?
- dgsh uses Unix domain sockets to implement pipelines
Big project: write an executor that addresses the object distribution problem with differential compression / affinity (e.g. OSTree/casync)
Is there some sort of command line wrapper style that specifies inputs / outputs unambiguously that can be used to wrap every command? Then you don't need model substitution?
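One possible shape for such a wrapper, sketched in Python. The `--in` / `--out` flags, the `--` separator, and the names `Declared`, `parse_declared`, and `run_declared` are all hypothetical; nothing here is part of gg or Oil. The point is only that the declaration lives outside the wrapped command, so the wrapper never needs to understand the command's own flags.

```python
import os
import subprocess
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Declared:
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    argv: Tuple[str, ...]


def parse_declared(argv):
    """Parse: --in PATH ... --out PATH ... -- COMMAND [ARGS...]"""
    if "--" not in argv:
        raise ValueError("expected '--' before the wrapped command")
    split = argv.index("--")
    decls, command = argv[:split], argv[split + 1:]
    if not command:
        raise ValueError("no command after '--'")

    inputs, outputs = [], []
    it = iter(decls)
    for flag in it:
        path = next(it, None)
        if path is None:
            raise ValueError(f"{flag} needs a path")
        if flag == "--in":
            inputs.append(path)
        elif flag == "--out":
            outputs.append(path)
        else:
            raise ValueError(f"unknown declaration: {flag}")
    return Declared(tuple(inputs), tuple(outputs), tuple(command))


def run_declared(d):
    """Run the command locally, checking the declaration on both sides."""
    missing = [p for p in d.inputs if not os.path.exists(p)]
    if missing:
        raise FileNotFoundError(f"declared inputs missing: {missing}")
    code = subprocess.run(list(d.argv)).returncode
    if code == 0:
        absent = [p for p in d.outputs if not os.path.exists(p)]
        if absent:
            raise RuntimeError(f"declared outputs not produced: {absent}")
    return code
```

For example, `parse_declared(["--in", "a.c", "--out", "a.o", "--", "cc", "-c", "a.c"])` yields the input list, the output list, and the untouched `cc` command line. A remote executor could ship exactly the declared inputs and fetch exactly the declared outputs.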
Could Oil be a local executor for the gg runtime? What does the file system look like?
- you need a component to set up the file system, I guess a user space chroot / bind tool?
