An LFE client library for the Disco big-data platform.
This project assumes that you have rebar and lfetool installed somewhere in your $PATH.
To include ldisco in your project, simply update the deps section of your rebar.config:
{deps, [
  {ldisco, ".*", {git, "git://github.com/thorgisl/ldisco.git"}}
]}.
TBD (project still in-progress; worker protocol will likely be the first bit done)
The overall architectures of Disco Core and DFS are depicted in the following diagrams:
The logical flow of code through the Disco system can be broken down into the following major areas:
- End-user-created code (e.g., map and reduce functions)
- Creation of job files
- Delivery of job files to Disco server
- Job data delivery and unpacking to worker nodes
- Worker node execution of job
- Delivery of results to Disco
- Retrieval of results by end-user from Disco
With all that background understood, it's possible to create a client library in any language. Our choice, of course, is LFE :-)
In order to build an LFE Disco client, the following components were needed:
- Disco configuration mechanism for LFE
  - so the client knows where Disco is (port, host, etc.)
  - as well as other settings
- Worker implementation
  - Something that can read and write the Disco worker protocol
  - Something that can read from stdout and write to stdin
  - Something that can exercise all possible workflows in the communications between the Disco server and the LFE Disco worker
- Data structures representing tasks, inputs, outputs, replicas, etc.
- Higher-level abstractions for simplifying complicated workflows
- Disco client implementation
  - HTTP/JSON
  - queries to Disco server, etc.
  - results parsing
- Instructions for end-users on how to create Disco jobs in LFE.
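To make the "read and write the Disco worker protocol" component above more concrete, here is a minimal sketch of the message framing. It is written in Python purely for illustration and is not ldisco code. It assumes, from our reading of the Disco worker protocol documentation, that each message has the form `<name> <payload-len> <payload>` followed by a newline, with a JSON payload whose length is counted in bytes. The function names encode_message and decode_message are hypothetical; please check the framing against the Disco version you are running.

```python
import json
from typing import Any, Tuple


def encode_message(name: str, payload: Any) -> bytes:
    """Frame one message as b'<name> <payload-len> <payload>\\n'.

    Assumption: the payload is JSON and the length is its size in bytes.
    """
    body = json.dumps(payload).encode("utf-8")
    header = name.encode("ascii") + b" " + str(len(body)).encode("ascii")
    return header + b" " + body + b"\n"


def decode_message(raw: bytes) -> Tuple[str, Any]:
    """Parse one complete framed message back into (name, payload)."""
    # Split only twice: the payload itself may contain spaces.
    name, length, rest = raw.split(b" ", 2)
    size = int(length)
    body, trailer = rest[:size], rest[size:]
    if len(body) != size or trailer != b"\n":
        raise ValueError("payload length does not match header")
    return name.decode("ascii"), json.loads(body)


if __name__ == "__main__":
    wire = encode_message("WORKER", {"version": "1.1", "pid": 42})
    print(wire)
    print(decode_message(wire))
```

An LFE implementation would follow the same shape: one function that builds the framed binary and one that validates the declared length before handing the payload to a JSON parser.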
Disco can be used for the following:
- submitting an LFE jobpack to Disco via ./bin/disco submit <file>
- delivery of results to Disco happens as part of the worker writing to stdout
- in-band retrieval of results happens via the worker protocol, possibly wrapped in higher-level abstractions
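Since both result delivery and in-band retrieval ride on the worker's message streams, a rough sketch of one request/response exchange may help. It is illustrative Python, not ldisco code, and rests on several assumptions: the `<name> <payload-len> <payload>` newline-terminated framing with JSON payloads, an opening WORKER message carrying a version and pid, and an OK reply from Disco, all as we understand them from the Disco worker protocol documentation. The helpers send, recv and handshake are hypothetical names, and the streams are passed in as parameters so that the choice of actual file descriptors stays with the caller.

```python
import io
import json
from typing import Any, BinaryIO, Tuple


def send(out: BinaryIO, name: str, payload: Any) -> None:
    """Write one framed message to the outgoing stream and flush it."""
    body = json.dumps(payload).encode("utf-8")
    out.write(b"%s %d %s\n" % (name.encode("ascii"), len(body), body))
    out.flush()


def recv(inp: BinaryIO) -> Tuple[str, Any]:
    """Read exactly one framed message from the incoming stream."""
    header = b""
    spaces = 0
    # Read byte by byte until the name and length fields are complete.
    while spaces < 2:
        ch = inp.read(1)
        if not ch:
            raise EOFError("stream closed mid-header")
        header += ch
        if ch == b" ":
            spaces += 1
    name, length = header.split()
    size = int(length)
    # Assumes a buffered stream, where read(n) returns n bytes unless EOF.
    body = inp.read(size)
    if len(body) != size or inp.read(1) != b"\n":
        raise ValueError("truncated or malformed message")
    return name.decode("ascii"), json.loads(body)


def handshake(inp: BinaryIO, out: BinaryIO, pid: int) -> Tuple[str, Any]:
    """Announce the worker and return Disco's reply (assumed to be OK)."""
    send(out, "WORKER", {"version": "1.1", "pid": pid})
    return recv(inp)


if __name__ == "__main__":
    # In-memory buffers stand in for the real pipes to and from Disco.
    to_disco = io.BytesIO()
    from_disco = io.BytesIO(b'OK 4 "ok"\n')
    print(handshake(from_disco, to_disco, pid=42))
    print(to_disco.getvalue())
```

Keeping the streams as arguments makes the exchange easy to exercise in tests with in-memory buffers, which is one way to cover the many possible workflows between the Disco server and a worker.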
XXX
For a list of terms as used in this project as well as the Disco project and its documentation, see the glossary.