daft-launcher is a simple launcher for spinning up and managing Ray clusters for daft.
It abstracts away all the complexities of dealing with Ray yourself, allowing you to focus on running daft in a distributed manner.
- Spinning up clusters.
- Listing all available clusters (as well as their statuses).
- Submitting jobs to a cluster.
- Connecting to the cluster (to view the Ray dashboard and submit jobs using the Ray protocol).
- Spinning down clusters.
- Creating configuration files.
- Running raw SQL statements using Daft's SQL API.
- AWS
- GCP
- Azure
You'll need a Python package manager installed.
We highly recommend using uv for all things Python!
If you're using AWS, you'll need:
- A valid AWS account with the necessary IAM role to spin up EC2 instances. You can create this IAM role yourself (assuming you have the appropriate permissions); otherwise, your administrator will need to create it for you.
- The AWS CLI installed and configured on your machine.
- To log in using the AWS CLI. For full instructions, please look here.
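
The prerequisites above can be checked from a terminal before installing anything. The following is a minimal sketch, not part of daft-launcher: `have` and `check_prereqs` are hypothetical helper names, and using `aws sts get-caller-identity` as the login test is an assumption about how you authenticate with the AWS CLI.

```shell
# Hypothetical preflight helpers; daft-launcher does not ship these.

# Succeed if the named command is on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# Print each missing prerequisite; return non-zero if any check fails.
check_prereqs() {
  missing=0
  have uv || { echo "uv not found (any Python package manager works; uv is recommended)"; missing=1; }
  have aws || { echo "AWS CLI not found"; missing=1; }
  # Assumption: this call succeeds only when valid credentials are configured.
  if have aws && ! aws sts get-caller-identity >/dev/null 2>&1; then
    echo "AWS CLI is installed but not logged in"
    missing=1
  fi
  return "$missing"
}
```

Calling `check_prereqs` prints one line per missing item and returns a non-zero status, so it can gate the installation steps in a script.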
Using uv (recommended):
# create project
mkdir my-project
cd my-project
# initialize project and setup virtual env
uv init
uv venv
source .venv/bin/activate
# install launcher
uv pip install daft-launcher
# create a new configuration file
daft init
That should create a configuration file for you.
Feel free to modify some of the configuration values.
If you're unsure about a value, you can always run daft check to validate the syntax and schema of your configuration file.
Once you're content with your configuration file, go back to your terminal and run the following:
# spin your cluster up
daft up
# list all the active clusters
daft list
# submit a directory and command to run on the cluster
# (where `my-job-name` should be an entry in your .daft.toml file)
daft submit my-job-name
# run a direct SQL query on daft
daft sql "SELECT * FROM my_table WHERE column = 'value'"
# finally, once you're done, spin the cluster down
daft down
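
If you run this sequence from a script, a failed job can leave the cluster running. The sketch below wraps the same commands so that `daft down` always runs after the cluster is up. `run_job` and the `DAFT` override variable are hypothetical names introduced for illustration, and the sketch assumes `daft submit` exits with a non-zero status when the job fails.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical override so the script can be dry-run against a stub.
DAFT="${DAFT:-daft}"

# Spin up a cluster, submit one job, then tear the cluster down,
# returning the job's exit status.
run_job() {
  local job_name="$1"   # must be an entry in your .daft.toml file
  local status=0
  "$DAFT" up
  "$DAFT" list
  # Capture the failure instead of exiting, so teardown still happens.
  "$DAFT" submit "$job_name" || status=$?
  "$DAFT" down
  return "$status"
}
```

For example, `run_job my-job-name` runs the job defined under `my-job-name` and reports its exit status once the cluster has been spun down.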