[![Latest](https://img.shields.io/github/v/tag/Eventual-Inc/daft-launcher?label=latest&logo=GitHub)](https://github.com/Eventual-Inc/daft-launcher/tags)
[![License](https://img.shields.io/badge/daft_launcher-docs-red.svg)](https://eventual-inc.github.io/daft-launcher)

- # Daft Launcher
+ # Daft Launcher CLI Tool

`daft-launcher` is a simple launcher for spinning up and managing Ray clusters for [`daft`](https://github.com/Eventual-Inc/Daft).
- It abstracts away all the complexities of dealing with Ray yourself, allowing you to focus on running `daft` in a distributed manner.

- ## Capabilities
+ ## Goal
+
+ Getting started with Daft in a local environment is easy.
+ However, getting started with Daft in a cloud environment is substantially more difficult.
+ So much more difficult, in fact, that users end up spending more time setting up their environment than actually playing with our query engine.

- 1. Spinning up clusters.
- 2. Listing all available clusters (as well as their statuses).
- 3. Submitting jobs to a cluster.
- 4. Connecting to the cluster (to view the Ray dashboard and submit jobs using the Ray protocol).
- 5. Spinning down clusters.
- 6. Creating configuration files.
- 7. Running raw SQL statements using Daft's SQL API.
+ Daft Launcher aims to solve this problem by providing a simple CLI tool to remove all of this unnecessary heavy-lifting.

- ## Currently supported cloud providers
+ ## Capabilities

- - [x] AWS
- - [ ] GCP
- - [ ] Azure
+ What Daft Launcher is capable of:
+ 1. Spinning up clusters (Provisioned mode only)
+ 2. Listing all available clusters as well as their statuses (Provisioned mode only)
+ 3. Submitting jobs to a cluster (Both Provisioned and BYOC modes)
+ 4. Connecting to the cluster (Provisioned mode only)
+ 5. Spinning down clusters (Provisioned mode only)
+ 6. Creating configuration files (Both modes)
+ 7. Running raw SQL statements (Provisioned mode only)
+
+ ## Operation Modes
+
+ Daft Launcher supports two modes of operation:
+ - **Provisioned**: Automatically provisions and manages Ray clusters in AWS
+ - **BYOC (Bring Your Own Cluster)**: Connects to existing Ray clusters in Kubernetes
+
+ ### Command Groups and Support Matrix
+
+ | Command Group | Command | Provisioned | BYOC |
+ |---------------|---------|-------------|------|
+ | cluster       | up      | ✅          | ❌   |
+ |               | down    | ✅          | ❌   |
+ |               | kill    | ✅          | ❌   |
+ |               | list    | ✅          | ❌   |
+ |               | connect | ✅          | ❌   |
+ |               | ssh     | ✅          | ❌   |
+ | job           | submit  | ✅          | ✅   |
+ |               | sql     | ✅          | ❌   |
+ |               | status  | ✅          | ❌   |
+ |               | logs    | ✅          | ❌   |
+ | config        | init    | ✅          | ✅   |
+ |               | check   | ✅          | ❌   |
+ |               | export  | ✅          | ❌   |

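The support matrix above can also be written as a small lookup table, which may be handy when scripting around the CLI. This is an illustrative sketch only: `SUPPORT` and `is_supported` are hypothetical names that are not part of Daft Launcher, and the values are simply transcribed from the table.

```python
# Hypothetical helper, not part of Daft Launcher.
# Each command maps to the set of modes the table marks as supported.
PROVISIONED_ONLY = frozenset({"provisioned"})
BOTH_MODES = frozenset({"provisioned", "byoc"})

SUPPORT = {
    "cluster": {
        "up": PROVISIONED_ONLY,
        "down": PROVISIONED_ONLY,
        "kill": PROVISIONED_ONLY,
        "list": PROVISIONED_ONLY,
        "connect": PROVISIONED_ONLY,
        "ssh": PROVISIONED_ONLY,
    },
    "job": {
        "submit": BOTH_MODES,
        "sql": PROVISIONED_ONLY,
        "status": PROVISIONED_ONLY,
        "logs": PROVISIONED_ONLY,
    },
    "config": {
        "init": BOTH_MODES,
        "check": PROVISIONED_ONLY,
        "export": PROVISIONED_ONLY,
    },
}


def is_supported(group: str, command: str, mode: str) -> bool:
    """Return True if the table marks `group command` as supported in `mode`."""
    return mode in SUPPORT.get(group, {}).get(command, frozenset())


if __name__ == "__main__":
    for group, commands in SUPPORT.items():
        for command in commands:
            print(group, command, "byoc:", is_supported(group, command, "byoc"))
```

Unknown groups, commands, or modes fall through to `False` rather than raising, so the helper can be used as a simple guard before invoking a command.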
## Usage

- You'll need a python package manager installed.
- We highly recommend using [`uv`](https://astral.sh/blog/uv) for all things python!
+ ### Pre-requisites

- ### AWS
+ You'll need a Python package manager installed.
+ We recommend using [`uv`](https://astral.sh/blog/uv) for all things Python.

- If you're using AWS, you'll need:
+ #### For Provisioned Mode (AWS)
1. A valid AWS account with the necessary IAM role to spin up EC2 instances.
- This IAM role can either be created by you (assuming you have the appropriate permissions).
- Or this IAM role will need to be created by your administrator.
- 2. The [AWS CLI](https://aws.amazon.com/cli) installed and configured on your machine.
- 3. To login using the AWS CLI.
- For full instructions, please look [here](https://google.com).
-
- ## Installation
-
- Using `uv` (recommended):
+ This IAM role can either be created by you (assuming you have the appropriate permissions)
+ or will need to be created by your administrator.
+ 2. The [AWS CLI](https://aws.amazon.com/cli/) installed and configured on your machine.
+ 3. Log in using the AWS CLI.
+
+ #### For BYOC Mode (Kubernetes)
+ 1. A Kubernetes cluster with Ray already deployed
+    - Can be local (minikube/kind), cloud-managed (EKS/GKE/AKS), or on-premise.
+    - See our [BYOC setup guides](./docs/byoc/README.md) for detailed instructions
+ 2. Ray cluster running in your Kubernetes cluster
+    - Must be installed and configured using Helm
+    - See provider-specific guides for installation steps
+ 3. Daft installed on the Ray cluster
+ 4. `kubectl` installed and configured with the correct context
+ 5. Appropriate permissions to access the namespace where Ray is deployed
+
+ ### SSH Key Setup for Provisioned Mode
+
+ To enable SSH access and port forwarding for provisioned clusters, you need to:
+
+ 1. Create an SSH key pair (if you don't already have one):
+    ```bash
+    # Generate a new key pair
+    ssh-keygen -t rsa -b 2048 -f ~/.ssh/daft-key
+
+    # This will create:
+    #   ~/.ssh/daft-key     (private key)
+    #   ~/.ssh/daft-key.pub (public key)
+    ```
+
+ 2. Import the public key to AWS:
+    ```bash
+    # Import the public key to AWS
+    aws ec2 import-key-pair \
+      --key-name "daft-key" \
+      --public-key-material fileb://~/.ssh/daft-key.pub
+    ```
+
+ 3. Set proper permissions on your private key:
+    ```bash
+    chmod 600 ~/.ssh/daft-key
+    ```
+
+ 4. Update your daft configuration to use this key:
+    ```toml
+    [setup.provisioned]
+    # ... other config ...
+    ssh-private-key = "~/.ssh/daft-key"  # Path to your private key
+    ssh-user = "ubuntu"                  # User depends on the AMI (ubuntu for Ubuntu AMIs)
+    ```
+
+ Notes:
+ - The key name in AWS must match the name of your key file (without the extension)
+ - The private key must be readable only by you (hence the chmod 600)
+ - Different AMIs use different default users:
+   - Ubuntu AMIs: use "ubuntu"
+   - Amazon Linux AMIs: use "ec2-user"
+ - Make sure this matches your `ssh-user` configuration
+
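The first two notes above can be checked with a few lines of Python before launching a cluster. This is a minimal sketch, assuming a POSIX system; `check_private_key` is a hypothetical helper and not something Daft Launcher provides.

```python
import stat
from pathlib import Path


def check_private_key(path: str) -> str:
    """Check that the key file is private to its owner.

    Returns the key name AWS is expected to know it by, which per the notes
    above is the file name without its extension.
    """
    key = Path(path).expanduser()
    mode = stat.S_IMODE(key.stat().st_mode)
    # Any permission bit for group or others means the key is too open.
    if mode & 0o077:
        raise PermissionError(f"{key} has mode {mode:o}; run: chmod 600 {key}")
    return key.stem
```

Calling `check_private_key("~/.ssh/daft-key")` raises `FileNotFoundError` if the key does not exist and `PermissionError` if group or other users can access it.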
+ ### Installation
+
+ Using `uv`:

```bash
# create project
@@ -64,32 +143,92 @@ source .venv/bin/activate
uv pip install daft-launcher
```

- ## Example
+ ### Example Usage
+
+ Daft Launcher is driven primarily by a configuration file.
+ By default, Daft Launcher looks in your current working directory (`$CWD`) for a file named `.daft.toml`.
+ You can override this behaviour by specifying a custom configuration file.
+
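The lookup rule described above amounts to: use an explicitly given path if there is one, otherwise fall back to `.daft.toml` in the current working directory. The sketch below only illustrates that rule; `resolve_config_path` is a hypothetical name and this is not the launcher's actual implementation.

```python
from pathlib import Path
from typing import Optional

DEFAULT_CONFIG_NAME = ".daft.toml"


def resolve_config_path(override: Optional[str] = None) -> Path:
    """Return the override path if given, else `.daft.toml` in the CWD."""
    if override is not None:
        return Path(override).expanduser()
    return Path.cwd() / DEFAULT_CONFIG_NAME


if __name__ == "__main__":
    print(resolve_config_path())
    print(resolve_config_path("my-config.toml"))
```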
+ #### Provisioned Mode (AWS)

- ```sh
- # create a new configuration file
- daft init
+ ```bash
+ # Initialize a new provisioned mode configuration
+ daft config init --provider provisioned
+ # or use the default provider (provisioned)
+ daft config init
+
+ # Cluster management
+ daft provisioned up
+ daft provisioned list
+ daft provisioned connect
+ daft provisioned ssh
+ daft provisioned down
+ daft provisioned kill
+
+ # Job management (only `submit` works in both modes; `status` and `logs` are Provisioned only)
+ daft job submit example-job
+ daft job status example-job
+ daft job logs example-job
+
+ # Configuration management
+ daft config check
+ daft config export
```
- That should create a configuration file for you.
- Feel free to modify some of the configuration values.
- If you have any confusions on a value, you can always run `daft check` to check the syntax and schema of your configuration file.

- Once you're content with your configuration file, go back to your terminal and run the following:
+ #### BYOC Mode (Kubernetes)

- ```sh
- # spin your cluster up
- daft up
+ ```bash
+ # Initialize a new BYOC mode configuration
+ daft config init --provider byoc
+ ```

- # list all the active clusters
- daft list
+ ### Configuration Files

- # submit a directory and command to run on the cluster
- # (where `my-job-name` should be an entry in your .daft.toml file)
- daft submit my-job-name
+ You can specify a custom configuration file path with the `-c` flag:
+ ```bash
+ daft -c my-config.toml job submit example-job
+ ```

- # run a direct SQL query on daft
- daft sql "SELECT * FROM my_table WHERE column = 'value'"
+ Example Provisioned mode configuration:
+ ```toml
+ [setup]
+ name = "my-daft-cluster"
+ version = "0.1.0"
+ provider = "provisioned"
+ dependencies = [] # Optional additional Python packages to install
+
+ [setup.provisioned]
+ region = "us-west-2"
+ number-of-workers = 4
+ ssh-user = "ubuntu"
+ ssh-private-key = "~/.ssh/daft-key"
+ instance-type = "i3.2xlarge"
+ image-id = "ami-04dd23e62ed049936"
+ iam-instance-profile-name = "YourInstanceProfileName" # Optional
+
+ [run]
+ pre-setup-commands = []
+ post-setup-commands = []
+
+ [[job]]
+ name = "example-job"
+ command = "python my_script.py"
+ working-dir = "~/my_project"
+ ```

- # finally, once you're done, spin the cluster down
- daft down
+ Example BYOC mode configuration:
+ ```toml
+ [setup]
+ name = "my-daft-cluster"
+ version = "0.1.0"
+ provider = "byoc"
+ dependencies = [] # Optional additional Python packages to install
+
+ [setup.byoc]
+ namespace = "default" # Optional, defaults to "default"
+
+ [[job]]
+ name = "example-job"
+ command = "python my_script.py"
+ working-dir = "~/my_project"
```