Implement separate storage buckets in flyte-core #4675

Closed
wants to merge 8 commits into from
12 changes: 6 additions & 6 deletions charts/flyte-core/README.md
@@ -90,11 +90,11 @@ helm install gateway bitnami/contour -n flyte
| configmap.clusters.clusterConfigs | list | `[]` | |
| configmap.clusters.labelClusterMap | object | `{}` | |
| configmap.console | object | `{"BASE_URL":"/console","CONFIG_DIR":"/etc/flyte/config"}` | Configuration for Flyte console UI |
| configmap.copilot | object | `{"plugins":{"k8s":{"co-pilot":{"image":"cr.flyte.org/flyteorg/flytecopilot:v1.10.7-b0","name":"flyte-copilot-","start-timeout":"30s"}}}}` | Copilot configuration |
| configmap.copilot.plugins.k8s.co-pilot | object | `{"image":"cr.flyte.org/flyteorg/flytecopilot:v1.10.7-b0","name":"flyte-copilot-","start-timeout":"30s"}` | Structure documented [here](https://pkg.go.dev/github.com/lyft/[email protected]/go/tasks/pluginmachinery/flytek8s/config#FlyteCoPilotConfig) |
| configmap.core | object | `{"manager":{"pod-application":"flytepropeller","pod-template-container-name":"flytepropeller","pod-template-name":"flytepropeller-template"},"propeller":{"downstream-eval-duration":"30s","enable-admin-launcher":true,"leader-election":{"enabled":true,"lease-duration":"15s","lock-config-map":{"name":"propeller-leader","namespace":"flyte"},"renew-deadline":"10s","retry-period":"2s"},"limit-namespace":"all","max-workflow-retries":30,"metadata-prefix":"metadata/propeller","metrics-prefix":"flyte","prof-port":10254,"queue":{"batch-size":-1,"batching-interval":"2s","queue":{"base-delay":"5s","capacity":1000,"max-delay":"120s","rate":100,"type":"maxof"},"sub-queue":{"capacity":100,"rate":10,"type":"bucket"},"type":"batch"},"rawoutput-prefix":"s3://my-s3-bucket/","workers":4,"workflow-reeval-duration":"30s"},"webhook":{"certDir":"/etc/webhook/certs","serviceName":"flyte-pod-webhook"}}` | Core propeller configuration |
| configmap.copilot | object | `{"plugins":{"k8s":{"co-pilot":{"image":"cr.flyte.org/flyteorg/flytecopilot:v1.10.6","name":"flyte-copilot-","start-timeout":"30s"}}}}` | Copilot configuration |
| configmap.copilot.plugins.k8s.co-pilot | object | `{"image":"cr.flyte.org/flyteorg/flytecopilot:v1.10.6","name":"flyte-copilot-","start-timeout":"30s"}` | Structure documented [here](https://pkg.go.dev/github.com/lyft/[email protected]/go/tasks/pluginmachinery/flytek8s/config#FlyteCoPilotConfig) |
| configmap.core | object | `{"manager":{"pod-application":"flytepropeller","pod-template-container-name":"flytepropeller","pod-template-name":"flytepropeller-template"},"propeller":{"downstream-eval-duration":"30s","enable-admin-launcher":true,"leader-election":{"enabled":true,"lease-duration":"15s","lock-config-map":{"name":"propeller-leader","namespace":"flyte"},"renew-deadline":"10s","retry-period":"2s"},"limit-namespace":"all","max-workflow-retries":30,"metadata-prefix":"metadata/propeller","metrics-prefix":"flyte","prof-port":10254,"queue":{"batch-size":-1,"batching-interval":"2s","queue":{"base-delay":"5s","capacity":1000,"max-delay":"120s","rate":100,"type":"maxof"},"sub-queue":{"capacity":100,"rate":10,"type":"bucket"},"type":"batch"},"workers":4,"workflow-reeval-duration":"30s"},"webhook":{"certDir":"/etc/webhook/certs","serviceName":"flyte-pod-webhook"}}` | Core propeller configuration |
| configmap.core.manager | object | `{"pod-application":"flytepropeller","pod-template-container-name":"flytepropeller","pod-template-name":"flytepropeller-template"}` | follows the structure specified [here](https://pkg.go.dev/github.com/flyteorg/flytepropeller/manager/config#Config). |
| configmap.core.propeller | object | `{"downstream-eval-duration":"30s","enable-admin-launcher":true,"leader-election":{"enabled":true,"lease-duration":"15s","lock-config-map":{"name":"propeller-leader","namespace":"flyte"},"renew-deadline":"10s","retry-period":"2s"},"limit-namespace":"all","max-workflow-retries":30,"metadata-prefix":"metadata/propeller","metrics-prefix":"flyte","prof-port":10254,"queue":{"batch-size":-1,"batching-interval":"2s","queue":{"base-delay":"5s","capacity":1000,"max-delay":"120s","rate":100,"type":"maxof"},"sub-queue":{"capacity":100,"rate":10,"type":"bucket"},"type":"batch"},"rawoutput-prefix":"s3://my-s3-bucket/","workers":4,"workflow-reeval-duration":"30s"}` | follows the structure specified [here](https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/config). |
| configmap.core.propeller | object | `{"downstream-eval-duration":"30s","enable-admin-launcher":true,"leader-election":{"enabled":true,"lease-duration":"15s","lock-config-map":{"name":"propeller-leader","namespace":"flyte"},"renew-deadline":"10s","retry-period":"2s"},"limit-namespace":"all","max-workflow-retries":30,"metadata-prefix":"metadata/propeller","metrics-prefix":"flyte","prof-port":10254,"queue":{"batch-size":-1,"batching-interval":"2s","queue":{"base-delay":"5s","capacity":1000,"max-delay":"120s","rate":100,"type":"maxof"},"sub-queue":{"capacity":100,"rate":10,"type":"bucket"},"type":"batch"},"workers":4,"workflow-reeval-duration":"30s"}` | follows the structure specified [here](https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/config). |
| configmap.datacatalogServer | object | `{"application":{"grpcPort":8089,"grpcServerReflection":true,"httpPort":8080},"datacatalog":{"heartbeat-grace-period-multiplier":3,"max-reservation-heartbeat":"30s","metrics-scope":"datacatalog","profiler-port":10254,"storage-prefix":"metadata/datacatalog"}}` | Datacatalog server config |
| configmap.domain | object | `{"domains":[{"id":"development","name":"development"},{"id":"staging","name":"staging"},{"id":"production","name":"production"}]}` | Domains configuration for Flyte projects. This enables the specified number of domains across all projects in Flyte. |
| configmap.enabled_plugins.tasks | object | `{"task-plugins":{"default-for-task-types":{"container":"container","container_array":"k8s-array","sidecar":"sidecar"},"enabled-plugins":["container","sidecar","k8s-array"]}}` | Tasks specific configuration [structure](https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#GetConfig) |
@@ -255,8 +255,8 @@ helm install gateway bitnami/contour -n flyte
| sparkoperator.enabled | bool | `false` | - enable or disable Sparkoperator deployment installation |
| sparkoperator.plugin_config | object | `{"plugins":{"spark":{"spark-config-default":[{"spark.hadoop.fs.s3a.aws.credentials.provider":"com.amazonaws.auth.DefaultAWSCredentialsProviderChain"},{"spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version":"2"},{"spark.kubernetes.allocation.batch.size":"50"},{"spark.hadoop.fs.s3a.acl.default":"BucketOwnerFullControl"},{"spark.hadoop.fs.s3n.impl":"org.apache.hadoop.fs.s3a.S3AFileSystem"},{"spark.hadoop.fs.AbstractFileSystem.s3n.impl":"org.apache.hadoop.fs.s3a.S3A"},{"spark.hadoop.fs.s3.impl":"org.apache.hadoop.fs.s3a.S3AFileSystem"},{"spark.hadoop.fs.AbstractFileSystem.s3.impl":"org.apache.hadoop.fs.s3a.S3A"},{"spark.hadoop.fs.s3a.impl":"org.apache.hadoop.fs.s3a.S3AFileSystem"},{"spark.hadoop.fs.AbstractFileSystem.s3a.impl":"org.apache.hadoop.fs.s3a.S3A"},{"spark.hadoop.fs.s3a.multipart.threshold":"536870912"},{"spark.blacklist.enabled":"true"},{"spark.blacklist.timeout":"5m"},{"spark.task.maxfailures":"8"}]}}}` | Spark plugin configuration |
| sparkoperator.plugin_config.plugins.spark.spark-config-default | list | `[{"spark.hadoop.fs.s3a.aws.credentials.provider":"com.amazonaws.auth.DefaultAWSCredentialsProviderChain"},{"spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version":"2"},{"spark.kubernetes.allocation.batch.size":"50"},{"spark.hadoop.fs.s3a.acl.default":"BucketOwnerFullControl"},{"spark.hadoop.fs.s3n.impl":"org.apache.hadoop.fs.s3a.S3AFileSystem"},{"spark.hadoop.fs.AbstractFileSystem.s3n.impl":"org.apache.hadoop.fs.s3a.S3A"},{"spark.hadoop.fs.s3.impl":"org.apache.hadoop.fs.s3a.S3AFileSystem"},{"spark.hadoop.fs.AbstractFileSystem.s3.impl":"org.apache.hadoop.fs.s3a.S3A"},{"spark.hadoop.fs.s3a.impl":"org.apache.hadoop.fs.s3a.S3AFileSystem"},{"spark.hadoop.fs.AbstractFileSystem.s3a.impl":"org.apache.hadoop.fs.s3a.S3A"},{"spark.hadoop.fs.s3a.multipart.threshold":"536870912"},{"spark.blacklist.enabled":"true"},{"spark.blacklist.timeout":"5m"},{"spark.task.maxfailures":"8"}]` | Spark default configuration |
| storage | object | `{"bucketName":"my-s3-bucket","custom":{},"enableMultiContainer":false,"gcs":null,"limits":{"maxDownloadMBs":10},"s3":{"accessKey":"","authType":"iam","region":"us-east-1","secretKey":""},"type":"sandbox"}` | ---------------------------------------------------- STORAGE SETTINGS |
| storage.bucketName | string | `"my-s3-bucket"` | bucketName defines the storage bucket flyte will use. Required for all types except for sandbox. |
| storage | object | `{"bucketName":"my-s3-bucket","custom":{},"enableMultiContainer":false,"gcs":null,"limits":{"maxDownloadMBs":10},"s3":{"accessKey":"","authType":"iam","region":"us-east-1","secretKey":""},"type":"sandbox","userBucketName":"my-s3-bucket"}` | ---------------------------------------------------- STORAGE SETTINGS |
| storage.bucketName | string | `"my-s3-bucket"` | bucketName defines the storage bucket Flyte will use. Required for all types except for sandbox. |
| storage.custom | object | `{}` | Settings for storage type custom. See https://github.com/graymeta/stow for supported storage providers/settings. |
| storage.enableMultiContainer | bool | `false` | toggles multi-container storage config |
| storage.gcs | string | `nil` | settings for storage type gcs |
23 changes: 21 additions & 2 deletions charts/flyte-core/templates/_helpers.tpl
@@ -12,7 +12,6 @@
{{- default .Release.Namespace .Values.forceNamespace | trunc 63 | trimSuffix "-" -}}
{{- end -}}


{{- define "flyteadmin.name" -}}
flyteadmin
{{- end -}}
@@ -119,6 +118,11 @@ helm.sh/chart: {{ include "flyte.chart" . }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}

{{- define "flytepropeller-userstorage" -}}
propeller:
rawoutput-prefix: {{ include "flyte-core.storage.userDataPrefix" . }}
{{- end -}}

{{- define "flyte-pod-webhook.name" -}}
flyte-pod-webhook
{{- end -}}
@@ -156,7 +160,22 @@ app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{- end }}

{{/*
Get the Flyte user data prefix.
*/}}
{{- define "flyte-core.storage.userDataPrefix" -}}
{{- $userBucketName := required "User data container required" .Values.storage.userBucketName -}}
{{- if eq "s3" .Values.storage.type -}}
{{- printf "s3://%s/data" $userBucketName -}}
{{- else if eq "gcs" .Values.storage.type -}}
{{- printf "gs://%s/data" $userBucketName -}}
{{- else if eq "azure" .Values.storage.type -}}
{{- printf "abfs://%s/data" $userBucketName -}}
{{- end -}}
{{- end -}}

{{- define "storage.base" -}}
{{ include "flytepropeller-userstorage" .}}
storage:
{{- if eq .Values.storage.type "s3" }}
type: s3
@@ -204,4 +223,4 @@ storage:
enable-multicontainer: {{ .Values.storage.enableMultiContainer }}
limits:
maxDownloadMBs: {{ .Values.storage.limits.maxDownloadMBs }}
{{- end }}
{{- end }}
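A minimal sketch of how the new helper is expected to resolve, assuming an S3 deployment. The bucket names below are hypothetical placeholders, not values taken from this PR. With overrides such as:

```yaml
# Hypothetical values override; bucket names are illustrative only.
storage:
  type: s3
  bucketName: flyte-metadata-bucket
  userBucketName: flyte-raw-data-bucket
```

the `flyte-core.storage.userDataPrefix` helper would yield `s3://flyte-raw-data-bucket/data`, so the fragment that `storage.base` now prepends should render roughly as:

```yaml
propeller:
  rawoutput-prefix: s3://flyte-raw-data-bucket/data
```

The helper as written only has branches for `s3`, `gcs`, and `azure`. For other storage types such as `sandbox` or `custom` it appears to produce an empty prefix, which may be worth confirming before relying on the chart defaults.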
3 changes: 3 additions & 0 deletions charts/flyte-core/values-eks.yaml
@@ -5,6 +5,7 @@ userSettings:
dbPassword: <DB_PASSWORD>
rdsHost: <RDS_HOST>
bucketName: <BUCKET_NAME>
userBucketName: <RAW_DATA_BUCKET_NAME>
logGroup: <LOG_GROUP_NAME>
redisHostUrl: <REDIS_HOST_URL>
redisHostKey: <REDIS_HOST_KEY>
@@ -186,6 +187,8 @@ storage:
type: s3
# -- bucketName defines the storage bucket flyte will use. Required for all types except for sandbox.
bucketName: "{{ .Values.userSettings.bucketName }}"
# -- userBucketName can be the same as bucketName. It defines the bucket that Flyte will use to store Raw data generated by Tasks (Inputs/Outputs)
userBucketName: "{{ .Values.userSettings.userBucketName }}"
s3:
region: "{{ .Values.userSettings.accountRegion }}"

5 changes: 4 additions & 1 deletion charts/flyte-core/values-gcp.yaml
@@ -6,6 +6,7 @@ userSettings:
dbHost: <CLOUD-SQL-IP>
dbPassword: <DBPASSWORD>
bucketName: <BUCKETNAME>
userBucketName: <RAWBUCKETNAME>
hostName: <HOSTNAME>

#
@@ -196,8 +197,10 @@ common:
storage:
# -- Sets the storage type. Supported values are sandbox, s3, gcs and custom.
type: gcs
# -- bucketName defines the storage bucket flyte will use. Required for all types except for sandbox.
# -- bucketName defines the storage bucket Flyte will use to store task metadata. Required for all types except for sandbox.
bucketName: "{{ .Values.userSettings.bucketName }}"
# -- userBucketName can be the same as bucketName. It defines the bucket that Flyte will use to store Raw data generated by Tasks (Inputs/Outputs)
userBucketName: "{{ .Values.userSettings.userBucketName }}"
# -- settings for storage type s3
gcs:
# -- GCP project ID. Required for storage type gcs.
6 changes: 4 additions & 2 deletions charts/flyte-core/values.yaml
@@ -442,8 +442,11 @@ common:
storage:
# -- Sets the storage type. Supported values are sandbox, s3, gcs and custom.
type: sandbox
# -- bucketName defines the storage bucket flyte will use. Required for all types except for sandbox.
# -- bucketName defines the storage bucket Flyte will use. Required for all types except for sandbox.
bucketName: my-s3-bucket
# -- userBucketName is required and defines a separate bucket that Flyte uses to store all raw data produced by tasks.
# Use the same value as bucketName to store metadata and raw data in the same bucket.
userBucketName: my-s3-bucket
# -- settings for storage type s3
s3:
region: us-east-1
@@ -666,7 +669,6 @@ configmap:
pod-template-name: "flytepropeller-template"
# -- follows the structure specified [here](https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/config).
propeller:
rawoutput-prefix: s3://my-s3-bucket/
metadata-prefix: metadata/propeller
workers: 4
max-workflow-retries: 30
14 changes: 10 additions & 4 deletions deployment/eks/flyte_aws_scheduler_helm_generated.yaml
@@ -179,6 +179,8 @@ data:
signedUrls:
durationMinutes: 3
storage.yaml: |
propeller:
rawoutput-prefix: s3://<RAW_DATA_BUCKET_NAME>/data
storage:
type: s3
container: "<BUCKET_NAME>"
@@ -388,6 +390,8 @@ data:
profiler-port: 10254
storage-prefix: metadata/datacatalog
storage.yaml: |
propeller:
rawoutput-prefix: s3://<RAW_DATA_BUCKET_NAME>/data
storage:
type: s3
container: "<BUCKET_NAME>"
@@ -500,6 +504,8 @@ data:
resourcemanager:
type: noop
storage.yaml: |
propeller:
rawoutput-prefix: s3://<RAW_DATA_BUCKET_NAME>/data
storage:
type: s3
container: "<BUCKET_NAME>"
@@ -847,7 +853,7 @@ spec:
template:
metadata:
annotations:
configChecksum: "2b5c85969f2bd85bb51a084f9fd72c20c3aca94be99e53cb4c4e9f78e77ebc5"
configChecksum: "4d5a990a82cbe032ea8d812a11455c556f0ffd5758f76de1754017c3d512e29"
labels:
app.kubernetes.io/name: flyteadmin
app.kubernetes.io/instance: flyte
@@ -1135,7 +1141,7 @@ spec:
template:
metadata:
annotations:
configChecksum: "59ef5b555bd41c3e854a315f21031c76dfa876455ff8069b989cb6c28ec1f17"
configChecksum: "8a96b2f36f440c1c0ecda8d41fe1ebf1acc9acb645e902553d20063cef472b6"
labels:
app.kubernetes.io/name: datacatalog
app.kubernetes.io/instance: flyte
@@ -1226,7 +1232,7 @@ spec:
template:
metadata:
annotations:
configChecksum: "df1663080e3f9c7f97035ff969c5c5ea649f23c071e2e473c7c1513d0d5d9b4"
configChecksum: "8c42ded7052058802087b7025eea4d26ef8cbb0fa591d929c703335c4688e02"
labels:
app.kubernetes.io/name: flytepropeller
app.kubernetes.io/instance: flyte
@@ -1308,7 +1314,7 @@ spec:
app.kubernetes.io/name: flyte-pod-webhook
app.kubernetes.io/version: v1.10.7-b0
annotations:
configChecksum: "df1663080e3f9c7f97035ff969c5c5ea649f23c071e2e473c7c1513d0d5d9b4"
configChecksum: "8c42ded7052058802087b7025eea4d26ef8cbb0fa591d929c703335c4688e02"
spec:
securityContext:
fsGroup: 65534
10 changes: 7 additions & 3 deletions deployment/eks/flyte_helm_controlplane_generated.yaml
@@ -160,6 +160,8 @@ data:
signedUrls:
durationMinutes: 3
storage.yaml: |
propeller:
rawoutput-prefix: s3://<RAW_DATA_BUCKET_NAME>/data
storage:
type: s3
container: "<BUCKET_NAME>"
@@ -354,6 +356,8 @@ data:
profiler-port: 10254
storage-prefix: metadata/datacatalog
storage.yaml: |
propeller:
rawoutput-prefix: s3://<RAW_DATA_BUCKET_NAME>/data
storage:
type: s3
container: "<BUCKET_NAME>"
@@ -553,7 +557,7 @@ spec:
template:
metadata:
annotations:
configChecksum: "053b20ebc40227f6ed8ddc61f5997ee7997c604158f773779f20ec61af11a2f"
configChecksum: "a66378327daf755a8cea8f042c9985b075f4ff1b7d41035f441301083ae3355"
labels:
app.kubernetes.io/name: flyteadmin
app.kubernetes.io/instance: flyte
@@ -841,7 +845,7 @@ spec:
template:
metadata:
annotations:
configChecksum: "59ef5b555bd41c3e854a315f21031c76dfa876455ff8069b989cb6c28ec1f17"
configChecksum: "8a96b2f36f440c1c0ecda8d41fe1ebf1acc9acb645e902553d20063cef472b6"
labels:
app.kubernetes.io/name: datacatalog
app.kubernetes.io/instance: flyte
@@ -932,7 +936,7 @@ spec:
template:
metadata:
annotations:
configChecksum: "053b20ebc40227f6ed8ddc61f5997ee7997c604158f773779f20ec61af11a2f"
configChecksum: "a66378327daf755a8cea8f042c9985b075f4ff1b7d41035f441301083ae3355"
labels:
app.kubernetes.io/name: flytescheduler
app.kubernetes.io/instance: flyte
6 changes: 4 additions & 2 deletions deployment/eks/flyte_helm_dataplane_generated.yaml
@@ -165,6 +165,8 @@ data:
resourcemanager:
type: noop
storage.yaml: |
propeller:
rawoutput-prefix: s3://<RAW_DATA_BUCKET_NAME>/data
storage:
type: s3
container: "<BUCKET_NAME>"
@@ -427,7 +429,7 @@ spec:
template:
metadata:
annotations:
configChecksum: "df1663080e3f9c7f97035ff969c5c5ea649f23c071e2e473c7c1513d0d5d9b4"
configChecksum: "8c42ded7052058802087b7025eea4d26ef8cbb0fa591d929c703335c4688e02"
labels:
app.kubernetes.io/name: flytepropeller
app.kubernetes.io/instance: flyte
@@ -509,7 +511,7 @@ spec:
app.kubernetes.io/name: flyte-pod-webhook
app.kubernetes.io/version: v1.10.7-b0
annotations:
configChecksum: "df1663080e3f9c7f97035ff969c5c5ea649f23c071e2e473c7c1513d0d5d9b4"
configChecksum: "8c42ded7052058802087b7025eea4d26ef8cbb0fa591d929c703335c4688e02"
spec:
securityContext:
fsGroup: 65534