Remote storage for the wrapper's output is also S3.
Prerequisites
- Install the Databricks CLI
- Configure the Databricks CLI (token, workspace, profile)
- Install and configure the AWS CLI
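The prerequisite setup above can be sketched as follows. This is a sketch, not the only supported path: it assumes the pip-installable legacy Databricks CLI, and `my-db-profile` is a placeholder profile name.

```shell
# Install the legacy Databricks CLI and the AWS CLI
# (assumes a working Python/pip environment).
pip install databricks-cli awscli

# Configure a Databricks profile; prompts for workspace host and token.
databricks configure --token --profile my-db-profile

# Configure AWS credentials; prompts for access key, secret key, and region.
aws configure
```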
Running

Command to run:

```shell
# Note that aws_key_id_name and aws_key_secret_name are not the actual keys.
spark_rapids_user_tools databricks-aws qualification \
  --profile=<my-db-profile> \
  --cpu_cluster=<my-cluster_name/cpu_file_properties> \
  --gpu_cluster=<my-cluster_name/gpu_file_properties> \
  --remote_folder=<S3://> \
  --eventlogs <S3://>,<S3://>,<S3://>
```
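Before and after a run it can help to confirm that the S3 paths passed to `--remote_folder` and `--eventlogs` are reachable. The sketch below (placeholder bucket paths, not from the tool itself) splits the comma-separated eventlog list and prints one `aws s3 ls` check per path as a dry run:

```shell
# Placeholder locations; substitute real bucket paths.
REMOTE_FOLDER="s3://my-bucket/qual-output"
EVENTLOGS="s3://my-bucket/logs/app-1,s3://my-bucket/logs/app-2"

# Split the comma-separated eventlog list and emit one
# "aws s3 ls" check per path (echoed here as a dry run).
for log in $(echo "$EVENTLOGS" | tr ',' ' '); do
  echo "aws s3 ls $log"
done

# After the run, the qualification output lands under the remote folder.
echo "aws s3 ls --recursive $REMOTE_FOLDER"
```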
Required options:
- cpu_cluster
- eventlogs: required if the S3 path cannot be pulled from the cluster properties
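Since `--cpu_cluster` accepts either a cluster name or a cluster-properties file, one way to produce such a file is to dump the cluster definition with the legacy Databricks CLI. This is a sketch; the cluster ID and output filename are placeholders.

```shell
# Dump the CPU cluster's properties to a JSON file
# (0123-456789-abcdefgh is a placeholder cluster ID).
databricks clusters get --cluster-id 0123-456789-abcdefgh \
  --profile my-db-profile > cpu_file_properties.json
```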
The wrapper will download the following dependencies: spark-jars, aws-sdk-jar, and java-sdk-aws-bundle-jar.
Final proposal for the first iteration
Assumptions
- No DBFS paths