Enhancing user experience for humaneval benchmark #65
cbda7cf to ca9a6b2
human-eval-benchmark/Readme.md
Outdated
@@ -21,6 +21,11 @@ Towards the end of the script B the user is prompted to fill the run duration, w

## 2. Automated Job

In this approach we already have a combined script named `script.py`, and a Docker file which is used to create this docker image `quay.io/kruizehub/human-eval-deployment` which is used in the `job.yaml` file.
To run the benchmark in an automated way the user simply needs to login to the relevent Openshift AI cluster, create a namespace or you can use the default namespace. Set your desired environment variable in `job.yaml`, you have number of prompts or duration to choose from. If num_prompts and duration_in_seconds both are set num_prompts has a higher precedence. Apply `pcv.yaml` followed by applying `job.yaml`. This would deploy the humaneval benchamrk in the specified namespace.
Typo: `pcv.yaml` should be `pvc.yaml`.
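The precedence rule described in the README text above can be sketched as follows. This is only an illustration: `resolve_run_limit` is a hypothetical helper, and how `script.py` actually reads `num_prompts` and `duration_in_seconds` is not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decides which limit a benchmark run should honor.
# Assumption: the two settings are passed in as positional arguments;
# the names num_prompts and duration_in_seconds come from the README text.
resolve_run_limit() {
  local num_prompts="${1:-}"
  local duration_in_seconds="${2:-}"

  if [ -n "$num_prompts" ]; then
    # num_prompts wins whenever it is set, even if a duration is also given.
    echo "num_prompts=${num_prompts}"
  elif [ -n "$duration_in_seconds" ]; then
    echo "duration_in_seconds=${duration_in_seconds}"
  else
    echo "error: set num_prompts or duration_in_seconds" >&2
    return 1
  fi
}

resolve_run_limit 500 3600   # prints: num_prompts=500
resolve_run_limit "" 3600    # prints: duration_in_seconds=3600
```

Failing loudly when neither value is set is a design choice of this sketch, not something the README states.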
JOB_YAML=./manifests/job.yaml
JOB_NAME=human-eval-deployment-job
NAMESPACE=default
Can you provide an option to pass the namespace as a parameter? By default, the namespace can be 'default'.
@kusumachalasani added it!
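For readers following the thread, one way such a namespace parameter could look is sketched below. It is not the code that was added in this PR: `resolve_namespace` is a hypothetical helper, and the script only prints the `oc` command it would run instead of executing it.

```shell
#!/usr/bin/env bash
# Sketch only: the real run script's argument handling is not shown here.
JOB_YAML=./manifests/job.yaml
JOB_NAME=human-eval-deployment-job

# Hypothetical helper: use the first argument as the namespace,
# falling back to 'default' when no argument is given.
resolve_namespace() {
  echo "${1:-default}"
}

NAMESPACE="$(resolve_namespace "$@")"

# Print the command rather than running it, so the sketch has no side effects.
echo "oc apply -f ${JOB_YAML} -n ${NAMESPACE}   # job: ${JOB_NAME}"
```

Saved under a hypothetical name such as `run.sh`, calling `./run.sh my-namespace` would target `my-namespace`, while `./run.sh` with no arguments would keep the `default` namespace.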
490bd3f to d59c30f
60d5dfd to 0a73399
lgtm!
6ec5f96 to bad7ce5
This PR adds a cleanup script and a run script, and lets the user specify a duration for how long the workload should keep running. This has been tested on the NERC cluster.
This is what the user experience would look like!