Commit: Remove excessive experiment configs, add example hyperparameter table and a script, update readme
Anastasiia Grishina authored and committed on Jan 10, 2024
1 parent 67ab423 · commit 20dd6a1
Showing 16 changed files with 139 additions and 8,455 deletions.
@@ -30,6 +30,7 @@ solutions
# Slurm
scripts/
create_scripts.py
config/
config/bf_experiments/

poetry.lock
@@ -14,33 +14,68 @@ help(develop)

The experiments reported in [the blog post](https://vadim.me/posts/unreasonable) and in the upcoming paper are contained in the `benchmark.py` file. When you run this file, the AI-generated programs are committed to a dedicated GitHub repository, while the metrics (i.e. how many tests every program passes) are logged to your [Weights and Biases](https://wandb.ai) account.

### Set up Weights and Biases
### Prerequisites
#### Set up Weights and Biases

1. Create an account on [Weights and Biases](https://wandb.ai)
2. Install the [Weights and Biases](https://docs.wandb.com/library/install) library
3. Run `wandb login` and follow the instructions

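A minimal shell sketch of steps 2 and 3 above, assuming `pip` is available in the Python environment you intend to use (the environment setup itself is not prescribed here):

```bash
# Install the wandb client library.
pip install wandb
# Authenticate once per machine; prompts for the API key from the wandb website.
wandb login
```
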
### Set up a github repository
#### Set up a GitHub repository

1. Go to [GitHub](https://github.com) and log in to the account that's going to push AI-generated code. Remember the $username and $email for that account.
2. Go [here](https://github.com/settings/tokens?type=beta) and generate an access $token
3. Set `GITHUB_USER` to "Bot" or whatever the name of the committer shall be
4. Set `GITHUB_EMAIL` to $email
5. Set `GITHUB_REMOTE` to https://$username:$token@github.com/$repo
3. Set `GIT_USER` to "Bot" or whatever the name of the committer shall be
4. Set `GIT_EMAIL` to $email
5. Set `GIT_REMOTE` to https://$username:$token@github.com/$repo

Don't be fooled by the variable names, you can of course use a non-github git hosting.
Note that you can use a non-GitHub git hosting.

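A quick way to sanity-check the token and remote before letting the benchmark push anything (a sketch; `$username`, `$token`, and `$repo` are the placeholders from the steps above):

```bash
# List the remote's refs; this succeeds only if the token grants access to the repository.
git ls-remote "https://$username:$token@github.com/$repo"
```
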
### Set up OpenAI access
#### Set up OpenAI access

It's 2022 and the language model inference happens in the cloud.
You are going to need an OpenAI account with access to `code-davinci-001` and `code-davinci-edit-001`
An OpenAI account with access to `gpt-3.5-turbo` is needed.
Set the `OPENAI_API_KEY` environment variable to your access token.

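A minimal way to confirm the key works, assuming `curl` is installed (this uses the standard OpenAI Chat Completions endpoint, nothing specific to this repository):

```bash
# Send a tiny request to gpt-3.5-turbo; a JSON completion (rather than an auth error) means the key is valid.
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "ping"}], "max_tokens": 1}'
```
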
Example `.config` file layout:
```bash
# Github
export GIT_REMOTE=https://USERNAME:TOKEN@github.com/SOLUTIONS_REPO
export GIT_USER=...
export GIT_EMAIL=...

# Data
export DATA_PATH=...

# OpenAI
export OPENAI_API_KEY=...
export OPENAI_ORG=...

# WandB
export WANDB_ENTITY=...
export WANDB_DIR=...
```

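The README does not spell out how `.config` is consumed; one plausible way (an assumption, not a documented interface) is to load it into the shell before launching an experiment:

```bash
# Export the variables into the current shell session, then run one experiment.
source .config
TASK_ID=1 python benchmark.py
```
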
### Run the experiments

If you're using [slurm](https://slurm.schedmd.com/), write a `run.sh` file with `python benchmark.py` and run it with `sbatch run.sh --array=0-191`.
If not, run `TASK_ID=n python benchmark.py` to re-run one of our 192 experiments exactly, or set the parameters yourself:
If you're using [Slurm](https://slurm.schedmd.com/), write a `run.sh` file with `python benchmark.py`
and run it with `sbatch --array=1-500 run.sh`.
If not, run `TASK_ID=n python benchmark.py` to re-run one of our experiments exactly, or set the parameters yourself:

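A minimal `run.sh` sketch, assuming that `benchmark.py` reads the `TASK_ID` environment variable as described above and that the cluster-specific lines (job name, resources, time limit) are adapted to your setup; the mapping from `SLURM_ARRAY_TASK_ID` to `TASK_ID` is an assumption, not something documented in this repository:

```bash
#!/bin/bash
#SBATCH --job-name=seidr
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=24:00:00

# Each array element re-runs one experiment; the array index selects which one.
source .config
TASK_ID=$SLURM_ARRAY_TASK_ID python benchmark.py
```

Submit it with `sbatch --array=1-500 run.sh`; `sbatch` options such as `--array` must come before the script name, otherwise they are passed to the script itself.
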
For example, for the basement problem in PSB2, run SEIDR without lexicase selection as follows:
```
python benchmark.py --branching-factor 200 --language C++ --problem fizz-buzz
python3 benchmark.py \
  --task_id 202 \
  --problem basement \
  --language C++ \
  --max_programs 100 \
  --drafts_per_prompt 2 \
  --explanations_per_program 2 \
  --repairs_per_explanation 2 \
  --beam_width 2 \
  --log INFO \
  --lexicase_selection False \
  --dataset psb2 \
  --model_name gpt-3.5-turbo
```

Example Slurm scripts are stored in `example_scripts/` and tables with hyperparameters in `/config`.