Commit: Remove excessive experiment configs, add example hyperparameter table and a script, update readme
Anastasiia Grishina authored and committed on Jan 10, 2024
1 parent 67ab423 · commit 20dd6a1
Showing 16 changed files with 139 additions and 8,455 deletions.
@@ -30,6 +30,7 @@ solutions
# Slurm
scripts/
create_scripts.py
config/
config/bf_experiments/

poetry.lock
@@ -14,33 +14,68 @@ help(develop)

The experiments reported in [the blog post](https://vadim.me/posts/unreasonable) and in the upcoming paper are contained in the `benchmark.py` file. When you run this file, the AI-generated programs are committed to a dedicated GitHub repository, while the metrics (i.e. how many tests every program passes) are logged to your [Weights and Biases](https://wandb.ai) account.

### Set up Weights and Biases
### Prerequisites
#### Set up Weights and Biases

1. Create an account on [Weights and Biases](https://wandb.ai)
2. Install the [Weights and Biases](https://docs.wandb.com/library/install) library
3. Run `wandb login` and follow the instructions

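A minimal shell sketch of steps 2 and 3 above, assuming `pip` is available in the Python environment you intend to use (the environment setup itself is not prescribed here):

```bash
# Install the wandb client library.
pip install wandb
# Authenticate once per machine; prompts for the API key from the wandb website.
wandb login
```
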
### Set up a github repository
#### Set up a GitHub repository

1. Go to [GitHub](https://github.com) and log in to the account that's going to push AI-generated code. Remember the $username and $email for that account.
2. Go [here](https://github.com/settings/tokens?type=beta) and generate an access $token
3. Set `GITHUB_USER` to "Bot" or whatever the name of the committer shall be
4. Set `GITHUB_EMAIL` to $email
5. Set `GITHUB_REMOTE` to https://$username:$token@github.com/$repo
3. Set `GIT_USER` to "Bot" or whatever the name of the committer shall be
4. Set `GIT_EMAIL` to $email
5. Set `GIT_REMOTE` to https://$username:$token@github.com/$repo

Don't be fooled by the variable names, you can of course use a non-github git hosting.
Note that you can use a non-GitHub git hosting.

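A quick way to sanity-check the token and remote before letting the benchmark push anything (a sketch; `$username`, `$token`, and `$repo` are the placeholders from the steps above):

```bash
# List the remote's refs; this succeeds only if the token grants access to the repository.
git ls-remote "https://$username:$token@github.com/$repo"
```
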
### Set up OpenAI access
#### Set up OpenAI access

It's 2022 and the language model inference happens in the cloud.
You are going to need an OpenAI account with access to `code-davinci-001` and `code-davinci-edit-001`
An OpenAI account with access to `gpt-3.5-turbo` is needed.
Set the `OPENAI_API_KEY` environment variable to your access token.

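A minimal way to confirm the key works, assuming `curl` is installed (this uses the standard OpenAI Chat Completions endpoint, nothing specific to this repository):

```bash
# Send a tiny request to gpt-3.5-turbo; a JSON completion (rather than an auth error) means the key is valid.
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "ping"}], "max_tokens": 1}'
```
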
Example `.config` file layout:
```bash
# Github
export GIT_REMOTE=https://USERNAME:TOKEN@github.com/SOLUTIONS_REPO
export GIT_USER=...
export GIT_EMAIL=...

# Data
export DATA_PATH=...

# OpenAI
export OPENAI_API_KEY=...
export OPENAI_ORG=...

# WandB
export WANDB_ENTITY=...
export WANDB_DIR=...
```

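The README does not spell out how `.config` is consumed; one plausible way (an assumption, not a documented interface) is to load it into the shell before launching an experiment:

```bash
# Export the variables into the current shell session, then run one experiment.
source .config
TASK_ID=1 python benchmark.py
```
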
### Run the experiments

If you're using [slurm](https://slurm.schedmd.com/), write a `run.sh` file with `python benchmark.py` and run it with `sbatch run.sh --array=0-191`.
If not, run `TASK_ID=n python benchmark.py` to re-run one of our 192 experiments exactly, or set the parameters yourself:
If you're using [Slurm](https://slurm.schedmd.com/), write a `run.sh` file with `python benchmark.py`
and run it with `sbatch --array=1-500 run.sh`.
If not, run `TASK_ID=n python benchmark.py` to re-run one of our experiments exactly, or set the parameters yourself:

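A minimal `run.sh` sketch, assuming that `benchmark.py` reads the `TASK_ID` environment variable as described above and that the cluster-specific lines (job name, resources, time limit) are adapted to your setup; the mapping from `SLURM_ARRAY_TASK_ID` to `TASK_ID` is an assumption, not something documented in this repository:

```bash
#!/bin/bash
#SBATCH --job-name=seidr
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=24:00:00

# Each array element re-runs one experiment; the array index selects which one.
source .config
TASK_ID=$SLURM_ARRAY_TASK_ID python benchmark.py
```

Submit it with `sbatch --array=1-500 run.sh`; `sbatch` options such as `--array` must come before the script name, otherwise they are passed to the script itself.
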
For example, for the basement problem in PSB2, run SEIDR without lexicase selection as follows:
```
python benchmark.py --branching-factor 200 --language C++ --problem fizz-buzz
python3 benchmark.py \
  --task_id 202 \
  --problem basement \
  --language C++ \
  --max_programs 100 \
  --drafts_per_prompt 2 \
  --explanations_per_program 2 \
  --repairs_per_explanation 2 \
  --beam_width 2 \
  --log INFO \
  --lexicase_selection False \
  --dataset psb2 \
  --model_name gpt-3.5-turbo
```

Example Slurm scripts are stored in `example_scripts/` and tables with hyperparameters in `/config`.