Commit

Separate BenchBuilder requirements
BabyChouSr committed Sep 2, 2024
1 parent 869283e commit 451affa
Showing 3 changed files with 19 additions and 3 deletions.
6 changes: 6 additions & 0 deletions BenchBuilder/README.md
@@ -7,6 +7,12 @@ Checkout our [paper](https://arxiv.org/abs/2406.11939) for more details.

BenchBuilder employs a two stage pipeline.

First, install the BenchBuilder dependencies:
```console
cd BenchBuilder
pip install -r requirements.txt
```

Step 1: annotate the prompt using GPT-3.5-Turbo and filter prompts which either have a score < 5 or belong to a topic cluster with a mean score < 3. This serves as a cheap and first pass through to remove any low quality prompts and clusters before further curation.

Step 2: use GPT-4-Turbo to annotate the remaining prompts, then extract prompts with quality score of >= 6 and belong to a topic cluster with mean quality score >= 6, ensuring only high-quality prompts are selected with minimal false positives.
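
The two filtering steps above can be sketched in plain Python. This is a minimal illustration, not the BenchBuilder implementation: the row layout (`prompt`, `cluster`, `score`) and the function names are hypothetical, and it assumes each cluster's mean is computed over the rows passed into that stage. The LLM annotation calls are omitted; only the thresholds come from the README text.

```python
from collections import defaultdict
from statistics import mean


def cluster_means(rows):
    """Map each topic cluster to the mean quality score of its prompts."""
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row["cluster"]].append(row["score"])
    return {cluster: mean(scores) for cluster, scores in by_cluster.items()}


def keep_rows(rows, min_score, min_cluster_mean):
    """Keep rows whose own score and whose cluster mean both meet the thresholds."""
    means = cluster_means(rows)
    return [
        row
        for row in rows
        if row["score"] >= min_score and means[row["cluster"]] >= min_cluster_mean
    ]


def stage_one(rows):
    # Step 1: drop prompts with score < 5 or in a cluster with mean score < 3.
    # Scores here stand in for the GPT-3.5-Turbo annotations.
    return keep_rows(rows, min_score=5, min_cluster_mean=3)


def stage_two(rows):
    # Step 2: keep prompts with score >= 6 in clusters with mean score >= 6.
    # Scores here stand in for the GPT-4-Turbo annotations of the survivors.
    return keep_rows(rows, min_score=6, min_cluster_mean=6)
```

In use, `stage_one` would run on the cheaply annotated prompts, the survivors would be re-annotated, and `stage_two` would run on those new scores.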
12 changes: 12 additions & 0 deletions BenchBuilder/requirements.txt
@@ -0,0 +1,12 @@
tiktoken
openai
numpy
pandas
shortuuid
tqdm
gradio==3.40.0
httpx==0.25.2
plotly
scikit-learn
bertopic[spacy]
torch
4 changes: 1 addition & 3 deletions requirements.txt
@@ -7,6 +7,4 @@ tqdm
gradio==3.40.0
httpx==0.25.2
plotly
-scikit-learn
-bertopic[spacy]
-torch
+scikit-learn
