Commit

remove random select in tooleval
pooruss committed Nov 17, 2023
1 parent b062b2c commit fbad8a0
Showing 5 changed files with 7 additions and 10 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -577,7 +577,7 @@ python eval_pass_rate.py \
     --reference_model ${CANDIDATE_MODEL} \
     --test_ids ../../data/test_ids/ \
     --max_eval_threads 20 \
-    --evaluate_times 4
+    --evaluate_times 7

 ```
 The result files will be stored under the ${SAVE_PATH}.
@@ -600,7 +600,7 @@ python eval_preference.py \
     --pass_rate_result_path ${PASS_TARE_PATH} \
     --max_eval_threads 20 \
     --use_pass_rate true \
-    --evaluate_times 4
+    --evaluate_times 7
 ```
 The result files will be stored under the ${SAVE_PATH}.

4 changes: 2 additions & 2 deletions README_ZH.md
@@ -585,7 +585,7 @@ python eval_pass_rate.py \
     --reference_model ${CANDIDATE_MODEL} \
     --test_ids ../../data/test_query_ids/ \
     --max_eval_threads 20 \
-    --evaluate_times 4
+    --evaluate_times 7

 ```

@@ -609,7 +609,7 @@ python eval_preference.py \
     --pass_rate_result_path ${PASS_TARE_PATH} \
     --max_eval_threads 20 \
     --use_pass_rate true \
-    --evaluate_times 4
+    --evaluate_times 7
 ```

结果文件会被存储至${SAVE_PATH}中。
5 changes: 1 addition & 4 deletions toolbench/tooleval/eval_pass_rate.py
@@ -170,11 +170,8 @@ def compute_pass_rate(query_id, example):
     write_results(filename, reference_model, label_cnt)
     pass_rate = 0
     for query_id in label_cnt:
-        if label_cnt[query_id]["failed"] < label_cnt[query_id]["passed"]:
+        if label_cnt[query_id]["failed"] <= label_cnt[query_id]["passed"]:
             pass_rate += 1
-        elif label_cnt[query_id]["failed"] == label_cnt[query_id]["passed"]:
-            if random.random() < 0.5:
-                pass_rate += 1
     pass_rate /= len(label_cnt)
     print(f"Test set: {test_set}. Model: {reference_model}. Pass rate: {str(pass_rate)}")

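Note on the eval_pass_rate.py hunk: the change replaces the random coin flip on ties with a deterministic rule, so a query counts as passed whenever its "failed" count does not exceed its "passed" count. The sketch below restates that rule as a standalone function. The function name `majority_pass_rate` and the sample counts are hypothetical, and the `label_cnt` layout (query id mapped to "passed"/"failed" counts) is assumed from the lines shown in the diff rather than taken from the rest of the file.

```python
from typing import Dict


def majority_pass_rate(label_cnt: Dict[str, Dict[str, int]]) -> float:
    """Return the fraction of queries whose "passed" votes are at least
    as many as their "failed" votes (ties count as a pass).

    Hypothetical helper: it mirrors the comparison in the diff above but
    is not part of the repository.
    """
    if not label_cnt:
        raise ValueError("label_cnt must contain at least one query")
    passed_queries = sum(
        1
        for counts in label_cnt.values()
        if counts["failed"] <= counts["passed"]
    )
    return passed_queries / len(label_cnt)


# Made-up vote counts, for illustration only.
sample_label_cnt = {
    "q1": {"passed": 5, "failed": 2},  # clear pass
    "q2": {"passed": 3, "failed": 3},  # tie: counted as a pass, no randomness
    "q3": {"passed": 1, "failed": 6},  # clear fail
}

print(f"Pass rate: {majority_pass_rate(sample_label_cnt)}")
```

Because the rule no longer calls `random.random()`, repeated runs over the same counts give the same pass rate.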
2 changes: 1 addition & 1 deletion toolbench/tooleval/run_pass_rate.sh
@@ -9,4 +9,4 @@ python eval_pass_rate.py \
     --reference_model ${CANDIDATE_MODEL} \
     --test_ids ../../data/test_query_ids/ \
     --max_eval_threads 20 \
-    --evaluate_times 4
+    --evaluate_times 7
2 changes: 1 addition & 1 deletion toolbench/tooleval/run_preference.sh
@@ -14,4 +14,4 @@ python eval_preference.py \
     --pass_rate_result_path ${PASS_TARE_PATH} \
     --max_eval_threads 20 \
     --use_pass_rate true \
-    --evaluate_times 4
+    --evaluate_times 7
