From 0a8de5ec9dba76a130449222eb5634b5c37a0ad7 Mon Sep 17 00:00:00 2001
From: Praveen Selvaraj
Date: Mon, 23 Dec 2024 01:34:54 +0000
Subject: [PATCH] Update tutorial.qmd

---
 docs/tutorial.qmd | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/tutorial.qmd b/docs/tutorial.qmd
index 2922b0125..48995546d 100644
--- a/docs/tutorial.qmd
+++ b/docs/tutorial.qmd
@@ -78,7 +78,7 @@ inspect eval security_guide.py
 
 [HellaSwag](https://rowanzellers.com/hellaswag/) is a dataset designed to test commonsense natural language inference (NLI) about physical situations. It includes samples that are adversarially constructed to violate common sense about the physical world, so can be a challenge for some language models.
 
-For example, here is one of the questions in the dataset along with its set of possible answer (the correct answer is C):
+For example, here is one of the questions in the dataset along with its set of possible answers (the correct answer is C):
 
 > In home pet groomers demonstrate how to groom a pet. the person
 >
@@ -570,4 +570,4 @@ def ctf_agent(max_attempts=3, message_limit=30):
 
 The `basic_agent()` provides a ReAct tool loop with support for retries and encouraging the model to continue if its gives up or gets stuck. The `bash()` and `python()` tools are provided to the model with a 3-minute timeout to prevent long running commands from getting the evaluation stuck.
 
-See the [full source code](https://github.com/UKGovernmentBEIS/inspect_evals/tree/main/src/inspect_evals/gdm_capabilities/intercode_ctf) of the Intercode CTF example to explore the dataset and evaluation code in more depth.
\ No newline at end of file
+See the [full source code](https://github.com/UKGovernmentBEIS/inspect_evals/tree/main/src/inspect_evals/gdm_capabilities/intercode_ctf) of the Intercode CTF example to explore the dataset and evaluation code in more depth.