Built site for gh-pages
jjallaire committed Sep 13, 2024
1 parent 1804e8b commit 2b8c74c
Showing 27 changed files with 328 additions and 235 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
Original file line number Diff line number Diff line change
@@ -1 +1 @@
6239bfc1
ef1b3979
4 changes: 2 additions & 2 deletions agents-api.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Agents API</title>
<title>Agents API – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
10 changes: 5 additions & 5 deletions agents.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Agents</title>
<title>Agents – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -442,7 +442,7 @@ <h3 class="anchored" data-anchor-id="example">Example</h3>
<section id="options" class="level3">
<h3 class="anchored" data-anchor-id="options">Options</h3>
<p>There are several options available for customising the behaviour of the basic agent:</p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 23%">
<col style="width: 20%">
@@ -830,7 +830,7 @@ <h3 class="anchored" data-anchor-id="environment-interface">Environment Interfac
<section id="environment-binding" class="level3">
<h3 class="anchored" data-anchor-id="environment-binding">Environment Binding</h3>
<p>There are two sandbox environments built in to Inspect:</p>
<table class="table">
<table class="caption-top table">
<thead>
<tr class="header">
<th>Environment Type</th>
@@ -889,7 +889,7 @@ <h3 class="anchored" data-anchor-id="sec-docker-configuration">Docker Configurat
<p>Before using Docker sandbox environments, please be sure to install <a href="https://docs.docker.com/engine/install/">Docker Engine</a> (version 24.0.7 or greater).</p>
<p>You can use the Docker sandbox environment without any special configuration, however most commonly you’ll provide explicit configuration via either a <code>Dockerfile</code> or a <a href="https://docs.docker.com/compose/compose-file/">Docker Compose</a> configuration file (<code>compose.yaml</code>).</p>
<p>Here is how Docker sandbox environments are created based on the presence of <code>Dockerfile</code> and/or <code>compose.yml</code> in the task directory:</p>
<table class="table">
<table class="caption-top table">
<thead>
<tr class="header">
<th>Config Files</th>
6 changes: 3 additions & 3 deletions caching.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Caching</title>
<title>Caching – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -449,7 +449,7 @@ <h3 class="anchored" data-anchor-id="usage-reporting">Usage Reporting</h3>
<p>When using provider caching, model token usage will be reported with 4 distinct values rather than the normal input and output. For example:</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode default code-with-copy"><code class="sourceCode default"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>13,684 tokens [I: 22, CW: 1,711, CR: 11,442, O: 509]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Where the prefixes on reported token counts stand for:</p>
<table class="table">
<table class="caption-top table">
<tbody>
<tr class="odd">
<td><strong>I</strong></td>
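The usage line shown in the caching.html hunk can be sanity-checked with a short sketch. This is only an illustration: `parse_usage` is a hypothetical helper, not part of Inspect, and it assumes the report always has the shape `TOTAL tokens [PREFIX: COUNT, ...]` with commas as thousands separators.

```python
import re


def parse_usage(line: str) -> tuple[int, dict[str, int]]:
    """Split a usage report line into its total and per-prefix counts.

    Hypothetical helper for illustration; the line format is assumed
    from the single sample shown in the docs.
    """
    total_text, _, rest = line.partition(" tokens ")
    total = int(total_text.replace(",", ""))
    # Each entry looks like "CW: 1,711": an uppercase prefix, then a count.
    pairs = re.findall(r"([A-Z]+): (\d+(?:,\d{3})*)", rest)
    counts = {prefix: int(value.replace(",", "")) for prefix, value in pairs}
    return total, counts


total, counts = parse_usage("13,684 tokens [I: 22, CW: 1,711, CR: 11,442, O: 509]")
print(total, counts)
```

For this sample the four prefixed counts add up to the reported total, which is a quick way to confirm the line was parsed correctly.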
10 changes: 5 additions & 5 deletions datasets.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Datasets</title>
<title>Datasets – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -347,7 +347,7 @@ <h2 class="anchored" data-anchor-id="overview">Overview</h2>
<h2 class="anchored" data-anchor-id="dataset-samples">Dataset Samples</h2>
<p>The core data type underlying the use of datasets with Inspect is the <code>Sample</code>, which consists of a required <code>input</code> field and several other optional fields:</p>
<p><strong>Class</strong> <code>inspect_ai.dataset.Sample</code></p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 20%">
<col style="width: 40%">
@@ -404,7 +404,7 @@ <h2 class="anchored" data-anchor-id="dataset-samples">Dataset Samples</h2>
</tbody>
</table>
<p>So a CSV dataset with the following structure:</p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 56%">
<col style="width: 43%">
@@ -512,7 +512,7 @@ <h2 class="anchored" data-anchor-id="amazon-s3">Amazon S3</h2>
<h2 class="anchored" data-anchor-id="chat-messages">Chat Messages</h2>
<p>The most important data structure within <code>Sample</code> is the <code>ChatMessage</code>. Note that often datasets will contain a simple string as their input (which is then internally converted to a <code>ChatMessageUser</code>). However, it is possible to include a full message history as the input via <code>ChatMessage</code>. Another useful application of <code>ChatMessage</code> is providing multi-modal input (e.g.&nbsp;images).</p>
<p><strong>Class</strong> <code>inspect_ai.model.ChatMessage</code></p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 10%">
<col style="width: 35%">
6 changes: 3 additions & 3 deletions errors-and-limits.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Errors and Limits</title>
<title>Errors and Limits – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -371,7 +371,7 @@ <h2 class="anchored" data-anchor-id="failure-threshold">Failure Threshold</h2>
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a> )</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Failed samples are <em>not scored</em> and a warning indicating that some samples failed is both printed in the terminal and shown in Inspect View when this occurs.</p>
<p>You can specify <code>fail_on_error</code> as a boolean (turning the behaviour on and off entirely), as a number between 0 and 1 (indicating a proportion of failures to tolerate), or a number greater than 1 (indicating a count of failures to tolerate):</p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 40%">
<col style="width: 60%">
59 changes: 33 additions & 26 deletions eval-logs.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Eval Logs</title>
<title>Eval Logs – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -378,7 +378,7 @@ <h2 class="anchored" data-anchor-id="log-location">Log Location</h2>
<h2 class="anchored" data-anchor-id="evallog">EvalLog</h2>
<p>The <code>EvalLog</code> object returned from <code>eval()</code> provides a programmatic interface to the contents of log files:</p>
<p><strong>Class</strong> <code>inspect_ai.log.EvalLog</code></p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 27%">
@@ -440,7 +440,7 @@ <h2 class="anchored" data-anchor-id="evallog">EvalLog</h2>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a> ...</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>In the section below we’ll talk more about how to deal with logs from failed evaluations (e.g.&nbsp;retrying the eval).</p>
<p>You can enumerate, read, and write <code>EvalLog</code> objects using the following helper functions from the <code>inspect_ai.log</code> module:</p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 43%">
<col style="width: 56%">
@@ -1102,30 +1102,37 @@ <h3 class="anchored" data-anchor-id="reading-logs">Reading Logs</h3>
</div>
</div>
</footer>
<script>var lightboxQuarto = GLightbox({"openEffect":"zoom","descPosition":"bottom","closeEffect":"zoom","loop":false,"selector":".lightbox"});
window.onload = () => {
lightboxQuarto.on('slide_before_load', (data) => {
const { slideIndex, slideNode, slideConfig, player, trigger } = data;
const href = trigger.getAttribute('href');
if (href !== null) {
const imgEl = window.document.querySelector(`a[href="${href}"] img`);
if (imgEl !== null) {
const srcAttr = imgEl.getAttribute("src");
if (srcAttr && srcAttr.startsWith("data:")) {
slideConfig.href = srcAttr;
<script>var lightboxQuarto = GLightbox({"selector":".lightbox","openEffect":"zoom","loop":false,"descPosition":"bottom","closeEffect":"zoom"});
(function() {
let previousOnload = window.onload;
window.onload = () => {
if (previousOnload) {
previousOnload();
}
lightboxQuarto.on('slide_before_load', (data) => {
const { slideIndex, slideNode, slideConfig, player, trigger } = data;
const href = trigger.getAttribute('href');
if (href !== null) {
const imgEl = window.document.querySelector(`a[href="${href}"] img`);
if (imgEl !== null) {
const srcAttr = imgEl.getAttribute("src");
if (srcAttr && srcAttr.startsWith("data:")) {
slideConfig.href = srcAttr;
}
}
}
});

lightboxQuarto.on('slide_after_load', (data) => {
const { slideIndex, slideNode, slideConfig, player, trigger } = data;
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(slideNode);
}
}
});

lightboxQuarto.on('slide_after_load', (data) => {
const { slideIndex, slideNode, slideConfig, player, trigger } = data;
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(slideNode);
}
});

};
});

};

})();
</script>


6 changes: 3 additions & 3 deletions eval-sets.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Eval Sets</title>
<title>Eval Sets – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -400,7 +400,7 @@ <h3 class="anchored" data-anchor-id="dynamic-tasks">Dynamic Tasks</h3>
<section id="options" class="level3">
<h3 class="anchored" data-anchor-id="options">Options</h3>
<p>There are a number of options that control the retry behaviour of eval sets:</p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 40%">
<col style="width: 60%">
4 changes: 2 additions & 2 deletions examples/index.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Examples</title>
<title>Examples – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
8 changes: 4 additions & 4 deletions extensions.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Inspect – Extensions</title>
<title>Extensions – Inspect</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -429,7 +429,7 @@ <h3 class="anchored" data-anchor-id="model-usage">Model Usage</h3>
<section id="sec-sandbox-environment-extensions" class="level2">
<h2 class="anchored" data-anchor-id="sec-sandbox-environment-extensions">Sandbox Environments</h2>
<p><a href="agents.html#sec-sandbox-environments">Sandbox Environments</a> provide a mechanism for sandboxing execution of tool code as well as providing more sophisticated infrastructure (e.g.&nbsp;creating network hosts for a cybersecurity eval). Inspect comes with two sandbox environments built in:</p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 38%">
<col style="width: 61%">
@@ -497,7 +497,7 @@ <h2 class="anchored" data-anchor-id="sec-sandbox-environment-extensions">Sandbox
<span id="cb7-42"><a href="#cb7-42" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-43"><a href="#cb7-43" aria-hidden="true" tabindex="-1"></a> <span class="co"># (instance methods shown below)</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>The class methods take care of various stages of initialisation, setup, and teardown:</p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 26%">
<col style="width: 26%">
Binary file removed images/toolenv-no-cleanup.png
Binary file not shown.
55 changes: 31 additions & 24 deletions index.html
@@ -2,7 +2,7 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.5.32">
<meta name="generator" content="quarto-1.5.57">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">

@@ -395,7 +395,7 @@ <h2 class="anchored" data-anchor-id="sec-hello-inspect">Hello, Inspect</h2>
<li><p><strong>Scorers</strong> evaluate the final output of solvers. They may use text comparisons, model grading, or other custom schemes</p></li>
</ol>
<p>Let’s take a look at a simple evaluation that aims to see how models perform on the <a href="https://en.wikipedia.org/wiki/Sally%E2%80%93Anne_test">Sally-Anne</a> test, which assesses the ability of a person to infer false beliefs in others. Here are some samples from the dataset:</p>
<table class="table">
<table class="caption-top table">
<colgroup>
<col style="width: 62%">
<col style="width: 37%">
@@ -1026,30 +1026,37 @@ <h2 class="anchored" data-anchor-id="learning-more">Learning More</h2>
</div>
</div>
</footer>
<script>var lightboxQuarto = GLightbox({"loop":false,"descPosition":"bottom","selector":".lightbox","closeEffect":"zoom","openEffect":"zoom"});
window.onload = () => {
lightboxQuarto.on('slide_before_load', (data) => {
const { slideIndex, slideNode, slideConfig, player, trigger } = data;
const href = trigger.getAttribute('href');
if (href !== null) {
const imgEl = window.document.querySelector(`a[href="${href}"] img`);
if (imgEl !== null) {
const srcAttr = imgEl.getAttribute("src");
if (srcAttr && srcAttr.startsWith("data:")) {
slideConfig.href = srcAttr;
<script>var lightboxQuarto = GLightbox({"selector":".lightbox","closeEffect":"zoom","loop":false,"descPosition":"bottom","openEffect":"zoom"});
(function() {
let previousOnload = window.onload;
window.onload = () => {
if (previousOnload) {
previousOnload();
}
lightboxQuarto.on('slide_before_load', (data) => {
const { slideIndex, slideNode, slideConfig, player, trigger } = data;
const href = trigger.getAttribute('href');
if (href !== null) {
const imgEl = window.document.querySelector(`a[href="${href}"] img`);
if (imgEl !== null) {
const srcAttr = imgEl.getAttribute("src");
if (srcAttr && srcAttr.startsWith("data:")) {
slideConfig.href = srcAttr;
}
}
}
});

lightboxQuarto.on('slide_after_load', (data) => {
const { slideIndex, slideNode, slideConfig, player, trigger } = data;
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(slideNode);
}
}
});

lightboxQuarto.on('slide_after_load', (data) => {
const { slideIndex, slideNode, slideConfig, player, trigger } = data;
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(slideNode);
}
});

};
});

};

})();
</script>

