action tracing for anomaly detection #1038

dragonstyle · 2024-12-22T14:24:21Z

This PR introduces a new trace logging mode intended to detect actions which do not complete (and thus could be the cause of a run hanging). We'll be making significant enhancements to this soon (e.g. incorporating it into the task UI), but the basic version implemented here is already useful as a diagnostic tool.

There are several things which may cause a task to not complete:

A generate() call to a model does not return.
A call to subprocess() without a timeout does not return.
An interaction with Docker containers or the Docker daemon does not terminate.
Writing to remote storage (e.g. S3) does not return.
A tool call does not return (some sort of infinite loop).
A subtask does not complete (again, some sort of infinite loop).
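As an illustration of the second item, here is a minimal sketch of bounding a subprocess call with a timeout. It uses the standard library's `subprocess.run` rather than Inspect's own `subprocess()` helper, and `run_with_timeout` is a hypothetical name chosen for this example:

```python
import subprocess
import sys
from typing import Optional


def run_with_timeout(
    args: list[str], timeout_s: float
) -> Optional[subprocess.CompletedProcess]:
    """Run a command, returning None if it exceeds timeout_s seconds."""
    try:
        return subprocess.run(
            args, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising,
        # so nothing is left running in the background.
        return None


if __name__ == "__main__":
    # A child that sleeps longer than the timeout is cut off.
    result = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5
    )
    print("timed out" if result is None else "completed")
```

Without the `timeout` argument the same call would block for as long as the child process runs, which is the kind of hang the trace log is meant to surface.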

Trace logs aim to detect these by writing logs for every evaluation that record the beginning and end of each action (as well as cancellations and errors). Trace logs for the last 10 evaluations are preserved.
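The enter/exit recording described above can be sketched with a context manager. This is not the implementation in this PR: `traced_action`, the record fields (`action_id`, `action`, `detail`, `event`, `time`), and the choice to treat `TimeoutError` as a cancellation are all assumptions made for illustration.

```python
import asyncio
import io
import json
import time
import uuid
from contextlib import contextmanager
from typing import IO, Iterator


@contextmanager
def traced_action(log: IO[str], action: str, detail: str = "") -> Iterator[str]:
    """Write an 'enter' record, then 'exit', 'cancel', or 'error' for a block."""
    action_id = uuid.uuid4().hex  # unique id ties the records together

    def write(event: str, **extra: object) -> None:
        record = {"action_id": action_id, "action": action, "detail": detail,
                  "event": event, "time": time.time(), **extra}
        log.write(json.dumps(record) + "\n")
        log.flush()  # flush so a hung process still leaves its 'enter' behind

    write("enter")
    try:
        yield action_id
    except (asyncio.CancelledError, TimeoutError):
        write("cancel")  # assumption: timeouts are recorded as cancellations
        raise
    except Exception as ex:
        write("error", error=repr(ex))
        raise
    else:
        write("exit")


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for a trace log file
    with traced_action(buffer, "generate", "hypothetical model call"):
        pass
    print(buffer.getvalue())
```

An action that hangs writes its `enter` record but never a matching end record, which is what makes it detectable after the fact.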

If an evaluation is running and is not terminating, you can execute the following command to list instances of the above actions (e.g. generate, docker, tool calls, etc.) that are still running:

inspect trace anomalies

You will first see the currently running actions (mostly useful for a "live" evaluation). If you have already cancelled an evaluation, you'll see a list of cancelled actions (with the most recently completed cancelled action on top), which will often also tell you which unterminated action was keeping the evaluation from completing.

There will also be a list of actions that ended in an error. Note that actions which are cancelled (e.g. due to timeouts) or have errors (e.g. due to model API errors) aren't necessarily a problem, as some timeouts and error conditions occur naturally in evaluations. They are shown for additional context but are not the central purpose of inspect trace anomalies (which is to detect actions that unexpectedly fail to terminate).
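The grouping into running, cancelled, and errored actions can be sketched as follows. The record schema here (an `action_id`, an `event` of `enter`/`exit`/`cancel`/`error`, and a numeric `time`) is an assumption for illustration and may not match the actual trace log format; `find_anomalies` is likewise a hypothetical helper, not the command's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Anomalies:
    running: list[dict] = field(default_factory=list)
    cancelled: list[dict] = field(default_factory=list)
    errored: list[dict] = field(default_factory=list)


def find_anomalies(records: list[dict]) -> Anomalies:
    """Group actions by how (or whether) they ended."""
    started: dict[str, dict] = {}
    result = Anomalies()
    for rec in records:
        if rec["event"] == "enter":
            started[rec["action_id"]] = rec
        else:
            # Any end record means the action is no longer running.
            started.pop(rec["action_id"], None)
            if rec["event"] == "cancel":
                result.cancelled.append(rec)
            elif rec["event"] == "error":
                result.errored.append(rec)
    # Entered but never ended: still running, or hung.
    result.running = list(started.values())
    # Most recently completed first, so the last item to finish is on top.
    result.cancelled.sort(key=lambda r: r["time"], reverse=True)
    result.errored.sort(key=lambda r: r["time"], reverse=True)
    return result
```

Sorting by completion time rather than start time puts the action that was cancelled last at the top, which is usually the one that was blocking the run.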

You can also look at anomalies for a specific trace log (as opposed to the most recent one) as follows:

inspect trace anomalies trace.log

If you want to explore the trace files directly, you can list their paths with:

inspect trace list # --json for JSON output

Trace logs are written in JSON Lines format (one JSON object per line). You can read them as a JSON array with:

inspect trace read trace.log
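For readers who want to process a trace file in their own scripts, the JSON Lines to JSON array conversion can be sketched in a few lines of Python. This is a generic illustration, not how `inspect trace read` is implemented; `jsonl_to_array` is a hypothetical helper and the sample records are made up.

```python
import json


def jsonl_to_array(text: str) -> list[dict]:
    """Parse JSON Lines text (one object per line) into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Made-up sample records; real trace records will have different fields.
    sample = '{"event": "enter"}\n{"event": "exit"}\n'
    print(json.dumps(jsonl_to_array(sample), indent=2))
```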

dragonstyle and others added 24 commits December 20, 2024 13:26

Add trace log level

cb5e0eb

add trace fix

Add persistent trace logging

cbbe6d1

Convert sandbox messages

1d1c2cd

Convert eval log file operations

97f835f

Convert model calls

7c603b8

Convert cache

01a508b

Give trace actions unique ids

6bdbfc2

Add simple sampe logging

d050aa5

Add simple trace to task init

bfc1cf2

Correct old log mapping

f10cad5

Correct trace level

243daea

Merge remote-tracking branch 'origin/main' into feature/trace

74ef411

fix typing error

40acf7f

tweaks

dedeb1b

revisiosn to trace logging

a7c4a35

trace log using json lines

ab659f1

pydantic for trace log

84244d2

anomolies

7ea5875

get trace file path

24544ec

Basic trace anomoly logic

76963cd

backstop for when solvers fail to handle their own TimeoutError

b9aa123

timeout for docker listing operations

3f1fd06

timeout on write file

1c3afd5

fix formatting

f1b6a0f

jjallaire changed the title ~~Feature/trace~~ action tracing for anomaly detection Dec 23, 2024

jjallaire and others added 5 commits December 23, 2024 11:18

correct spelling for anomolies

d60813e

Don’t require trace file name (use current if none provided)

dea06b0

Improve trace output

961ef22

- sort by completed time (not start time) - display errors - display duration

fix formatting errors

cba0904

sort descending so last finished item is at the top

7a51490

jjallaire added 2 commits December 23, 2024 11:42

Update CHANGELOG.md

d2b821d

Merge branch 'main' into feature/trace

6d84d40

jjallaire self-requested a review December 23, 2024 16:42

jjallaire approved these changes Dec 23, 2024

jjallaire merged commit 36b2a7f into main Dec 23, 2024
9 checks passed

jjallaire deleted the feature/trace branch December 23, 2024 16:46
