Requirements
- Shinkai API running on http://localhost:9950 (or overridden by the SHINKAI_API_URL environment variable)
- Ollama running on http://localhost:11434 (or overridden by the OLLAMA_API_URL environment variable)
- Deno 2.x
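
The default-plus-override rule for the two URLs can be sketched as follows. This is only an illustration: the helper name `resolveApiUrls` and its types are hypothetical and not part of the project. Only the variable names and default URLs come from the list above.

```typescript
// Hypothetical helper, not part of the project: shows how an environment
// variable could override the documented default URL.
type Env = Record<string, string | undefined>;

interface ApiUrls {
  shinkai: string;
  ollama: string;
}

function resolveApiUrls(env: Env): ApiUrls {
  return {
    // Fall back to the documented defaults when the variable is unset or empty.
    shinkai: env["SHINKAI_API_URL"] || "http://localhost:9950",
    ollama: env["OLLAMA_API_URL"] || "http://localhost:11434",
  };
}
```

Under Deno, this could be called as `resolveApiUrls(Deno.env.toObject())`, which requires the `--allow-env` permission.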
deno task start test=benchmark-\*
deno task start test=download-url-and-sql
- Runs only the download-url-and-sql test.
deno task start test=download-url-and-sql test=download-url-and-summary
- Runs the download-url-and-sql and download-url-and-summary tests.
- Wildcards can be used:
deno task start test=benchmark-\*
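
The selection rules above could be implemented roughly as shown here. This is a sketch under assumptions, not the project's actual code: the function names are hypothetical, and running every test when no `test=` argument is given is an assumed behavior.

```typescript
// Hypothetical sketch: the project's real matching logic may differ.

// Collect the patterns from arguments of the form test=<pattern>.
function parseTestPatterns(args: string[]): string[] {
  return args
    .filter((arg) => arg.startsWith("test="))
    .map((arg) => arg.slice("test=".length));
}

// Convert a pattern containing * wildcards into an anchored regular expression.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Keep the tests whose code matches at least one pattern.
// Assumption: with no test= arguments, every test is selected.
function selectTests(testCodes: string[], args: string[]): string[] {
  const patterns = parseTestPatterns(args).map(patternToRegExp);
  if (patterns.length === 0) return testCodes;
  return testCodes.filter((code) => patterns.some((re) => re.test(code)));
}
```

In a POSIX shell, the backslash in `benchmark-\*` keeps the shell from expanding the asterisk, so the program receives the literal argument `test=benchmark-*`.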
- Results will be stored in results/{language}/{model-name}/{test-code}/
  - prompt- stores prompts
  - raw- stores raw responses
  - src- stores the parsed response (valid TypeScript or JSON)
  - shinkai_local_tools.py stores the dynamic tools file used in the test
  - shinkai-local-tools.ts stores the dynamic tools file used in the test
  - execute-output stores the output of the executed code
The models.txt file allows you to specify which models to use for testing. Each line in the file should specify a model in the format:
ollama:qwen2.5-coder:32b
- Engine Prefix: The prefix (ollama in the example) indicates which engine the model belongs to.
- Model Name: The rest of the line specifies the model name, which can include colons.
If models.txt is not present, the system will default to using models obtained from the getInstalledModels() function.
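
Because the model name may itself contain colons, a line has to be split at the first colon only. Here is a minimal sketch of that parsing. The names `ModelSpec`, `parseModelLine`, and `parseModelsFile` are hypothetical, and skipping blank lines is an assumption rather than documented behavior.

```typescript
// Hypothetical sketch of the line format described above; not the project's parser.
interface ModelSpec {
  engine: string; // e.g. "ollama"
  name: string; // e.g. "qwen2.5-coder:32b"
}

function parseModelLine(line: string): ModelSpec {
  const trimmed = line.trim();
  // Split on the first colon only, so colons in the model name are preserved.
  const idx = trimmed.indexOf(":");
  if (idx <= 0 || idx === trimmed.length - 1) {
    throw new Error(`Invalid model line: "${line}"`);
  }
  return { engine: trimmed.slice(0, idx), name: trimmed.slice(idx + 1) };
}

// Assumption: blank lines in the file are ignored.
function parseModelsFile(content: string): ModelSpec[] {
  return content
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map(parseModelLine);
}
```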