SSEAL: Self-Supervised Explorative Agent Learning

Setup

Install Python Packages

pip install -r requirements.txt

Environment

Populate a .env file in the root directory of the project with the following keys:

FIREWORKS_API_KEY=<key3>
ANTHROPIC_API_KEY=<key4>
OPENAI_API_KEY=<key5>
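
Before launching a run, it can help to confirm that the .env file actually contains the keys listed above. The following is a minimal sketch using only the standard library; it is not part of the SSEAL codebase, and the helper names parse_env and missing_keys are hypothetical. It also assumes plain KEY=value lines, and the README does not say whether all three keys are needed for every run.

```python
# Hypothetical pre-flight check; not part of the SSEAL codebase.
from pathlib import Path

# Key names taken from the Environment section above.
EXPECTED_KEYS = ("FIREWORKS_API_KEY", "ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or have an empty value."""
    return [key for key in expected if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")  # assumed to sit in the project root
    text = env_file.read_text() if env_file.exists() else ""
    missing = missing_keys(parse_env(text))
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run it from the project root. An empty value such as OPENAI_API_KEY= is reported as missing.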

Example Usage

python -m src.main \
  --model, -m # Which model is used for the execution agent
  --model-type, -mt # This model's provider (e.g., openai)
  --explore-model, -em # The model used for exploration
  --explore-model-type, -emt # The explore model's provider (e.g., openai)
  --execute-temp, -et # What temperature to execute at (default 0)
  --explore-temp, -ept # What temperature to explore at (default 0)
  --explore-environment-iterations, -eei # How many iterations of SSEAL to run
  --max_iterations_per_episode, -mipe # How many iterations per query
  --benchmark_path, -bp # Which benchmark to run; should be a .py file in src/benchmarks
  --task, -t # Which task to run (e.g., sports_data)
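
To make the flag list above concrete, here is a sketch of how flags with these names could be declared with argparse. This is an illustration only, not the project's actual parser in src/main.py. The value types (int for iteration counts, float for temperatures) are assumptions, and only the two temperature defaults of 0 come from the list above.

```python
# Illustrative only: mirrors the flag names listed above, not SSEAL's real parser.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="python -m src.main")
    parser.add_argument("--model", "-m", help="model used for the execution agent")
    parser.add_argument("--model-type", "-mt", help="provider of the execution model")
    parser.add_argument("--explore-model", "-em", help="model used for exploration")
    parser.add_argument("--explore-model-type", "-emt", help="provider of the explore model")
    # Temperature defaults of 0 are stated in the flag list; float is an assumed type.
    parser.add_argument("--execute-temp", "-et", type=float, default=0.0)
    parser.add_argument("--explore-temp", "-ept", type=float, default=0.0)
    # Integer types for the iteration counts are assumptions.
    parser.add_argument("--explore-environment-iterations", "-eei", type=int)
    parser.add_argument("--max_iterations_per_episode", "-mipe", type=int)
    parser.add_argument("--benchmark_path", "-bp", help=".py file in src/benchmarks")
    parser.add_argument("--task", "-t", help="task to run")
    return parser


# Arguments from the README's example command, split into a list.
EXAMPLE_ARGV = [
    "-bp", "linux_terminal.py", "--task", "linux_terminal",
    "-mipe", "10", "-eei", "4", "-m", "gpt-4o-mini", "-em", "gpt-4o",
]

if __name__ == "__main__":
    args = build_parser().parse_args(EXAMPLE_ARGV)
    print(args.model, args.explore_model, args.max_iterations_per_episode)
```

argparse matches multi-letter single-dash options such as -mt and -mipe exactly before it tries any prefix handling, so -m and -mt can coexist in this sketch.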

Example Command

Running gpt-4o-mini as the execution agent with gpt-4o as the exploration model:

python -m src.main -bp linux_terminal.py --task linux_terminal -mipe 10 -eei 4 -m gpt-4o-mini -em gpt-4o

Caching

To save time, the exploration phase is cached in caches/. Each cache is specific to a task, the number of exploration iterations, and the model. You can clear a cache by deleting it. The repo comes preloaded with caches for our provided test tasks.

Optimizing a Custom API

Suppose we have a Python file containing our new function context. First, place it in src/benchmarks/ (see math_demo.py for an example). Then run:

python -m src.main -bp math_demo.py --task custom -mipe 0 -eei <num_exploration> -m gpt-4o -mt openai

Here, -mipe 0 means that no test tasks are executed and only prompt optimization is run. The output of the exploration is available either in the generated log file (logs/) or as metadata in caches/, at caches/<function_file_name>/<explore_model_name>.json under the key <num_exploration>.
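
The cached exploration output described above can be read back with a few lines of Python. This is a sketch under stated assumptions, not code from the repository: the helpers cache_path and select_iteration are hypothetical, the cache root is assumed to be caches/ as named in the Caching section, the file name is assumed to be used as given (with its .py extension), and the iteration count is assumed to be stored as a string key, as is usual for JSON objects.

```python
# Hypothetical helpers for inspecting a cached exploration; not part of SSEAL.
import json
from pathlib import Path


def cache_path(function_file_name, explore_model_name, cache_root="caches"):
    """Build <cache_root>/<function_file_name>/<explore_model_name>.json."""
    return Path(cache_root) / function_file_name / f"{explore_model_name}.json"


def select_iteration(cache, num_exploration):
    """Return the entry stored under the iteration count (string key assumed)."""
    key = str(num_exploration)
    if key not in cache:
        raise KeyError(f"no cached exploration for {key} iterations")
    return cache[key]


if __name__ == "__main__":
    path = cache_path("math_demo.py", "gpt-4o")
    if path.exists():
        with path.open() as handle:
            cache = json.load(handle)
        print("Cached iteration counts:", sorted(cache))
    else:
        print("No cache found at", path)
```

If the printed keys do not match these assumptions, adjust the file name or key handling to what the cache file actually contains.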

Experiment Commands

The commands we ran for this project are in commands.sh. They can be run with source commands.sh. Running all of them will probably take many days. See analysis.ipynb for generating the graphs in our paper. We also provide the outputs from our experiments in experiments/.

Other Experiments

For our VLA experiments, please see the vla_agents directory. For the SWE-agent experiments, please see the SWE-agent directory.
