Description
Currently our prompt generation is tied to our template format here: https://github.com/google/oss-fuzz-gen/tree/main/prompts/template_xml
We should make it easier and more flexible for others to test different prompt generation strategies by allowing custom prompts to be Python modules that look something like the following:
```python
def generate(benchmark: Benchmark) -> str:
    ...
```
i.e. the module would be expected to define a `generate` function that produces a full prompt to pass to the LLM.
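As a minimal sketch of what such a module could look like, the example below defines a `generate` function over a stand-in `Benchmark` dataclass (the fields shown are assumptions for illustration; in oss-fuzz-gen it would be the project's own `Benchmark` type):

```python
# custom_generator.py -- sketch of a user-supplied prompt-generator module.
from dataclasses import dataclass


@dataclass
class Benchmark:
    """Stand-in for oss-fuzz-gen's Benchmark type (fields are assumptions)."""
    project: str
    function_signature: str


def generate(benchmark: Benchmark) -> str:
    """Produce the full prompt to pass to the LLM."""
    return (
        f"Write a fuzz target for the function "
        f"`{benchmark.function_signature}` in project `{benchmark.project}`. "
        f"Return only compilable code."
    )
```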
Similarly, we should also make our generation/evaluation loop more configurable, e.g. extract the logic here:
`run_one_experiment.py`, line 247 (commit `51a636b`)
into a driver.py that can be similarly replaced:
```python
def evaluate(model: models.LLM, benchmark: Benchmark, prompt_generator: Module):
    prompt = prompt_generator.generate(benchmark)
    targets = generate_targets(model, prompt)
    results = evaluate_targets(targets)
    ...
```
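Loading a user-supplied `driver.py` or prompt-generator module by file path can be done with only the standard library; the helper below is a sketch of how the harness might do it (the function name `load_module` is an assumption, not existing oss-fuzz-gen code):

```python
# Sketch: dynamically load a Python module from an arbitrary file path,
# so --driver and --prompt_generator can point anywhere on disk.
import importlib.util
import os
from types import ModuleType


def load_module(path: str) -> ModuleType:
    """Load the module at `path` and return it as a module object."""
    name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

The harness would then check that the loaded module actually defines the expected entry point (`generate` or `evaluate`) before using it.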
And tying this all together, the resulting invocations would look something like:
```
./run_all_experiments --driver /path/to/driver.py --prompt_generator prompts/custom_generator.py
```
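On the CLI side, this amounts to two new optional flags on `run_all_experiments`; a sketch of the flag plumbing with `argparse` (flag names mirror the proposed invocation, defaults and help strings are assumptions):

```python
# Sketch of the two proposed flags; both are optional so the existing
# built-in driver and template prompts remain the default behavior.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Run fuzz-target generation experiments.")
    parser.add_argument(
        "--driver", default=None,
        help="Path to a replacement driver.py module.")
    parser.add_argument(
        "--prompt_generator", default=None,
        help="Path to a custom prompt-generator module.")
    return parser
```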