Task/jailbreak adv sim #3455

nagkumar91 · 2024-06-21T18:10:08Z

Description

Adds another way to use the jailbreak feature of the adversarial simulator.

Customers who use the Adversarial Simulator with the jailbreak=True option always run the simulator twice, to simulate with and without jailbreak.

This PR introduces a new JailbreakAdversarialSimulator that runs the simulator twice, once with jailbreak=True and once with jailbreak=False, so customers can generate both variants of an adversarial dataset without invoking the simulator twice themselves.
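A minimal sketch of the "run twice" idea described above. It does not reproduce the actual JailbreakAdversarialSimulator: `run_with_and_without_jailbreak`, `fake_simulate`, and the `"regular"` / `"jailbreak"` result keys are hypothetical names chosen for illustration, and the real simulator's call signature is assumed rather than taken from this PR.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical type for an async simulator call that accepts a jailbreak flag.
SimulateFn = Callable[..., Awaitable[List[Dict[str, Any]]]]


async def run_with_and_without_jailbreak(
    simulate: SimulateFn, **kwargs: Any
) -> Dict[str, List[Dict[str, Any]]]:
    """Run the same simulation twice, changing only the jailbreak flag."""
    regular = await simulate(jailbreak=False, **kwargs)
    jailbreak = await simulate(jailbreak=True, **kwargs)
    # The result keys are an assumption made for this sketch.
    return {"regular": regular, "jailbreak": jailbreak}


async def fake_simulate(*, jailbreak: bool, scenario: str) -> List[Dict[str, Any]]:
    """Toy stand-in for the real simulator so the sketch runs offline."""
    return [{"scenario": scenario, "jailbreak": jailbreak}]


result = asyncio.run(run_with_and_without_jailbreak(fake_simulate, scenario="qa"))
```

Keeping both result sets under separate keys lets a caller compare the jailbreak and non-jailbreak outputs side by side.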

All Promptflow Contribution checklist:

The pull request does not introduce [breaking changes].
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.
Create an issue and link to the pull request to get dedicated review from promptflow team. Learn more: suggested workflow.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

Pull request includes test coverage for the included changes.

This reverts commit a4b20f1.

github-actions · 2024-06-21T18:18:45Z

promptflow-evals test result

9 files (-3) · 9 suites (-3) · ⏱️ 2h 38m 36s (+2h 17m 55s)
54 tests (-30): 47 ✅ (-37), 7 💤 (+7), 0 ❌ (±0)
486 runs (-522): 423 ✅ (-585), 63 💤 (+63), 0 ❌ (±0)

Results for commit c5f142f. ± Comparison against base commit ac574fa.

This pull request removes 84 and adds 54 tests. Note that renamed tests count towards both.

tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_with_codeclient
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_with_pfclient
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator_empty_string
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator_non_string_inputs
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_invalid_citations
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_missing_role
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_normal
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_question_answer_not_paired
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_per_turn_results_aggregation
…

tests.evals.e2etests.test_adv_simulator.TestAdvSimulator ‑ test_adv_conversation_sim_responds_with_responses
tests.evals.e2etests.test_adv_simulator.TestAdvSimulator ‑ test_adv_qa_sim_responds_with_one_response
tests.evals.e2etests.test_adv_simulator.TestAdvSimulator ‑ test_adv_rewrite_sim_responds_with_responses
tests.evals.e2etests.test_adv_simulator.TestAdvSimulator ‑ test_adv_sim_init_with_prod_url
tests.evals.e2etests.test_adv_simulator.TestAdvSimulator ‑ test_adv_summarization_jailbreak_sim_responds_with_responses
tests.evals.e2etests.test_adv_simulator.TestAdvSimulator ‑ test_adv_summarization_sim_responds_with_responses
tests.evals.e2etests.test_adv_simulator.TestAdvSimulator ‑ test_incorrect_scenario_raises_error
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_chat[False-True]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_chat[True-True]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_content_safety
…

♻️ This comment has been updated with latest results.

src/promptflow-evals/promptflow/evals/synthetic/adversarial_simulator.py

src/promptflow-evals/promptflow/evals/synthetic/_conversation/__init__.py

src/promptflow-evals/promptflow/evals/synthetic/jailbreak_adversarial_simulator.py

singankit · 2024-07-03T22:15:22Z

src/promptflow-evals/promptflow/evals/synthetic/jailbreak_adversarial_simulator.py

+        * "subscription_id": Azure subscription ID.
+        * "resource_group_name": Name of the Azure resource group.
+        * "project_name": Name of the Azure Machine Learning workspace.
+        * "credential": Azure credentials object for authentication.


Should this be a separate param?
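One way to read this suggestion, assuming it refers to the "credential" entry the comment is attached to, is to keep the three identifying fields together and pass the credential on its own. The helper below is a hypothetical sketch of that split; `split_scope_and_credential` and `SCOPE_KEYS` are not part of the PR.

```python
from typing import Any, Dict, Tuple

# Identifying fields listed in the docstring under review.
SCOPE_KEYS = ("subscription_id", "resource_group_name", "project_name")


def split_scope_and_credential(
    project_scope: Dict[str, Any],
) -> Tuple[Dict[str, Any], Any]:
    """Separate the identifying fields from the credential object."""
    missing = [key for key in SCOPE_KEYS if key not in project_scope]
    if missing:
        raise ValueError(f"project_scope is missing keys: {missing}")
    scope = {key: project_scope[key] for key in SCOPE_KEYS}
    # Returns None when no credential was supplied in the dict.
    credential = project_scope.get("credential")
    return scope, credential


scope, credential = split_scope_and_credential(
    {
        "subscription_id": "sub",
        "resource_group_name": "rg",
        "project_name": "proj",
        "credential": "placeholder-credential",
    }
)
```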

singankit · 2024-07-03T22:19:12Z

src/promptflow-evals/promptflow/evals/synthetic/jailbreak_adversarial_simulator.py

+        target: Callable,
+        max_conversation_turns: int = 1,
+        max_simulation_results: int = 3,
+        api_call_retry_limit: int = 3,


Should we group these into some kind of retry config param to make the signature cleaner?
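A rough sketch of what such a retry config could look like. `RetryConfig`, `retry_sleep_seconds`, and `describe_call` are hypothetical names used only for illustration; only `api_call_retry_limit` and its default of 3 come from the diff above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class RetryConfig:
    """Hypothetical container for retry-related settings."""

    api_call_retry_limit: int = 3  # default taken from the diff above
    retry_sleep_seconds: float = 1.0  # hypothetical extra setting, not in the diff

    def __post_init__(self) -> None:
        if self.api_call_retry_limit < 0:
            raise ValueError("api_call_retry_limit must be non-negative")


def describe_call(
    target: Callable,
    max_conversation_turns: int = 1,
    max_simulation_results: int = 3,
    retry: RetryConfig = RetryConfig(),  # safe as a default because it is frozen
) -> Dict[str, Any]:
    """Toy function with the slimmer signature; it only echoes its settings."""
    return {
        "max_conversation_turns": max_conversation_turns,
        "max_simulation_results": max_simulation_results,
        "api_call_retry_limit": retry.api_call_retry_limit,
    }


settings = describe_call(target=print, retry=RetryConfig(api_call_retry_limit=5))
```

Grouping the settings this way keeps the call signature short and gives validation a single home, at the cost of one more type for callers to learn.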

nagkumar91 added 4 commits June 20, 2024 12:00

Handling the exception while creating jinja template

a4b20f1

Revert "Handling the exception while creating jinja template"

338ba26

This reverts commit a4b20f1.

Handling the exception while creating jinja template

28eea9d

Non adversarial simulator changes for jailbreak scenario

92bcf0b

github-actions bot added the promptflow-evals label Jun 21, 2024

wangchao1230 reviewed Jun 27, 2024

src/promptflow-evals/promptflow/evals/synthetic/adversarial_simulator.py (outdated)

wangchao1230 reviewed Jun 27, 2024

src/promptflow-evals/promptflow/evals/synthetic/adversarial_simulator.py (outdated)

wangchao1230 reviewed Jun 27, 2024

src/promptflow-evals/promptflow/evals/synthetic/_conversation/__init__.py (outdated)

Nagkumar Arkalgud added 3 commits June 27, 2024 10:02

Merge branch 'main' into task/jailbreak_adv_sim

9c4fe60

added a jailbreak adv simulator

908a8d4

Merge branch 'main' into task/jailbreak_adv_sim

babda22

nagkumar91 marked this pull request as ready for review June 27, 2024 20:50

nagkumar91 requested a review from a team as a code owner June 27, 2024 20:50

Nagkumar Arkalgud and others added 5 commits June 27, 2024 14:42

jb sim calls the adv sim

2eb4e6c

Remove the unwanted message

a8771f2

Add example for return value

5d10fce

Merge branch 'main' into task/jailbreak_adv_sim

8e41461

Indent fix for the doc

b218177

nagkumar91 requested a review from wangchao1230 July 1, 2024 18:12

Nagkumar Arkalgud and others added 6 commits July 1, 2024 11:25

Build docs lint

57c7834

Build docs lint

4e80579

Build docs lint

a0a4cf9

Build docs lint

9ad9ed0

Build docs lint

2c09984

Merge branch 'main' into task/jailbreak_adv_sim

269b991

wangchao1230 reviewed Jul 3, 2024

src/promptflow-evals/promptflow/evals/synthetic/jailbreak_adversarial_simulator.py (outdated)

nagkumar91 and others added 3 commits July 2, 2024 21:28

Merge branch 'main' into task/jailbreak_adv_sim

1ea6124

Merge branch 'main' into task/jailbreak_adv_sim

2b010ae

update docstring

c5f142f

singankit reviewed Jul 3, 2024
