feat: support Nemotron using dual prompt strategy #1165

Pouyanpi · 2025-05-01T15:25:34Z

Nemotron Dual Mode requires Dual-Prompt Strategy: Reasoning Mode vs Standard Mode

Overview

This PR implements a dual-prompt strategy for Nemotron models in NeMo-Guardrails, allowing users to control the level of reasoning in model outputs by switching between reasoning mode and standard mode.

Key Changes

Created nemotron_reasoning.yml with message-based prompts containing "detailed thinking on" system messages
Created nemotron_standard.yml with message-based prompts without detailed thinking
Added comprehensive tests to verify prompt selection behavior
Added documentation in both the prompt directory and the example configs

Why

Different tasks benefit from different reasoning approaches:

Complex problem-solving and step-by-step analysis benefit from detailed reasoning
Simple conversational tasks are more efficient with direct responses
Users need the flexibility to choose based on their specific use case

Implementation Details

Reasoning Mode: When prompting_mode: reasoning is set, the prompts include the "detailed thinking on" system message
- Tasks like generate_bot_message and generate_value have two system messages
- Tasks like generate_user_intent and generate_next_steps use a simpler format (reasoning is disabled for them)
Standard Mode: When any other mode is used, the prompts omit detailed thinking
- Still uses the message-based format (not content-based)
- Produces more concise, direct responses without extensive reasoning
- This is the default behavior if no prompting_mode is specified
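
The selection rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the code in this PR: the actual prompts live in nemotron_reasoning.yml and nemotron_standard.yml, and the names build_messages and TASKS_WITHOUT_REASONING are hypothetical.

```python
# Illustrative sketch only; not the NeMo-Guardrails implementation.
# The system message that switches Nemotron into reasoning mode.
REASONING_SYSTEM_MESSAGE = {"role": "system", "content": "detailed thinking on"}

# Hypothetical set: tasks that keep the simpler format even in reasoning mode.
TASKS_WITHOUT_REASONING = {"generate_user_intent", "generate_next_steps"}


def build_messages(task, task_instructions, user_content, prompting_mode="standard"):
    """Return a message-based prompt for `task` under the given prompting mode."""
    messages = []
    # Only reasoning mode adds the extra system message, and only for
    # tasks that are allowed to reason.
    if prompting_mode == "reasoning" and task not in TASKS_WITHOUT_REASONING:
        messages.append(dict(REASONING_SYSTEM_MESSAGE))
    # Both modes stay message-based: one system message with the task
    # instructions, followed by the user content.
    messages.append({"role": "system", "content": task_instructions})
    messages.append({"role": "user", "content": user_content})
    return messages
```

Under these assumptions, generate_bot_message in reasoning mode gets two system messages, while generate_user_intent gets one regardless of the mode.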

How to Test

Use the example configs in examples/configs/nemotron/:
- reasoning-mode - activates detailed thinking
- normal-mode - uses standard approach without detailed thinking

Run the test suite:

pytest tests/test_nemotron_prompt_modes.py -v

Example Usage

# For reasoning mode
config = RailsConfig.from_path("examples/configs/nemotron/reasoning-mode")
rails = LLMRails(config)

# For standard mode
config = RailsConfig.from_path("examples/configs/nemotron/normal-mode")
rails = LLMRails(config)

Documentation

Added a minimal README.md with usage examples and implementation details

codecov-commenter · 2025-05-14T16:36:15Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.64%. Comparing base (36d625e) to head (03c9f8b).
Report is 3 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1165      +/-   ##
===========================================
+ Coverage    68.43%   68.64%   +0.20%     
===========================================
  Files          161      161              
  Lines        15943    15970      +27     
===========================================
+ Hits         10910    10962      +52     
+ Misses        5033     5008      -25

Flag	Coverage Δ
python	`68.64% <100.00%> (+0.20%)`	⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines	Coverage Δ
nemoguardrails/rails/llm/llmrails.py	`87.21% <100.00%> (+0.09%)`	⬆️

... and 2 files with indirect coverage changes

Pouyanpi · 2025-05-14T17:11:57Z

@mikemckiernan

Nemotron (and its family) is a hybrid reasoning model. The user can enable reasoning mode with a specific system message:

        {
            "role": "system",
            "content": "detailed thinking on",
        }

The question is how to enable reasoning mode for all the tasks involved in guardrails.

The user should set prompting_mode: "reasoning" in config.yml.

Without this, Nemotron runs in normal mode, without any reasoning traces. Please have a look at the example configs.

The user can also run Nemotron in normal mode (without prompting_mode set to "reasoning") and still enable reasoning for a single request, for example:

rails.generate(
    messages=[
        {
            "role": "system",
            "content": "detailed thinking on",
        },
        {
            "role": "user",
            "content": "what can you do?",
        },
    ]
)

Expected output

{"role": "assistant",
 "content": "<think>\nOkay, let's see. The user just asked \"what can you do?\" again. The previous conversation shows that the bot already explained its capabilities once. The user's intent here is probably asking for more details or a repetition. But since the bot has already provided the capabilities response, maybe the bot should ask how it can assist further instead of repeating the same information.\n\nLooking at the bot intents, there's an example where the bot uses \"ask how can i assist you further\". So the correct response would be to prompt the user for more specific help rather than just repeating the capabilities. That makes sense because the user might be looking for examples or needs to be guided on how to ask for something specific. The bot should be proactive in offering assistance. Therefore, the appropriate bot message here is \"How can I assist you further?\".\n</think>How can I assist you further?"}

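As the expected output shows, the reply carries the reasoning trace inside <think>...</think> tags, followed by the final answer. A caller who wants only the answer could separate the two roughly as follows. This is a minimal sketch assuming a single <think> block; split_reasoning is a hypothetical helper, not part of NeMo-Guardrails.

```python
import re

# Matches one <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(content):
    """Split an assistant reply into (reasoning_trace, final_answer)."""
    match = THINK_BLOCK.search(content)
    if match is None:
        # No trace present, e.g. the model ran in normal mode.
        return "", content.strip()
    trace = match.group(1).strip()
    # Keep whatever surrounds the think block as the answer.
    answer = (content[:match.start()] + content[match.end():]).strip()
    return trace, answer
```

For the reply above, the second element of the returned tuple would be the final bot message and the first would be the trace.
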
cparisien

This approach looks good. Let's get a documentation update happening ASAP.

trebedea

Looks good, we just need to update the docs to mention the Nemotron models that support detailed reasoning.

Pouyanpi · 2025-05-15T11:02:50Z

examples/configs/nemotron/reasoning-mode/config.yml

+    engine: nim
+    model: nvidia/llama-3.1-nemotron-ultra-253b-v1
+    reasoning_config:
+      remove_reasoning_traces: True


we don't need this, do we?

github-actions · 2025-05-15T11:03:24Z

Documentation preview

https://nvidia.github.io/NeMo-Guardrails/review/pr-1165

Pouyanpi · 2025-05-15T20:52:42Z

close for future

Pouyanpi added 3 commits May 14, 2025 17:41

feat: add nemotron model to all tasks

b49af71

nemotron reasoning mode

b405dc4

example config

fa3419b

Pouyanpi force-pushed the feat/nemotron-support branch from 1c03446 to fa3419b May 14, 2025 16:32

disable reasoning mode for generate_user_intent and generate_next_steps

00fcd89

Pouyanpi requested review from trebedea and cparisien May 14, 2025 17:04

Pouyanpi marked this pull request as ready for review May 14, 2025 17:04

Pouyanpi self-assigned this May 14, 2025

Pouyanpi added this to the v0.14.0 milestone May 14, 2025

cparisien approved these changes May 14, 2025

Pouyanpi added the enhancement New feature or request label May 14, 2025

trebedea approved these changes May 15, 2025

Pouyanpi added 2 commits May 15, 2025 11:46

add tests

d753474

updates

a0147aa

update readme fix update

Pouyanpi commented May 15, 2025

Pouyanpi requested a review from trebedea May 15, 2025 11:44

Pouyanpi changed the title ~~feat: add nemotron model to all tasks~~ feat: support Nemotron using dual prompt strategy May 15, 2025

Pouyanpi added 3 commits May 15, 2025 21:38

feat(llm): add support for system messages in events

f8aabdb

add tests for system message conversion

300adfb

bug fix for 3.11 and 3.10

03c9f8b

Pouyanpi closed this May 15, 2025
