feat: support Nemotron using dual prompt strategy #1165
1c03446 to fa3419b
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

    @@             Coverage Diff             @@
    ##           develop    #1165     +/-   ##
    ===========================================
    + Coverage    68.43%   68.64%   +0.20%
    ===========================================
      Files          161      161
      Lines        15943    15970     +27
    ===========================================
    + Hits         10910    10962     +52
    + Misses        5033     5008     -25
Nemotron (and its family) is a hybrid reasoning model. The user can enable reasoning mode with a specific system message:

    {
        "role": "system",
        "content": "detailed thinking on",
    }

The question is how to enable reasoning mode for all the tasks involved in guardrails. Without this, Nemotron runs in normal mode and produces no reasoning traces. Please have a look at the example configs.

The user can run Nemotron in normal mode (without prompting_mode set to "reasoning") and still do something like:

    rails.generate(
        messages=[
            {
                "role": "system",
                "content": "detailed thinking on",
            },
            {
                "role": "user",
                "content": "what can you do?",
            },
        ]
    )

Expected output:

    {"role": "assistant",
     "content": "<think>\nOkay, let's see. The user just asked \"what can you do?\" again. The previous conversation shows that the bot already explained its capabilities once. The user's intent here is probably asking for more details or a repetition. But since the bot has already provided the capabilities response, maybe the bot should ask how it can assist further instead of repeating the same information.\n\nLooking at the bot intents, there's an example where the bot uses \"ask how can i assist you further\". So the correct response would be to prompt the user for more specific help rather than just repeating the capabilities. That makes sense because the user might be looking for examples or needs to be guided on how to ask for something specific. The bot should be proactive in offering assistance. Therefore, the appropriate bot message here is \"How can I assist you further?\".\n</think>How can I assist you further?"}
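In reasoning mode, the expected output in the comment above carries the reasoning trace inside <think>...</think>, followed by the final answer. A minimal sketch of separating the two, assuming the trace is a single block at the start of the content; `split_reasoning` is a hypothetical helper for illustration, not a NeMo Guardrails function:

```python
import re
from typing import Tuple

# Assumption: the trace is one <think>...</think> block at the very start.
_THINK_RE = re.compile(r"^\s*<think>(.*?)</think>\s*", re.DOTALL)


def split_reasoning(content: str) -> Tuple[str, str]:
    """Return (reasoning_trace, final_answer) for a model response."""
    match = _THINK_RE.match(content)
    if match is None:
        # Normal mode: no trace, so the whole content is the answer.
        return "", content.strip()
    return match.group(1).strip(), content[match.end():].strip()


trace, answer = split_reasoning(
    "<think>\nThe user asked again.\n</think>How can I assist you further?"
)
print(answer)
```

A response produced in normal mode has no <think> block, so the helper returns an empty trace and the content unchanged.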
This approach looks good. Let's get a documentation update happening ASAP.
Looks good, we just need to update the docs to mention the nemotron models that support detailed reasoning.
    engine: nim
    model: nvidia/llama-3.1-nemotron-ultra-253b-v1
    reasoning_config:
      remove_reasoning_traces: True
we don't need this, do we?
close for future
Nemotron Dual Mode requires Dual-Prompt Strategy: Reasoning Mode vs Standard Mode
Overview
This PR implements a dual-prompt strategy for Nemotron models in NeMo-Guardrails, allowing users to control the level of reasoning in model outputs by switching between reasoning mode and standard mode.
Key Changes
- nemotron_reasoning.yml with message-based prompts containing "detailed thinking on" system messages
- nemotron_standard.yml with message-based prompts without detailed thinking
Why
Different tasks benefit from different reasoning approaches:
Implementation Details
- Reasoning Mode: When prompting_mode: reasoning is set, uses prompts with detailed thinking
  - generate_bot_message and generate_value have two system messages
  - generate_user_intent and generate_next_steps have a simpler format
- Standard Mode: When using any other mode, uses prompts without detailed thinking
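The mode switch can be sketched roughly as below. This is an illustration under stated assumptions, not the PR's code: `build_prompt_messages` and the placeholder instruction text are hypothetical, and placing the "detailed thinking on" message first is an assumption. It models only the two-system-message shape described for generate_bot_message and generate_value, not the simpler format of the other tasks.

```python
from typing import Dict, List

REASONING_SYSTEM_MESSAGE = "detailed thinking on"


def build_prompt_messages(
    task_instructions: str, user_text: str, prompting_mode: str
) -> List[Dict[str, str]]:
    """Hypothetical helper: assemble message-based prompt content."""
    messages: List[Dict[str, str]] = []
    if prompting_mode == "reasoning":
        # Reasoning mode: an extra system message turns on detailed thinking.
        messages.append({"role": "system", "content": REASONING_SYSTEM_MESSAGE})
    # Any other mode falls through to the standard prompt.
    messages.append({"role": "system", "content": task_instructions})
    messages.append({"role": "user", "content": user_text})
    return messages


reasoning = build_prompt_messages("Write the bot reply.", "hi", "reasoning")
standard = build_prompt_messages("Write the bot reply.", "hi", "standard")
print(len(reasoning), len(standard))
```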
How to Test
Use the example configs in examples/configs/nemotron/:
- reasoning-mode - activates detailed thinking
- normal-mode - uses standard approach without detailed thinking
Run the test suite:
Example Usage
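A minimal sketch of loading one of the example configs and sending a message, assuming the nemoguardrails package is installed and the model endpoint is reachable. The config path is an assumption built from the directory and mode names mentioned in this PR, and `make_messages` and `ask` are hypothetical helpers:

```python
from typing import Dict, List

# Assumed location: the example directory plus the "reasoning-mode" config name.
CONFIG_PATH = "examples/configs/nemotron/reasoning-mode"


def make_messages(user_text: str) -> List[Dict[str, str]]:
    """Build the message list passed to rails.generate."""
    return [{"role": "user", "content": user_text}]


def ask(config_path: str, user_text: str):
    """Load a guardrails config from disk and generate one response."""
    # Imported lazily so make_messages works without nemoguardrails installed.
    from nemoguardrails import LLMRails, RailsConfig

    config = RailsConfig.from_path(config_path)
    rails = LLMRails(config)
    return rails.generate(messages=make_messages(user_text))


# Example call (needs nemoguardrails and model credentials):
# response = ask(CONFIG_PATH, "what can you do?")
```

Pointing `ask` at the normal-mode config instead should give a response without reasoning traces, per the description above.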
Documentation
A minimal README.md with usage examples and implementation details