Collaborator

@ArangoGutierrez ArangoGutierrez commented Sep 22, 2025

Fixes #1225

Previously, when an unrecognized CDI hook was invoked with command-line flags, the nvidia-cdi-hook command would fail with a flag parsing error instead of gracefully handling the unsupported hook. This could cause container launch failures when CDI specifications reference hooks that are not yet supported by the current NVIDIA Container Toolkit version.

This PR enhances the error handling in nvidia-cdi-hook to gracefully handle unrecognized CDI hooks that include command-line flags. The changes ensure that:

  • Unrecognized hooks with flags are handled gracefully - Instead of failing with flag parsing errors, the command now issues a warning and continues
  • Backwards compatibility is maintained - Existing behavior for supported hooks remains unchanged
  • Container launches don't fail - Unsupported hooks result in warnings rather than errors

Changes

  • Added OnUsageError handler in ConfigureCDIHookCommand to detect flag parsing errors for unrecognized commands
  • Added strings import for error message parsing to identify flag-related errors
  • Enhanced error detection to specifically catch "flag provided but not defined" errors for unrecognized hooks
  • Improved warning consistency by reusing the existing IssueUnsupportedHookWarning logic

Test Cases

The following test cases demonstrate the improved behavior:

Before (failing):

$ ./nvidia-cdi-hook unknown-hook --some-flag value
Error: flag provided but not defined: -some-flag

After (graceful handling):

$ ./nvidia-cdi-hook unknown-hook --some-flag value
time="2025-09-22T11:25:26+02:00" level=warning msg="Unsupported CDI hook: unknown-hook"

Additional test cases:

# Unrecognized hook without flags
$ ./nvidia-cdi-hook unknown-hook
time="2025-09-22T11:25:26+02:00" level=warning msg="Unsupported CDI hook: unknown-hook"

# No hook specified
$ ./nvidia-cdi-hook
time="2025-09-22T11:25:26+02:00" level=warning msg="No CDI hook specified"

# Recognized hook with valid flags (unchanged behavior)
$ ./nvidia-cdi-hook update-ldcache --debug
# Works as expected, no warning

# Recognized hook with invalid flags (unchanged behavior)
$ ./nvidia-cdi-hook update-ldcache --invalid-flag
Error: flag provided but not defined: -invalid-flag
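
The test cases above can be summarized as a small dispatch table. The following is a minimal sketch using only the Go standard library, not the code from this PR: `runHook`, `knownHooks`, and the `--debug` flag registration are hypothetical names chosen for illustration, and the standard `flag` package stands in for the CLI library (it happens to produce the same "flag provided but not defined" message).

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"strings"
)

// knownHooks is a hypothetical registry mapping a hook name to the flags it
// accepts. The real tool registers its hooks through its CLI library instead.
func knownHooks() map[string]*flag.FlagSet {
	update := flag.NewFlagSet("update-ldcache", flag.ContinueOnError)
	update.SetOutput(io.Discard) // suppress usage text printed on parse errors
	update.Bool("debug", false, "enable debug output (assumed flag)")
	return map[string]*flag.FlagSet{"update-ldcache": update}
}

// runHook returns a warning for a missing or unrecognized hook, and an error
// only when a recognized hook is given a flag it does not define.
func runHook(args []string) (warning string, err error) {
	if len(args) == 0 {
		return "No CDI hook specified", nil
	}
	fs, ok := knownHooks()[args[0]]
	if !ok {
		// Unrecognized hook: ignore any trailing flags and warn, don't fail.
		return "Unsupported CDI hook: " + args[0], nil
	}
	if err := fs.Parse(args[1:]); err != nil {
		return "", err
	}
	return "", nil
}

func main() {
	cases := [][]string{
		{"unknown-hook", "--some-flag", "value"},
		{"unknown-hook"},
		{},
		{"update-ldcache", "--debug"},
		{"update-ldcache", "--invalid-flag"},
	}
	for _, args := range cases {
		warning, err := runHook(args)
		fmt.Printf("%q -> warning=%q err=%v\n", strings.Join(args, " "), warning, err)
	}
}
```

The point of the sketch is the ordering: the hook name is resolved before any flag parsing happens, so flags attached to an unknown hook never reach the parser.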

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds warning functionality for unrecognized CDI hooks that include flags, addressing issue #1225. The change enhances error handling to catch cases where unknown hooks are invoked with command-line flags.

  • Adds an OnUsageError handler to detect unrecognized hooks with flags
  • Imports the strings package for string manipulation in error checking

@ArangoGutierrez ArangoGutierrez added this to the v1.18.0 milestone Sep 22, 2025
coveralls commented Sep 22, 2025

Pull Request Test Coverage Report for Build 18008895528

Details

  • 0 of 62 (0.0%) changed or added relevant lines in 3 files are covered.
  • 5 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.1%) to 36.168%

Changes Missing Coverage                    Covered Lines   Changed/Added Lines   %
cmd/nvidia-ctk/hook/hook.go                 0               2                     0.0%
cmd/nvidia-cdi-hook/main.go                 0               28                    0.0%
cmd/nvidia-cdi-hook/commands/commands.go    0               32                    0.0%

Files with Coverage Reduction               New Missed Lines   %
cmd/nvidia-cdi-hook/main.go                 5                  0.0%

Totals (Coverage Status)
Change from base Build 17981864462: -0.1%
Covered Lines: 4827
Relevant Lines: 13346
💛 - Coveralls

@ArangoGutierrez
Collaborator Author

Tested

./nvidia-cdi-hook unknown-hook --some-flag value 
time="2025-09-22T11:25:26+02:00" level=warning msg="Unsupported CDI hook: unknown-hook"

@ArangoGutierrez ArangoGutierrez marked this pull request as ready for review September 22, 2025 09:35
OnUsageError: func(ctx context.Context, cmd *cli.Command, err error, isSubcommand bool) error {
	// Check if this is a "flag provided but not defined" error
	errMsg := err.Error()
	if strings.HasPrefix(errMsg, "flag provided but not defined: ") {
Member

One nit: The err in urfave is defined as:

providedButNotDefinedErrMsg = "flag provided but not defined: -"

and we don't include the - here.

Collaborator Author

ah yeah, let me edit!

Collaborator Author

edited
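
For context on the nit in this thread, here is a minimal sketch of prefix-based detection with the trailing "-" included in the constant. It uses the Go standard library `flag` package as a stand-in, since it emits the same message text quoted above; `undefinedFlagPrefix`, `isUndefinedFlagError`, and `parseWithNoFlags` are hypothetical names, not identifiers from the PR or from urfave/cli.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"strings"
)

// undefinedFlagPrefix includes the trailing "-", matching the message
// definition quoted by the reviewer.
const undefinedFlagPrefix = "flag provided but not defined: -"

// isUndefinedFlagError reports whether err looks like an unknown-flag parse
// error. Matching on message text is brittle and is shown only as a sketch.
func isUndefinedFlagError(err error) bool {
	return err != nil && strings.HasPrefix(err.Error(), undefinedFlagPrefix)
}

// parseWithNoFlags parses args against an empty flag set, so every flag that
// is passed counts as undefined.
func parseWithNoFlags(args []string) error {
	fs := flag.NewFlagSet("unknown-hook", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // suppress the usage text printed on failure
	return fs.Parse(args)
}

func main() {
	err := parseWithNoFlags([]string{"--some-flag", "value"})
	fmt.Println(err, isUndefinedFlagError(err))
}
```

Because the check depends on the exact wording of an error message, a change to that wording in the parsing library would silently disable the detection, which is one reason to keep the prefix in a single named constant.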

elezar previously requested changes Sep 24, 2025
Member

@elezar elezar left a comment

What about the case of a subcommand that does not accept arguments:

$ ./nvidia-cdi-hook foo
No help topic for 'foo'
$ echo $?
3

@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

OnUsageError: func(ctx context.Context, cmd *cli.Command, err error, isSubcommand bool) error {
	errMsg := err.Error()

	// Check if this is a "No help topic for" error (unrecognized
Member

Since urfave has a CommandNotFound handler, does it not make sense to handle this separately?

elezar previously requested changes Sep 24, 2025
Member

@elezar elezar left a comment

One thing that I forgot to mention is that we would also have to update the nvidia-ctk hook command.

},
// Handle unrecognized commands when help is requested (e.g., help
// unknowncommand)
CommandNotFound: func(ctx context.Context, cmd *cli.Command, commandName string) {
Member

These and the Action are now strictly duplicated from nvidia-cdi-hook. I don't think this is required with urfave/v3. Do we want to refactor this?

Collaborator Author

I did a small refactor, creating a shared function so we don't have duplicated code
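
As an illustration of the kind of refactor described here, the sketch below installs one shared "warn, don't fail" handler on two commands. It is an assumption-laden stand-in: the `command` type, `configureUnsupportedHookHandling`, and `run` are hypothetical and deliberately minimal, and they are not the urfave/cli types or the functions added in this PR.

```go
package main

import "fmt"

// command is a hypothetical, minimal stand-in for a CLI command type.
type command struct {
	Name            string
	Hooks           map[string]func() error
	CommandNotFound func(name string)
}

// configureUnsupportedHookHandling installs the shared handler on base and
// returns it, so both entry points reuse a single code path.
func configureUnsupportedHookHandling(base *command, warnf func(format string, args ...any)) *command {
	base.CommandNotFound = func(name string) {
		warnf("Unsupported CDI hook: %v", name)
	}
	return base
}

// run dispatches to a known hook or falls back to the shared handler.
func (c *command) run(hook string) error {
	if h, ok := c.Hooks[hook]; ok {
		return h()
	}
	c.CommandNotFound(hook)
	return nil
}

func main() {
	warnf := func(format string, args ...any) {
		fmt.Printf("level=warning msg=%q\n", fmt.Sprintf(format, args...))
	}
	for _, name := range []string{"nvidia-cdi-hook", "nvidia-ctk hook"} {
		c := configureUnsupportedHookHandling(&command{Name: name}, warnf)
		_ = c.run("unknown-hook")
	}
}
```

Passing the logger in as a function keeps the shared helper independent of either binary's logging setup, which is one plausible way to avoid the duplication the reviewer flagged.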

}
}

// NewHookCommand creates a new hook command with common behavior for handling
Member

Nit: This is specifically CDI hooks, so maybe rename NewCDIHookCommand?

// Handle unrecognized commands when help is requested (e.g., help
// unknowncommand)
CommandNotFound: func(ctx context.Context, cmd *cli.Command, commandName string) {
	logger.Warningf("Unsupported CDI hook: %v", commandName)
Member

Question: Is there a reason that we don't call IssueUnsupportedHookWarning (or at least reuse some of the logic there)?

if len(args) > 0 && cmd.Command(args[0]) == nil {
	// This is an unrecognized hook with flags - the default
	// Action will handle it
	return nil
Member

Should we not issue a warning here?

opts := options{}

// Create the top-level CLI
c := cli.Command{
Member

Does it make sense to still construct the Commands and then call a function to modify the members that we want instead?

Comment on lines 58 to 61
// ConfigureCDIHookCommand takes a base command and adds the common CDI hook
// behavior for handling unsupported hooks.
// The base command should have Name, Usage, and Version set as desired.
func ConfigureCDIHookCommand(base *cli.Command, logger logger.Interface) *cli.Command {
Member

The comment seems to indicate that we should call the function something like AddUnsupportedHooksChecks, but we do add the subcommands at the end, so it's probably just the comment that needs to be updated for accuracy.

Collaborator Author

Func doc comment has been updated

return nil
}
// Set log-level for all subcommands
c.Before = func(ctx context.Context, cmd *cli.Command) (context.Context, error) {
Member

Given that we're only ADDing to the hook in the new function, I would not expect this to be part of the diff.

Collaborator Author

The diff has been updated

Member

elezar commented Sep 25, 2025

Let's update the description to show the output for the various test cases.

Comment on lines 58 to 59
// ConfigureCDIHookCommand takes a base command and configures it to handle
// unsupported CDI hooks gracefully.
Member

This is still not accurate.

Collaborator Author

How about now?

Member

@elezar elezar left a comment

There are still unaddressed comments.

elezar and others added 2 commits September 26, 2025 13:51
This change moves the common logic for the nvidia-cdi-hook and
nvidia-ctk hook commands to the commands package.

Signed-off-by: Evan Lezar <[email protected]>

The initial implementation for skipping unknown hooks did not
properly handle hooks with flags. This change ensures that
these are properly handled.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
@elezar elezar changed the title from "Print a warning on an unrecognised CDI hook" to "Fix handling of unrecognised CDI hooks" Sep 26, 2025
@elezar elezar merged commit fe90b00 into NVIDIA:main Sep 26, 2025
13 checks passed
Successfully merging this pull request may close these issues.

CLI flags should be ignored when an unrecognised hook is detected
3 participants