Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash when parsing valid CycloneDX formats missing optional metadata fields #896

Closed

Conversation

kurt-r2c
Copy link

@kurt-r2c kurt-r2c commented May 30, 2023

Description of the PR

Fixes #169.

As of 0d557cb, guac still crashes on the example from #169 (referenced also in #199): https://github.com/JupiterOne/graph-github/blob/43fa5c43078eb16fad904fd1567133f9c296ea7f/reports/bom.json#L6-L11

image

The root cause is this line of code:

toplevel := c.getPackageElement(string(c.cdxBom.Metadata.Component.BOMRef))

The CycloneDX specification as of 1.4 does not require the component field in metadata, nor does it require the BOM ref field: https://cyclonedx.org/docs/1.4/json/#metadata_component_bom-ref.

guac was encountering a NPE when trying to read these uninitialized fields from the CycloneDX struct.

Some SBOM generators that only operate on package lockfiles (such as https://pypi.org/project/cyclonedx-bom/) do not generate the component metadata field, nor are they required to by the spec.

This change makes the existence of this field optional and "fails closed" to the nil handler introduced by #603 - I believe this preserves the intent of the design while gracefully handling the non-existence of this metadata field.

The test I added is not great - it does check that a schema without these fields in metadata parses but the predicate output is nil due to the existing handlers, which I did not want to make sweeping changes to. I leave this up to the maintainers to decide how this should behave.

PR Checklist

  • All commits have a Developer Certificate of Origin (DCO) -- they are generated using -s flag to git commit.
  • All new changes are covered by tests
  • If GraphQL schema is changed, make generate has been run
  • If collectsub protobuf has been changed, make proto has been run
  • All CI checks are passing (tests and formatting)
  • All dependent PRs have already been merged

@google-cla
Copy link

google-cla bot commented May 30, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@kurt-r2c kurt-r2c force-pushed the kb/component-metadata-spec-adherence branch from 255c0d1 to 7838840 Compare May 30, 2023 23:30
toplevel := c.getPackageElement(string(c.cdxBom.Metadata.Component.BOMRef))
var toplevel []*model.PkgInputSpec = nil

if c.cdxBom.Metadata.Component != nil && c.cdxBom.Metadata.Component.BOMRef != "" {
Copy link
Collaborator

@pxp928 pxp928 May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @kurt-r2c! Thanks for the PR! This is something that warrants further discussion/thought on how it should be handled in GUAC, as we could potentially not have a top-level component. I will keep you posted and we can update this PR as needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering what the use case of having a CDX document without the top level component is for... if it is the case where it is interpretted as here's a list of all the components I use, then it should parse to a bunch of singletons, however, if it is not , then my thought would be to either use the generic guac/cdx/<name> or to just reject it. let's make sure we have this on the priority for triage.

@pxp928
Copy link
Collaborator

pxp928 commented Jun 6, 2023

Hey @kurt-r2c, could you elaborate a bit more on your use case here? Based on the PR, it would generate component nodes without a top level component field in metadata. This would result in a disconnected node without a parent. How would you tie this information back to the package that the SBOM is being generated for?

@kurt-r2c
Copy link
Author

kurt-r2c commented Jun 8, 2023

@pxp928 I was just trying to ingest a Python CycloneDX SBOM generated for a project using https://pypi.org/project/cyclonedx-bom/ to start exploring the tool. The SBOM output from cyclonedx-bom is valid according to the spec - if guac needs additional context to function, it needs to explicitly require some additional arguments/constraints.

@pxp928
Copy link
Collaborator

pxp928 commented Jun 9, 2023

@pxp928 I was just trying to ingest a Python CycloneDX SBOM generated for a project using https://pypi.org/project/cyclonedx-bom/ to start exploring the tool. The SBOM output from cyclonedx-bom is valid according to the spec - if guac needs additional context to function, it needs to explicitly require some additional arguments/constraints.

It seems like for now we will need to output a message that this is not currently supported at this time and work through a solution or some type of validation on the data before it is ingested into GUAC. I will create a PR to handle this use case gracefully.

@lumjjb
Copy link
Contributor

lumjjb commented Jun 9, 2023

Yea let's perhaps add a message and link to this issue for discussion in the warning message. In case someone else comes along with a use case and a way to handle it.

@pxp928
Copy link
Collaborator

pxp928 commented Jun 28, 2023

Thanks @kurt-r2c for opening the PR and bringing this issue to our attention. We have created issue #976 to track this and also PR #992 to ensure that GUAC fails with an error message. Closing this PR as it it no longer needed.

@pxp928 pxp928 closed this Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metadata document Quality Thread
3 participants