[ingestion/data-quality issue] CycloneDX 'pedigree' information #2313

Open
mrizzi opened this issue Nov 25, 2024 · 0 comments
Labels
bug (Something isn't working), data-quality (Things related to data quality and document ingestion), data-sources


mrizzi commented Nov 25, 2024

Describe the bug
I would like to understand whether, and how, the GUAC community has investigated ingesting the information from the CycloneDX pedigree into GUAC's ontology; ref. https://cyclonedx.org/docs/1.5/json/#components_items_pedigree
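
For reference, the pedigree object in the linked CycloneDX 1.5 schema groups its data under `ancestors`, `descendants`, `variants`, `commits`, `patches` and `notes`. Below is a minimal sketch of a component carrying a pedigree, written as a Python dict. The component names, purls, URL and values are all made up for illustration, and `pedigree_sections` is a hypothetical helper, not part of GUAC or CycloneDX tooling.

```python
# Illustrative CycloneDX 1.5 component with a pedigree; every value is invented.
component = {
    "type": "library",
    "name": "example-lib",
    "version": "1.2.3-patched",
    "purl": "pkg:maven/org.example/example-lib@1.2.3-patched",
    "pedigree": {
        "ancestors": [
            {
                "type": "library",
                "name": "example-lib",
                "version": "1.2.3",
                "purl": "pkg:maven/org.example/example-lib@1.2.3",
            }
        ],
        "commits": [
            {
                "uid": "0123abc",
                "url": "https://git.example.org/acme/example-lib/commit/0123abc",
                "message": "Backport fix",
            }
        ],
        "notes": "Rebuilt from upstream 1.2.3 with a backported fix.",
    },
}


def pedigree_sections(comp: dict) -> list[str]:
    """Return the names of the pedigree sections present on a component."""
    return sorted(comp.get("pedigree", {}).keys())


print(pedigree_sections(component))
```

Only the sections a producer actually fills in are present, so an ingestor has to treat each of them as optional.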

To Reproduce
No data from the CycloneDX pedigree is ingested into GUAC.

Expected behavior
Nothing specific is expected; this issue is, first of all, meant for discussion.

Screenshots
If applicable, add screenshots to help explain your problem.

GUAC version
Current main branch, i.e. ea7ffaf

Ingested document(s)

Can you share the documents that are used to reproduce the ingestion errors or showcase the data quality issues?

Additional context
To get the discussion started, I've drafted a first attempt at mapping the CycloneDX pedigree into GUAC's ontology:

ancestor

  • Package for the ancestor component
  • IsDependency with the ancestor being the dependency package
    • With a customized “justification”? E.g. “Derived from CDX Pedigree ancestor relationship”, consistent with the SPDX approach

descendant

  • Package for the descendant component
  • IsDependency with the descendant being the dependent package
    • With a customized “justification”? E.g. “Derived from CDX Pedigree descendant relationship”, consistent with the SPDX approach

variant

  • Package for the variant component
  • IsDependency with
    • Which is the dependency and which is the dependent package? By definition the relationship is unknown, so I feel some consistent, opinionated approach should be applied, e.g. the variant package being the dependency package (hence the “main” component being the dependent package)
    • With a customized “justification”? E.g. “Derived from CDX Pedigree variant relationship”, consistent with the SPDX approach
    • DependencyType value must be “UNKNOWN”
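
The three relationship mappings above could be sketched roughly as follows. This is an illustration under stated assumptions, not GUAC's API: `IsDependencySketch` is a hypothetical stand-in for an IsDependency edge, every component is assumed to carry a `purl`, a descendant is assumed to be the dependent package, and `UNKNOWN` is used as the dependency type for all three relationships even though the draft only requires it for variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsDependencySketch:
    """Hypothetical stand-in for a GUAC IsDependency edge (not GUAC's API)."""
    dependent: str        # purl of the package that depends on the other
    dependency: str       # purl of the package being depended on
    dependency_type: str
    justification: str


def pedigree_edges(component: dict) -> list[IsDependencySketch]:
    """Turn ancestors/descendants/variants into dependency edges."""
    main = component["purl"]  # assumption: the main component has a purl
    pedigree = component.get("pedigree", {})
    edges = []

    # Ancestor: the main component derives from it, so it is the dependency.
    for anc in pedigree.get("ancestors", []):
        edges.append(IsDependencySketch(
            main, anc["purl"], "UNKNOWN",
            "Derived from CDX Pedigree ancestor relationship"))

    # Descendant (assumed direction): it derives from the main component.
    for desc in pedigree.get("descendants", []):
        edges.append(IsDependencySketch(
            desc["purl"], main, "UNKNOWN",
            "Derived from CDX Pedigree descendant relationship"))

    # Variant: direction is unknown, so pick one convention consistently.
    for var in pedigree.get("variants", []):
        edges.append(IsDependencySketch(
            main, var["purl"], "UNKNOWN",
            "Derived from CDX Pedigree variant relationship"))

    return edges
```

Keeping the relationship name in the justification lets a consumer tell pedigree-derived edges apart from ordinary dependencies.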

commit

  • Source noun with
    • uid stored in the commit field
    • type?
    • namespace?
    • name?
      Maybe derive all three, in some way, from the “url” value? That looks opinionated
  • HasSourceAt verb to connect the component and the Source/commit nouns
  • HasMetadata verb to store commit’s:
    • url
    • One for each “author” field, i.e. timestamp, name, email
    • One for each “committer” field, i.e. timestamp, name, email
    • message
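
One possible reading of the commit mapping above, as a sketch only. `SourceSketch`, `source_from_commit`, `commit_metadata` and the `cdx:pedigree:commit:*` keys are hypothetical names. The URL layout `https://<host>/<owner>/<repo>/commit/<sha>` and the `git` type are assumptions, which is exactly the opinionated part the draft points out.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class SourceSketch:
    """Hypothetical stand-in for a GUAC Source noun (not GUAC's API)."""
    type: str
    namespace: str
    name: str
    commit: str


def source_from_commit(commit: dict) -> SourceSketch:
    """Derive type/namespace/name from the commit URL; uid goes into commit."""
    parsed = urlparse(commit["url"])
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed layout: https://<host>/<owner>/<repo>/commit/<sha>
    if len(parts) < 2:
        raise ValueError(f"cannot derive a source from {commit['url']!r}")
    owner, repo = parts[0], parts[1]
    return SourceSketch(
        type="git",  # assumption: the URL points at a git repository
        namespace=f"{parsed.netloc}/{owner}",
        name=repo,
        commit=commit["uid"],
    )


def commit_metadata(commit: dict) -> dict[str, str]:
    """Flatten the remaining commit fields into HasMetadata-style key/value pairs."""
    meta = {}
    for key in ("url", "message"):
        if key in commit:
            meta[f"cdx:pedigree:commit:{key}"] = commit[key]
    for role in ("author", "committer"):
        for field in ("timestamp", "name", "email"):
            value = commit.get(role, {}).get(field)
            if value is not None:
                meta[f"cdx:pedigree:commit:{role}:{field}"] = value
    return meta
```

A HasSourceAt edge would then connect the component's package to the derived source, with one HasMetadata entry per key returned by `commit_metadata`.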

patch

  • Very hard to define a match with the Guac data model

note

  • HasMetadata verb to store each note
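
A small sketch of the note mapping, assuming a hypothetical `cdx:pedigree:note` key. As far as I can tell the 1.5 JSON schema models `notes` as a single string, so the helper accepts either a string or a list of strings to stay on the safe side.

```python
def note_metadata(component: dict) -> list[tuple[str, str]]:
    """Return one HasMetadata-style (key, value) pair per non-empty pedigree note."""
    notes = component.get("pedigree", {}).get("notes")
    if notes is None:
        return []
    if isinstance(notes, str):
        notes = [notes]  # normalise the single-string form to a list
    # "cdx:pedigree:note" is a hypothetical key, not an established GUAC convention.
    return [("cdx:pedigree:note", n) for n in notes if n.strip()]
```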