Support Bitnami embedded SBOMs #3065
Comments
Is there another way to scan these artifacts? Are these container images in some differing format from OCI? If the only way to identify what is installed is by scanning an SBOM, there could probably just be a Bitnami cataloger that looks for specific SBOMs in these known bitnami locations, instead of enabling the SBOM cataloger itself. It's pretty easy to just pass a reader to the SBOM decoder. And then we'd probably want to have a way to prevent SBOMs from getting scanned twice if a user does enable the SBOM cataloger. |
Two questions for investigation:
The easy path to implement this is essentially a copy of the SBOM cataloger with a much narrower file glob, assuming it doesn't cause duplicates or miss critical information. |
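As a rough illustration of the "much narrower file glob" idea, the sketch below only accepts SBOM files under /opt/bitnami. The `.spdx-*.spdx` file-name pattern is an assumption inferred from paths that appear later in this thread (for example `/opt/bitnami/common/.spdx-render-template.spdx`), and `isBitnamiSBOMPath` is a hypothetical helper, not part of Syft.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// bitnamiRoot is the directory the issue names as the home of Bitnami SBOMs.
const bitnamiRoot = "/opt/bitnami/"

// isBitnamiSBOMPath reports whether p looks like an embedded Bitnami SBOM.
// The ".spdx-*.spdx" pattern is an assumption based on paths seen in this thread.
func isBitnamiSBOMPath(p string) bool {
	p = path.Clean(p)
	if !strings.HasPrefix(p, bitnamiRoot) {
		return false
	}
	ok, err := path.Match(".spdx-*.spdx", path.Base(p))
	return err == nil && ok
}

func main() {
	candidates := []string{
		"/opt/bitnami/common/.spdx-render-template.spdx",
		"/opt/bitnami/postgresql/bin/postgres",
		"/tmp/.spdx-other.spdx",
	}
	for _, p := range candidates {
		fmt.Printf("%-50s %v\n", p, isBitnamiSBOMPath(p))
	}
}
```

Restricting by both directory and file name keeps such a cataloger from picking up unrelated SBOMs that a user may have copied into the image.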
I did an experiment to answer this.
❯ go run ./cmd/syft -q --select-catalogers "-sbom-cataloger,+bitnami-cataloger" bitnami/moodle:4.4 -o json |\
jq -r '.artifacts[] | select(.foundBy == "bitnami-cataloger" or .foundBy == "sbom-cataloger") | .name' |\
shasum
b07dd9b416f25edca5e143218ac6474360980fce -
❯ go run ./cmd/syft -q --select-catalogers "+sbom-cataloger,+bitnami-cataloger" bitnami/moodle:4.4 -o json |\
jq -r '.artifacts[] | select(.foundBy == "bitnami-cataloger" or .foundBy == "sbom-cataloger") | .name' |\
shasum
b07dd9b416f25edca5e143218ac6474360980fce -
❯ go run ./cmd/syft -q --select-catalogers "+sbom-cataloger,-bitnami-cataloger" bitnami/moodle:4.4 -o json |\
jq -r '.artifacts[] | select(.foundBy == "bitnami-cataloger" or .foundBy == "sbom-cataloger") | .name' |\
shasum
b07dd9b416f25edca5e143218ac6474360980fce -
So I think the answer to question 1 is, "at least as it stands right now, Syft's existing deduplication logic works fine if both catalogers are on." Of course, in this experiment the catalogers are identical, but it's still a good sign on question 1 above. |
I've attached the SBOM syft makes in my experiment: go run ./cmd/syft -q --override-default-catalogers "bitnami-cataloger" bitnami/moodle:4.4 -o spdx >/tmp/from-syft-bitnami.spdx.txt |
@willmurphyscode I started working on this and I realized that packages detected by a new "Bitnami" cataloger are given the type I guess this could complicate how to manage duplicates reported by both sbom and bitnami catalogers but I guess we could use the PURL for that. |
Hi @juan131 (cc @wagoodman), Some thoughts here:
In short:
|
|
Yes. In the same SPDX file packages from different ecosystems can coexist. Take the examples below (taken from
{
"SPDXID": "SPDXRef-kubectl",
"name": "kubectl",
"versionInfo": "1.31.1-1",
"downloadLocation": "git+https://github.com/kubernetes/kubernetes#refs/tags/v1.31.1",
"licenseConcluded": "Apache-2.0",
"licenseDeclared": "Apache-2.0",
"filesAnalyzed": false,
"externalRefs": [
{
"referenceCategory": "SECURITY",
"referenceType": "cpe23Type",
"referenceLocator": "cpe:2.3:*:kubectl:kubectl:1.31.1:*:*:*:*:*:*:*"
},
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:bitnami/[email protected]?arch=arm64&distro=debian-12"
}
],
"copyrightText": "NOASSERTION"
}
{
"name": "github.com/MakeNowJust/heredoc",
"SPDXID": "SPDXRef-Package-808f8a3a08f58be6",
"versionInfo": "v1.0.0",
"supplier": "NOASSERTION",
"downloadLocation": "NONE",
"filesAnalyzed": false,
"sourceInfo": "opt/bitnami/kubectl/bin/kubectl",
"licenseConcluded": "NONE",
"licenseDeclared": "NONE",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:golang/github.com/makenowjust/heredoc@v1.0.0"
}
],
"primaryPackagePurpose": "LIBRARY",
"copyrightText": "NOASSERTION"
}
Then, relationships link them:
"relationships": [
{
"spdxElementId": "SPDXRef-kubectl",
"relationshipType": "CONTAINS",
"relatedSpdxElement": "SPDXRef-Application-b66f42f85c68bc03-kubectl"
},
{
"spdxElementId": "SPDXRef-Application-b66f42f85c68bc03-kubectl",
"relatedSpdxElement": "SPDXRef-Package-808f8a3a08f58be6",
"relationshipType": "DEPENDS_ON"
}
Yes, Bitnami packages can be simply compiled binaries (based on Golang, C, C++, etc.) but they can also be apps written in interpreted languages (e.g. PHP or Node.js apps).
I don't think that's necessary. As @westonsteimel mentioned, they're recognized as a valid PURL package type. |
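To make the "different ecosystems in one SPDX file" point concrete, here is a minimal sketch that reads an SPDX JSON document and reports the purl type of each package. The structs cover only the fields used here, `purlTypes` is a hypothetical helper, and the sample document is made-up data shaped like the snippets in this comment (the `example-app` package and its purl are invented for illustration).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// sample is made-up data shaped like the SPDX snippets in this thread.
const sample = `{
  "packages": [
    {"name": "example-app", "externalRefs": [
      {"referenceType": "cpe23Type", "referenceLocator": "cpe:2.3:*:example-app:example-app:1.0.0:*:*:*:*:*:*:*"},
      {"referenceType": "purl", "referenceLocator": "pkg:bitnami/example-app@1.0.0-0"}]},
    {"name": "github.com/MakeNowJust/heredoc", "externalRefs": [
      {"referenceType": "purl", "referenceLocator": "pkg:golang/github.com/makenowjust/heredoc@v1.0.0"}]},
    {"name": "opt/bitnami/example-app/bin/example-app"}
  ]
}`

type externalRef struct {
	ReferenceType    string `json:"referenceType"`
	ReferenceLocator string `json:"referenceLocator"`
}

type spdxPackage struct {
	Name         string        `json:"name"`
	ExternalRefs []externalRef `json:"externalRefs"`
}

type spdxDoc struct {
	Packages []spdxPackage `json:"packages"`
}

// purlTypes maps each package name to the type segment of its purl
// ("bitnami", "golang", ...), or "" when the package declares no purl.
func purlTypes(data []byte) (map[string]string, error) {
	var doc spdxDoc
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	out := make(map[string]string, len(doc.Packages))
	for _, p := range doc.Packages {
		out[p.Name] = ""
		for _, ref := range p.ExternalRefs {
			if ref.ReferenceType != "purl" || !strings.HasPrefix(ref.ReferenceLocator, "pkg:") {
				continue
			}
			rest := strings.TrimPrefix(ref.ReferenceLocator, "pkg:")
			if typ, _, found := strings.Cut(rest, "/"); found {
				out[p.Name] = typ
			}
		}
	}
	return out, nil
}

func main() {
	types, err := purlTypes([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(types)
}
```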
Hi @juan131! Thanks @westonsteimel - I did not realize bitnami was an official PURL package type - I thought we would be inventing the package type for the sake of this cataloger. It looks like there are already PURLs with package types in the bitnami SPDX? I propose we do the following:
@westonsteimel and @wagoodman do you all agree? |
I think that makes sense @willmurphyscode !! Regarding the 3rd point, when you talk about packages from Bitnami but not |
By the way, I added support for the Bitnami pURL type at anchore/packageurl-go#22 |
I thought you told us that there were packages in bitnami SPDX files that have a different purl type:
from the second example in #3065 (comment). So what I was trying to talk about was: Packages that are declared in a Bitnami SPDX manifest and are therefore found by the new bitnami cataloger but, because the bitnami SPDX declares them with a different PURL type, they do not have package type bitnami. Heredoc in your post above is such a package. @juan131 does that make sense? |
I see your point @willmurphyscode. Following the same example about the golang package included in the Bitnami SBOM, I guess the same package will be reported twice:
I guess the ideal scenario is to have a mechanism to detect that both packages are actually the same one (e.g. by comparing their pURL or similar). With this in mind, are we adding value by labeling these packages as "being from Bitnami"? |
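A minimal sketch of the "compare their pURL" idea follows. It is not Syft's actual deduplication logic: `Package`, `Merged`, and `mergeByPURL` are hypothetical stand-ins, and the choice to keep the first package seen for each pURL is an assumption made for illustration.

```go
package main

import "fmt"

// Package is a hypothetical, trimmed-down stand-in for a cataloged package.
type Package struct {
	Name    string
	PURL    string
	FoundBy string
}

// Merged is one logical package plus every cataloger that reported it.
type Merged struct {
	Name    string
	PURL    string
	FoundBy []string
}

// mergeByPURL collapses packages that share a non-empty PURL. The first
// package seen for a PURL is kept; later ones only add their cataloger name.
// Packages without a PURL are never merged.
func mergeByPURL(pkgs []Package) []Merged {
	index := map[string]int{} // PURL -> position in out
	var out []Merged
	for _, p := range pkgs {
		if p.PURL != "" {
			if i, seen := index[p.PURL]; seen {
				out[i].FoundBy = append(out[i].FoundBy, p.FoundBy)
				continue
			}
			index[p.PURL] = len(out)
		}
		out = append(out, Merged{Name: p.Name, PURL: p.PURL, FoundBy: []string{p.FoundBy}})
	}
	return out
}

func main() {
	purl := "pkg:golang/github.com/makenowjust/heredoc@v1.0.0"
	pkgs := []Package{
		{Name: "github.com/MakeNowJust/heredoc", PURL: purl, FoundBy: "bitnami-cataloger"},
		{Name: "github.com/MakeNowJust/heredoc", PURL: purl, FoundBy: "go-module-binary-cataloger"},
	}
	for _, m := range mergeByPURL(pkgs) {
		fmt.Printf("%s found by %v\n", m.PURL, m.FoundBy)
	}
}
```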
Do we want Grype to be able to match these against the bitnami vulnerability data? In other words, does the bitnami vulnerability data cover these packages? If we just raise it up as a regular Go package, Grype will never know to compare it to the bitnami vulnerability data, but I don't know the scope of that data, so I don't know whether that's what we want. In the example SPDX SBOM above, would you expect a vulnerability scanner to look in Bitnami's database for CVEs against the |
@willmurphyscode the Bitnami Vulnerability Database only has info about Bitnami packages. For instance, render-template is a component we include in several Bitnami images. If we inspect its SPDX file...
{
"SPDXID": "SPDXRef-render-template",
(...)
"packages": [
{
"SPDXID": "SPDXRef-render-template",
"name": "render-template",
"versionInfo": "1.0.7-4",
"downloadLocation": "https://github.com/bitnami/render-template/archive/refs/tags/v1.0.7.tar.gz",
"licenseConcluded": "Apache-2.0",
"licenseDeclared": "Apache-2.0",
"filesAnalyzed": false,
"externalRefs": [
{
"referenceCategory": "SECURITY",
"referenceType": "cpe23Type",
"referenceLocator": "cpe:2.3:*:render-template:render-template:1.0.7:*:*:*:*:*:*:*"
},
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:bitnami/[email protected]?arch=arm64&distro=debian-12"
}
],
"copyrightText": "NOASSERTION"
},
{
"name": "opt/bitnami/common/bin/render-template",
"SPDXID": "SPDXRef-Application-4b412cf3f25d2574-render-template",
"downloadLocation": "NONE",
"filesAnalyzed": false,
"primaryPackagePurpose": "APPLICATION",
"copyrightText": "NOASSERTION",
"licenseConcluded": "NOASSERTION",
"licenseDeclared": "NOASSERTION"
},
{
"name": "github.com/aymerick/raymond",
"SPDXID": "SPDXRef-Package-c77f44f540ae92a0",
"versionInfo": "v2.0.2+incompatible",
"supplier": "NOASSERTION",
"downloadLocation": "NONE",
"filesAnalyzed": false,
"sourceInfo": "opt/bitnami/common/package found in: opt/bitnami/common/bin/render-template",
"licenseConcluded": "NONE",
"licenseDeclared": "NONE",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:golang/github.com/aymerick/raymond@v2.0.2%2Bincompatible"
}
],
"primaryPackagePurpose": "LIBRARY",
"copyrightText": "NOASSERTION"
},
(...)
{
"name": "github.com/bitnami/render-template",
"SPDXID": "SPDXRef-Package-8213648cad51225d",
"supplier": "NOASSERTION",
"downloadLocation": "NONE",
"filesAnalyzed": false,
"sourceInfo": "opt/bitnami/common/package found in: opt/bitnami/common/bin/render-template",
"licenseConcluded": "NONE",
"licenseDeclared": "NONE",
"externalRefs": [
{
"referenceCategory": "PACKAGE-MANAGER",
"referenceType": "purl",
"referenceLocator": "pkg:golang/github.com/bitnami/render-template"
}
],
"primaryPackagePurpose": "LIBRARY",
"copyrightText": "NOASSERTION"
},
],
"relationships": [
{
"spdxElementId": "SPDXRef-render-template",
"relationshipType": "CONTAINS",
"relatedSpdxElement": "SPDXRef-Application-4b412cf3f25d2574-render-template"
},
{
"spdxElementId": "SPDXRef-Application-4b412cf3f25d2574-render-template",
"relatedSpdxElement": "SPDXRef-Package-8213648cad51225d",
"relationshipType": "CONTAINS"
},
(...)
{
"spdxElementId": "SPDXRef-Package-8213648cad51225d",
"relatedSpdxElement": "SPDXRef-Package-c77f44f540ae92a0",
"relationshipType": "DEPENDS_ON"
}
} ... we can notice a few things:
If we take a look at the Bitnami Vulnerability Database components (see the link below) we will NOT find any info about the compiled binary or the golang packages, only about the Bitnami package: |
I see two main alternatives here:
Approach 1 vs approach 2 cons/pros:
|
There might be a specific case where this is likely: native binaries (e.g. ELF files) that were not installed by any package manager. Those are currently challenging to identify, so having bitnami weigh in on them makes sense. Especially if we can get high quality CPEs for Grype's binary matcher to compare against NVD's database.
Syft already does some de-duplication of packages. If the Bitnami cataloger raises up all these extra packages, are you seeing duplicates? In other words if you scan an image with Thanks! |
With the changes I'm proposing at #3341, there are no duplicates. However, this is because I'm only reporting Bitnami packages in the current implementation. If we report every package in the Bitnami SBOM applying this patch...
diff --git a/syft/pkg/cataloger/bitnami/cataloger.go b/syft/pkg/cataloger/bitnami/cataloger.go
index bfa4d3c2..0e8e0616 100644
--- a/syft/pkg/cataloger/bitnami/cataloger.go
+++ b/syft/pkg/cataloger/bitnami/cataloger.go
@@ -44,13 +44,8 @@ func parseSBOM(_ context.Context, _ file.Resolver, _ *generic.Environment, reade
var pkgs []pkg.Package
for _, p := range s.Artifacts.Packages.Sorted() {
- // We only want to report Bitnami packages
- if !strings.HasPrefix(p.PURL, "pkg:bitnami") {
- continue
- }
-
p.FoundBy = catalogerName
- p.Type = pkg.BitnamiPkg
+
// replace all locations on the package with the location of the SBOM file.
// Why not keep the original list of locations? Since the "locations" field is meant to capture
// where there is evidence of this file, and the catalogers have not run against any file other than,
@@ -59,13 +54,16 @@ func parseSBOM(_ context.Context, _ file.Resolver, _ *generic.Environment, reade
reader.Location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation),
)
- // Parse the Bitnami-specific metadata
- metadata, err := parseBitnamiPURL(p.PURL)
- if err != nil {
- return nil, nil, err
- }
+ if strings.HasPrefix(p.PURL, "pkg:bitnami") {
+ p.Type = pkg.BitnamiPkg
+ // Parse the Bitnami-specific metadata
+ metadata, err := parseBitnamiPURL(p.PURL)
+ if err != nil {
+ return nil, nil, err
+ }
- p.Metadata = metadata
+ p.Metadata = metadata
+ }
pkgs = append(pkgs, p)
}
... Duplicates appear:
$ go run ./cmd/syft bitnami/apache -o json | jq '.artifacts[] | select(.purl | startswith("pkg:golang/github.com/jessevdk/go-flags"))'
{
"id": "15bd1508bd27b64e",
"name": "github.com/jessevdk/go-flags",
"version": "v1.6.1",
"type": "go-module",
"foundBy": "bitnami-cataloger",
"locations": [
{
"path": "/opt/bitnami/common/.spdx-render-template.spdx",
"layerID": "sha256:6923ab12004885c8d94bdd17626e36e661ddc6f2b159cb48bbfe3681dda3dd0a",
"accessPath": "/opt/bitnami/common/.spdx-render-template.spdx",
"annotations": {
"evidence": "primary"
}
}
],
"licenses": [],
"language": "go",
"cpes": [
{
"cpe": "cpe:2.3:a:jessevdk:go-flags:v1.6.1:*:*:*:*:*:*:*",
"source": "syft-generated"
},
{
"cpe": "cpe:2.3:a:jessevdk:go_flags:v1.6.1:*:*:*:*:*:*:*",
"source": "syft-generated"
}
],
"purl": "pkg:golang/github.com/jessevdk/go-flags@v1.6.1",
"metadataType": "go-module-buildinfo-entry",
"metadata": {
"goCompiledVersion": "",
"architecture": ""
}
}
{
"id": "2e09194e80f282d7",
"name": "github.com/jessevdk/go-flags",
"version": "v1.6.1",
"type": "go-module",
"foundBy": "go-module-binary-cataloger",
"locations": [
{
"path": "/opt/bitnami/common/bin/render-template",
"layerID": "sha256:6923ab12004885c8d94bdd17626e36e661ddc6f2b159cb48bbfe3681dda3dd0a",
"accessPath": "/opt/bitnami/common/bin/render-template",
"annotations": {
"evidence": "primary"
}
}
],
"licenses": [],
"language": "go",
"cpes": [
{
"cpe": "cpe:2.3:a:jessevdk:go-flags:v1.6.1:*:*:*:*:*:*:*",
"source": "syft-generated"
},
{
"cpe": "cpe:2.3:a:jessevdk:go_flags:v1.6.1:*:*:*:*:*:*:*",
"source": "syft-generated"
}
],
"purl": "pkg:golang/github.com/jessevdk/go-flags@v1.6.1",
"metadataType": "go-module-buildinfo-entry",
"metadata": {
"goCompiledVersion": "go1.22.7",
"architecture": "arm64",
"h1Digest": "h1:Cvu5U8UGrLay1rZfv/zP7iLpSHGUZ/Ou68T0iX1bBK4=",
"mainModule": "github.com/bitnami/render-template"
}
}
As you can see, there are two packages with different "id" and "foundBy" values that are almost identical in the remaining fields, except for metadata, which is richer in the package reported by "go-module-binary-cataloger". |
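One possible way to resolve such a pair, sketched under the assumption that "richer" simply means "more non-empty metadata fields", is shown below. `Package`, `richness`, and `pickRicher` are hypothetical names; Syft does not necessarily resolve duplicates this way.

```go
package main

import "fmt"

// Package is a hypothetical stand-in carrying only the fields compared here.
type Package struct {
	PURL     string
	FoundBy  string
	Metadata map[string]string
}

// richness counts metadata fields that carry a non-empty value.
func richness(p Package) int {
	n := 0
	for _, v := range p.Metadata {
		if v != "" {
			n++
		}
	}
	return n
}

// pickRicher returns whichever package has more populated metadata fields.
// On a tie the first argument wins.
func pickRicher(a, b Package) Package {
	if richness(b) > richness(a) {
		return b
	}
	return a
}

func main() {
	purl := "pkg:golang/github.com/jessevdk/go-flags@v1.6.1"
	fromSBOM := Package{PURL: purl, FoundBy: "bitnami-cataloger", Metadata: map[string]string{
		"goCompiledVersion": "",
		"architecture":      "",
	}}
	fromBinary := Package{PURL: purl, FoundBy: "go-module-binary-cataloger", Metadata: map[string]string{
		"goCompiledVersion": "go1.22.7",
		"architecture":      "arm64",
		"mainModule":        "github.com/bitnami/render-template",
	}}
	fmt.Println(pickRicher(fromSBOM, fromBinary).FoundBy)
}
```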
Friendly reminder ⬆️ @willmurphyscode |
Thanks for the ping @juan131! I want to have a bit of a discussion about deduplication with the binary classifier:
❯ go run ./cmd/syft -q -o json bitnami/postgresql:17 | jq -c '.artifacts[] | select(.name | test("postgres")) | { name: .name, version: .version, location: .locations[0].path }'
{"name":"postgresql","version":"17.0","location":"/opt/bitnami/postgresql/bin/postgres"}
{"name":"postgresql","version":"17.0.0-8","location":"/opt/bitnami/postgresql/.spdx-postgresql.spdx"}
I wonder if it would be preferable or possible to collapse these. The bitnami entry has better version data, and the versions are "compatible" in the sense that bitnami just knows more about the fact that it's postgresql 17.0. I think ideally these would deduplicate, but today they don't because (I believe) the binary classifier sees a slightly less specific version number (it's essentially doing CC @wagoodman |
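To pin down what "compatible" could mean for the two versions above, here is a small sketch: the coarse version counts as compatible when its dot-separated components are a prefix of the finer one, after dropping anything following the first `-`. This is an illustrative rule only; `components` and `compatible` are hypothetical helpers, not the comparison Syft or bitnami/go-version performs.

```go
package main

import (
	"fmt"
	"strings"
)

// components drops everything after the first "-" and splits the rest on ".".
// For example "17.0.0-8" becomes ["17", "0", "0"].
func components(v string) []string {
	v, _, _ = strings.Cut(v, "-")
	return strings.Split(v, ".")
}

// compatible reports whether every component of coarse equals the component
// of fine at the same position, i.e. coarse is a component-wise prefix of fine.
func compatible(coarse, fine string) bool {
	c, f := components(coarse), components(fine)
	if len(c) > len(f) {
		return false
	}
	for i := range c {
		if c[i] != f[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(compatible("17.0", "17.0.0-8")) // binary classifier vs. Bitnami SBOM
	fmt.Println(compatible("17.1", "17.0.0-8"))
}
```

A prefix rule like this would only be safe as a tie-breaker between two entries already known to describe the same files; on its own it says nothing about whether the packages are the same.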
I think Bitnami should be considered a distro provider like Redhat or Debian, here -- Grype will be using the distro's vulnerability feed and should be matching with the same versioning scheme because we think the information provided to the distro package is the most accurate. We are deduplicating overlapping packages by owned files based on cataloger priority -- if the bitnami entries are cataloged with a specific type, I think the appropriate type just needs to get added here and Syft should deduplicate the packages as long as there is an accurate overlap in the owned files from the SPDX entry. |
Pragmatically I think this is the right short term answer in terms of grype's needs. In terms of a more accurate SBOM, we could add a post-cataloging task that looks for package ownership overlap with packages found with the SBOM cataloger and keep more authoritative packages (bitnami over anything else for instance). |
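The ownership-overlap idea discussed in the last two comments can be sketched as follows. This is a simplified model, not Syft's implementation: `Package`, `dropOwned`, and the set of "authoritative" types are hypothetical, and the owned-file list on the Bitnami package is assumed for the example (how that list would be populated is a separate question).

```go
package main

import "fmt"

// Package is a hypothetical stand-in for a cataloged package.
type Package struct {
	Name       string
	Version    string
	Type       string
	Evidence   string   // path where the package was detected
	OwnedFiles []string // files this package claims to own (may be empty)
}

// dropOwned removes every non-authoritative package whose evidence path is
// owned by a package of an authoritative type.
func dropOwned(pkgs []Package, authoritative map[string]bool) []Package {
	owned := map[string]bool{}
	for _, p := range pkgs {
		if !authoritative[p.Type] {
			continue
		}
		for _, f := range p.OwnedFiles {
			owned[f] = true
		}
	}
	var kept []Package
	for _, p := range pkgs {
		if !authoritative[p.Type] && owned[p.Evidence] {
			continue // a more authoritative package already covers this file
		}
		kept = append(kept, p)
	}
	return kept
}

func main() {
	pkgs := []Package{
		{Name: "postgresql", Version: "17.0", Type: "binary",
			Evidence: "/opt/bitnami/postgresql/bin/postgres"},
		{Name: "postgresql", Version: "17.0.0-8", Type: "bitnami",
			Evidence: "/opt/bitnami/postgresql/.spdx-postgresql.spdx",
			// Assumed for illustration: the Bitnami package owns the executable.
			OwnedFiles: []string{"/opt/bitnami/postgresql/bin/postgres"}},
	}
	for _, p := range dropOwned(pkgs, map[string]bool{"bitnami": true}) {
		fmt.Println(p.Name, p.Version, p.Type)
	}
}
```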
You're the experts here, I can adapt #3341 based on your feedback |
Hi @juan131! Thanks for your patience here. Running #3341 right now on bitnami/postgresql looks like this:
go run ./cmd/syft -q bitnami/postgresql | grep -e NAME -e postgres
NAME VERSION TYPE
postgresql 17.1 binary
postgresql 17.1.0-0 bitnami
I have a couple of questions about this:
In order to merge #3341, we'd like to find a way of collapsing these packages into one package. Right now, if Syft finds, for example, an RPM that owns the postgres executable and a binary package at the postgres executable, it will, by default, collapse them into one package, deleting the binary package in favor of the RPM and its richer metadata. We'd like to do something similar with the Bitnami SBOMs, but I don't see any reference to the path in the SPDX file Bitnami puts in the image, so we're not automatically inferring that the postgres executable path is owned by the bitnami package. I see that |
Regarding locations, it's true the SPDX file doesn't include information about the exact location where Bitnami packages (those with pURLs prefixed with
That's the revision; our versioning is explained at bitnami/go-version, which is why we use that library for version comparison. |
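For readers unfamiliar with the trailing number, the sketch below splits a version such as `17.1.0-0` into its upstream part and its revision. It treats the text after the last `-` as the revision when it is a non-negative integer; that rule is an assumption for illustration, `splitRevision` is a hypothetical helper, and real comparisons should rely on bitnami/go-version as stated above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitRevision separates "<upstream>-<revision>" into its two parts.
// ok is false when the version has no numeric revision suffix, in which
// case upstream is the input unchanged.
func splitRevision(v string) (upstream string, revision int, ok bool) {
	i := strings.LastIndex(v, "-")
	if i < 0 {
		return v, 0, false
	}
	n, err := strconv.Atoi(v[i+1:])
	if err != nil || n < 0 {
		return v, 0, false
	}
	return v[:i], n, true
}

func main() {
	for _, v := range []string{"17.1.0-0", "17.1"} {
		upstream, rev, ok := splitRevision(v)
		fmt.Printf("%-10s upstream=%s revision=%d hasRevision=%v\n", v, upstream, rev, ok)
	}
}
```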
Hi @juan131, That solution sounds reasonable to me - we already have a mechanism for de-duplicating binary packages in favor of OS packages when there's file overlap, and we should use it here. For example if someone does This has a couple of steps:
Step 2 is where the logic you're describing lives, which I understand to mean: In the cataloger, if Syft finds an spdx SBOM at As a test case, when running Syft with this change on |
@willmurphyscode I found one issue while implementing your suggestion. The "excludeByFileOwnershipOverlap" approach doesn't work (I still obtain the output below) because the metadata of the "postgresql" binary package doesn't implement the pkg.FileOwner interface (the OwnedFiles method).
$ go run ./cmd/syft bitnami/postgresql -q | grep -e NAME -e postgres
NAME VERSION TYPE
postgresql 17.2 binary
postgresql 17.2.0-5 bitnami
To test it, I simply added this piece of code while looping over the pkg collection:
log.Debugf("package: %s, type: %s", p.Name, p.Type)
fileOwner, ok := p.Metadata.(pkg.FileOwner)
if !ok {
log.Debug("no owned files")
continue
}
log.Debugf("owned files: %s", fileOwner.OwnedFiles())
As you can see below, even though the Bitnami package now reports the proper owned files, the binary package doesn't:
|
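The type assertion in the debug snippet above behaves as in this self-contained sketch. The `FileOwner` interface mirrors the `OwnedFiles()` method used in the snippet, while `bitnamiEntry` and `binaryEntry` are hypothetical metadata types invented for the example; only the first implements the interface, so the assertion fails for the second.

```go
package main

import "fmt"

// FileOwner mirrors the interface asserted in the debug snippet above.
type FileOwner interface {
	OwnedFiles() []string
}

// bitnamiEntry is a hypothetical metadata type that implements FileOwner.
type bitnamiEntry struct {
	files []string
}

func (b bitnamiEntry) OwnedFiles() []string { return b.files }

// binaryEntry is a hypothetical metadata type with no OwnedFiles method,
// so it does not satisfy FileOwner.
type binaryEntry struct {
	classifier string
}

// ownedFiles returns the files claimed by metadata, or false when the
// metadata type does not implement FileOwner.
func ownedFiles(metadata any) ([]string, bool) {
	owner, ok := metadata.(FileOwner)
	if !ok {
		return nil, false
	}
	return owner.OwnedFiles(), true
}

func main() {
	entries := map[string]any{
		"bitnami": bitnamiEntry{files: []string{"/opt/bitnami/postgresql/bin/postgres"}},
		"binary":  binaryEntry{classifier: "postgresql-binary"},
	}
	for name, metadata := range entries {
		files, ok := ownedFiles(metadata)
		fmt.Printf("%s: implements FileOwner=%v files=%v\n", name, ok, files)
	}
}
```

Because an overlap check of this kind only needs the authoritative side to declare owned files, the missing method on the binary side may matter less than it first appears; that depends on how the overlap logic reads the two packages.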
What would you like to be added:
As part of anchore/grype#1609, Syft should pick up on SBOMs in containers located at /opt/bitnami because this is how Bitnami records what's in an image.
The SBOM cataloger would probably do this already, but is off by default.
There are a few open questions here:
FROM
a Bitnami image? How do we know we can trust the SBOM?
Why is this needed:
This is primarily needed so that running grype on a Bitnami image (see anchore/grype#1609) is as accurate as possible.
Additional context:
There are a few open requests for more accurate Bitnami classification. Ideally this work might also fix those.