Enhance scoping selections #15

Open
wagoodman opened this issue May 13, 2020 · 11 comments
Labels
enhancement New feature or request

Comments

@wagoodman
Contributor

Add the following user scope selections:

  • Hidden Scope: all layers - squashed
  • User Scope: all layers - base layer
  • User Squashed Scope: squashed - base layer
  • Custom Scope: user selects which layers they care about (e.g. "1,5-7,9")
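Read as set arithmetic over packages, the first three scopes are plain differences, and the custom scope needs a small parser for selections such as "1,5-7,9". The following is a minimal sketch under stated assumptions, not syft's implementation: packages are identified by hypothetical `(name, version)` tuples, the function names `parse_layer_selection` and `derive_scopes` are invented for illustration, and whether layer numbers start at 0 or 1 is left undecided.

```python
def parse_layer_selection(spec: str) -> list[int]:
    """Expand a selection such as "1,5-7,9" into sorted, de-duplicated layer numbers."""
    selected: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            selected.update(range(start, end + 1))  # ranges are inclusive
        else:
            selected.add(int(part))
    return sorted(selected)


def derive_scopes(all_layers: set, squashed: set, base_layer: set) -> dict[str, set]:
    """Compute the proposed scopes as set differences over package identities."""
    return {
        "hidden": all_layers - squashed,         # Hidden Scope
        "user": all_layers - base_layer,         # User Scope
        "user-squashed": squashed - base_layer,  # User Squashed Scope
    }


# Hypothetical data: the base layer ships an old openssl that a later layer upgrades.
all_layers = {("openssl", "1.1.1"), ("openssl", "3.0.0"), ("bash", "5.1"),
              ("curl", "8.0"), ("app", "1.0")}
squashed = {("openssl", "3.0.0"), ("bash", "5.1"), ("app", "1.0")}
base_layer = {("openssl", "1.1.1"), ("bash", "5.1"), ("curl", "8.0")}

scopes = derive_scopes(all_layers, squashed, base_layer)
layers = parse_layer_selection("1,5-7,9")
```

Here the hidden scope contains only the packages that no longer appear in the squashed view, which is the case the "all layers - squashed" definition is meant to surface.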
@wagoodman
Contributor Author

To proceed with these, package de-duplication needs to be completed first (or concurrently); see #32.

@wagoodman
Contributor Author

note: package de-dup is done, so this should be unblocked 🥳

@wagoodman
Contributor Author

There is an extra vote for some of these via #1035

@kzantow
Contributor

kzantow commented Aug 10, 2023

Some other related asks:

I believe that if each component's locations included every layer the component is present in, listed in the order the container was built, we would be able to answer both of the questions posed:

  • Remove all packages with layer X (e.g. the base layer; this could be removed and re-added, though)
  • Determine all packages which are on the final layer

This would require a change somewhere, or possibly a new scope (all-layers might work like this by default, and some aspect of this may already be done), so that we include every layer where a file is present rather than only the layers where it was introduced.
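The two questions can be sketched against such a record. This is a minimal illustration, assuming a hypothetical `PackageRecord` type whose `layers` field lists every layer index where the package is present, in build order; none of these names come from syft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageRecord:
    name: str
    version: str
    # Every layer index where the package is present (not only where it was
    # introduced), in the order the image was built.
    layers: tuple[int, ...]


def without_layer(packages: list[PackageRecord], layer_index: int) -> list[PackageRecord]:
    """Drop every package that is present in the given layer (e.g. the base layer)."""
    return [p for p in packages if layer_index not in p.layers]


def on_final_layer(packages: list[PackageRecord], final_index: int) -> list[PackageRecord]:
    """Keep only the packages that are present in the final layer."""
    return [p for p in packages if final_index in p.layers]


# Hypothetical three-layer image: curl is removed in layer 2, app is added in layer 2.
packages = [
    PackageRecord("bash", "5.1", (0, 1, 2)),
    PackageRecord("curl", "8.0", (0, 1)),
    PackageRecord("app", "1.0", (2,)),
]

not_from_base = without_layer(packages, 0)
final_packages = on_final_layer(packages, 2)
```

With presence recorded per layer, both answers are simple filters; recording only the introducing layer would not reveal that curl is absent from the final layer.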

@wagoodman
Contributor Author

I think there is one unsolved problem here that needs to be addressed early in the design: how will we deal with multiple packages stored in a single file? It could look as though large sets of packages were introduced together in a single layer when in fact they were introduced across several layers (e.g. RPMs and the RPM DB).
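A small sketch of the problem, under loud assumptions: `db_snapshots` is a hypothetical mapping from layer index to the set of package names parsed from the single database file as it exists in that layer. The first function shows the misleading file-based attribution; the second compares snapshot contents between layers, which is only one possible mitigation and is not something proposed in this thread.

```python
def attribute_by_file_layer(db_snapshots: dict[int, set[str]]) -> dict[str, int]:
    """Naive: attribute every package to the last layer that wrote the DB file."""
    last_layer = max(db_snapshots)
    return {pkg: last_layer for pkg in db_snapshots[last_layer]}


def attribute_by_content_diff(db_snapshots: dict[int, set[str]]) -> dict[str, int]:
    """Attribute each surviving package to the latest layer in which it was added."""
    introduced: dict[str, int] = {}
    previous: set[str] = set()
    for layer in sorted(db_snapshots):
        current = db_snapshots[layer]
        for pkg in current - previous:  # packages new in this snapshot
            introduced[pkg] = layer
        previous = current
    final = db_snapshots[max(db_snapshots)]
    # Report only packages that are still present in the final snapshot.
    return {pkg: layer for pkg, layer in introduced.items() if pkg in final}


# Hypothetical image: the DB file is written in layer 0 and rewritten in layer 2.
db_snapshots = {
    0: {"glibc", "bash"},
    2: {"glibc", "bash", "nginx"},
}

naive = attribute_by_file_layer(db_snapshots)
by_diff = attribute_by_content_diff(db_snapshots)
```

The naive result places glibc and bash in layer 2 simply because the file that lists them was rewritten there.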

@tomerse-sg

This could be a very useful feature!

@spiffcs
Contributor

spiffcs commented May 23, 2024

I think we've talked about this as something that exists at the resolver layer, which means syft needs the original source when calculating User Squashed Scope (squashed - base layer).

One interesting approach we could take is generalizing it more as a subtraction of SBOM from SBOM.

Example:

syft -o json <base:image>
syft -o json <final:image>
<final-sbom> = <final:image:sbom> - <base:image:sbom>

When a user selects User Squashed Scope, syft would generate the base layer SBOM, generate the full squashed-scope SBOM, and then take a diff/patch between the two documents.

This opens the design up so that syft could have a mode where it computes User Squashed Scope for two given SBOM inputs if the source is no longer available for analysis.
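The subtraction on two existing documents could look roughly like the sketch below. It assumes syft-style JSON with a top-level `artifacts` list whose entries carry `name`, `version`, and `type`; the helper names and the choice of identity key are hypothetical, and relationships or other sections that refer to removed packages are not pruned here.

```python
import json


def load_sbom(path: str) -> dict:
    """Read an SBOM JSON document previously written to disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def package_key(artifact: dict) -> tuple:
    """Identity used for matching packages across SBOMs (an assumption of this sketch)."""
    return (artifact.get("name"), artifact.get("version"), artifact.get("type"))


def subtract_sboms(final_sbom: dict, base_sbom: dict) -> dict:
    """Return a copy of final_sbom without the packages that also appear in base_sbom."""
    base_keys = {package_key(a) for a in base_sbom.get("artifacts", [])}
    result = dict(final_sbom)  # shallow copy; the inputs are left unmodified
    result["artifacts"] = [
        a for a in final_sbom.get("artifacts", [])
        if package_key(a) not in base_keys
    ]
    return result


# Hypothetical in-memory documents standing in for the two SBOMs.
base = {"artifacts": [{"name": "bash", "version": "5.1", "type": "deb"}]}
final = {"artifacts": [
    {"name": "bash", "version": "5.1", "type": "deb"},
    {"name": "app", "version": "1.0", "type": "python"},
]}

user_squashed = subtract_sboms(final, base)
```

Matching on name, version, and type treats an upgraded package as new in the final image; a different key would change which packages survive the subtraction.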

@kzantow
Contributor

kzantow commented May 23, 2024

We could accomplish something similar using the current all-layers scope if we could identify all packages present in the base layer and in the final squashed layer, then subtract the packages present in the base layer. Extending this, we could potentially reduce all-layers to include only 2 specific layers: one for the base and one for the squashed. But ultimately this would require about the same amount of scanning as running twice, once per squashed layer, to get 2 separate SBOMs and subtracting them. That is to say, I think the SBOM subtraction makes a lot of sense, and we could add a mode to Syft that automatically generates the base SBOM from a specific squashed layer if the user doesn't provide it.

@tomersein
Contributor

Is this planned for release?

@TimBrown1611

Is it possible to do so without scanning the same image twice?
Today I see that each task gets one resolver, so it could be hard to do all-layers - squashed.

@tomersein
Contributor

Please take a look at this PR and let me know what you think: #3138

Projects
Status: Backlog
6 participants