Provide a way to get the LayerID the package was first found in #435
Comments
The work proposed in #32 aims to deduplicate packages in a way where the same package found in multiple locations would be listed as a single package and have multiple entries on the [...]
@wagoodman I see #32 was closed. I was checking it out and it looks good, but I don't think it really solves this problem (though it's a little more helpful). The issue still exists if I run [...] Right now, what I'm having to do to get around this is run syft with [...]
I think there is a path forward on this one. We would need to create a new image-based FileResolver that would act a little like both the squashed resolver and the all-layers resolver. The squashed resolver returns a location for each path in the squashed representation. The all-layers resolver returns one or more locations for all paths in all layers. What we really want is something that returns all locations from all layers for all paths in the squashed representation. In this way the catalogers would have visibility into all places where the file was introduced or changed, and the existing downstream package merging logic would account for packages that are the same and found in the same path across multiple layers. This could be selectable by a new scope like [...]
From an implementation point of view, this would look an awful lot like the existing all-layers resolver today, with an additional filtering step based on a query to the squashed representation. The catalogers would catalog all location instances, raising up duplicates, and the set of duplicates would be merged. The single merged package would have [...]
This means that for a dpkg package that was added in layer 1, with other packages installed in later layers, a location would be added to the package for every layer in which the shared database file was modified, from the layer where the package was installed onward. This case is a little awkward, but it is accurate relative to what syft understands about the package, and seems like a good first step.
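The filtering step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not syft's actual resolver: the `Layer` and `Location` types and the functions `squashedPaths`, `allLayerLocations`, and `squashedAllLayers` are hypothetical names invented for this example, and an image is modeled simply as an ordered list of layers that each change or delete some paths.

```go
package main

import "fmt"

// Location is a hypothetical stand-in for a file location: a path plus the
// index of the layer (0 = base layer) in which that path was added or changed.
type Location struct {
	Path       string
	LayerIndex int
}

// Layer is a hypothetical model of an image layer: the paths it adds or
// modifies, and the paths it deletes.
type Layer struct {
	Changed []string
	Deleted []string
}

// squashedPaths returns the set of paths visible in the final (squashed) image.
func squashedPaths(layers []Layer) map[string]bool {
	visible := make(map[string]bool)
	for _, l := range layers {
		for _, p := range l.Deleted {
			delete(visible, p)
		}
		for _, p := range l.Changed {
			visible[p] = true
		}
	}
	return visible
}

// allLayerLocations mimics an "all layers" view: one location for every
// layer in which a path was added or changed.
func allLayerLocations(layers []Layer) []Location {
	var locs []Location
	for i, l := range layers {
		for _, p := range l.Changed {
			locs = append(locs, Location{Path: p, LayerIndex: i})
		}
	}
	return locs
}

// squashedAllLayers keeps the all-layers locations only for paths that still
// exist in the squashed representation.
func squashedAllLayers(layers []Layer) []Location {
	visible := squashedPaths(layers)
	var kept []Location
	for _, loc := range allLayerLocations(layers) {
		if visible[loc.Path] {
			kept = append(kept, loc)
		}
	}
	return kept
}

func main() {
	layers := []Layer{
		{Changed: []string{"/lib/apk/db/installed", "/bin/busybox"}},
		{Changed: []string{"/lib/apk/db/installed", "/tmp/build.log"}},
		{Deleted: []string{"/tmp/build.log"}},
	}
	for _, loc := range squashedAllLayers(layers) {
		fmt.Printf("%s (layer %d)\n", loc.Path, loc.LayerIndex)
	}
}
```

In this toy model the package database shows up once per layer that modified it, while the deleted build log is dropped entirely, which matches the trade-off described for the shared dpkg database.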
What is the status of this request? It would be very useful :)
Please look at this PR: #3138
A flag that would include the layerID the package first showed up in.
When tracking down a package (maybe because it has vulnerabilities, or I'm not sure why it's in my SBOM), it would be helpful to know what layer it first showed up in so I can look at the commands run to generate that layer.
Currently it looks like the layerID returned under locations is the last layerID the path was touched in. So for example if I did something like:
The package "busybox" version "1.32.1-r6" has a layerID of layer2. I know there is the "--scope all-layers" option, but that would also return any packages that were removed from the final image.
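To make the request concrete, here is a minimal sketch of the difference between the layer a package's path was last touched in and the layer it first showed up in. It assumes each package carries one location per layer that touched it; the `PackageLocation` type, its fields, and `firstAndLastLayer` are hypothetical and do not reflect syft's data model or output schema.

```go
package main

import "fmt"

// PackageLocation is a hypothetical record of where a package was observed:
// the file path, the layer ID, and that layer's position in the image
// (0 = base layer).
type PackageLocation struct {
	Path       string
	LayerID    string
	LayerIndex int
}

// firstAndLastLayer returns the IDs of the earliest and latest layers among
// the given locations. ok is false when there are no locations.
func firstAndLastLayer(locs []PackageLocation) (first, last string, ok bool) {
	if len(locs) == 0 {
		return "", "", false
	}
	lo, hi := locs[0], locs[0]
	for _, l := range locs[1:] {
		if l.LayerIndex < lo.LayerIndex {
			lo = l
		}
		if l.LayerIndex > hi.LayerIndex {
			hi = l
		}
	}
	return lo.LayerID, hi.LayerID, true
}

func main() {
	// Illustrative data only: the same database path touched in two layers.
	locs := []PackageLocation{
		{Path: "/lib/apk/db/installed", LayerID: "layer2", LayerIndex: 1},
		{Path: "/lib/apk/db/installed", LayerID: "layer1", LayerIndex: 0},
	}
	first, last, _ := firstAndLastLayer(locs)
	fmt.Println("first seen in:", first)
	fmt.Println("last touched in:", last)
}
```

The requested flag would surface the "first seen" value, whereas the behavior described above corresponds to the "last touched" value.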