
Could an error recognising distro on initial scan cause a permanent problem #622

Open
paulaldridge opened this issue May 31, 2022 · 1 comment
Labels
triaged The maintainers have seen this issue

Comments

@paulaldridge
Contributor

I'm not sure whether this is a bug, but I found an interesting situation that seemed worth sharing:

  • clair wasn’t recognising the distro of a Debian image (stretch-slim).
  • Manually inspecting the image showed that the files and text clair looks for were present, so it should have found a distro.
  • After I deleted the layer_scanned records from the database and re-pushed the image, clair correctly found the distro.

So the layer (FROM debian:stretch-slim) was being skipped each time because clair had already scanned it, which meant it was fixed in its decision that there was no recognisable distro. I’m not sure what caused clair to mess up the initial scan, but it’s concerning that this can happen. I’m not sure how we’d know when it does, or how we’d sensibly trigger a rescan if we did know. I think you’d need to clear the related layer_scanned and manifest_scanned records and then re-push the manifest to clair for each affected image.



As for how (or whether) it could happen, I have two hunches:

  1. When looking for a distro, e.g. debian, any error returned from the Files function is assumed to mean that none of the requested files were found. But there seem to be a variety of potential errors that don’t necessarily mean the file is missing from the layer (e.g. an error from the reader, a failure to fetch the layer, or a failure to open the tar). If all of these errors are permanent, so that re-scanning the layer would never help, then assuming the files don’t exist or aren’t available does seem correct. However, if some of them can be transient, maybe we should fail the scan on those, so as not to commit a bad scan to the database forever.
  2. Another possibility is that someone on our team manually deleted the distro record for this layer from the db by mistake (this was in our dev db, where some manual experimentation has happened). Even if that is the case, I thought it was worth discussing the possibility of a gap where bad scans could be committed to the db.
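To make the first hunch concrete, here is a minimal Go sketch of the behaviour being suggested: only a definite "file does not exist" result counts as "no distro", and any other error fails the scan so nothing gets recorded for the layer. This is not clair's actual code. `filesFunc`, `findDistro`, `errFetchLayer`, the two stub functions, and the file paths are all hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// errFetchLayer is a hypothetical sentinel standing in for a possibly
// transient failure such as "fail to fetch layer". It is not a real
// identifier from clair.
var errFetchLayer = errors.New("fail to fetch layer")

// filesFunc is a hypothetical stand-in for the Files lookup discussed above.
type filesFunc func(paths ...string) (map[string][]byte, error)

// findDistro returns found=false with a nil error only when the requested
// files are definitely absent. Any other error is returned to the caller so
// the scan can fail instead of recording "no distro" for the layer.
func findDistro(files filesFunc) (bool, error) {
	// The paths are illustrative assumptions.
	m, err := files("etc/os-release", "usr/lib/os-release")
	switch {
	case err == nil:
		return len(m) > 0, nil
	case errors.Is(err, fs.ErrNotExist):
		// Genuinely missing files: safe to record "no distro".
		return false, nil
	default:
		// Possibly transient: do not commit a result.
		return false, fmt.Errorf("distro scan aborted: %w", err)
	}
}

// missingFiles simulates a layer that really has no distro files.
func missingFiles(...string) (map[string][]byte, error) {
	return nil, fs.ErrNotExist
}

// brokenFiles simulates a layer that could not be fetched.
func brokenFiles(...string) (map[string][]byte, error) {
	return nil, errFetchLayer
}

func main() {
	fmt.Println(findDistro(missingFiles))
	fmt.Println(findDistro(brokenFiles))
}
```

With this split, the `missingFiles` case yields "not found" with no error, while the `brokenFiles` case surfaces an error that the caller could use to skip writing the scanned record and retry later.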
@hdonnay added the "triaged" label (The maintainers have seen this issue) on Jun 9, 2022
@paulaldridge
Contributor Author

Just recording that I've noticed this again with another layer. In all but one of our environments, the index report for an image showed no distro. The one working environment correctly identified the distro from the FROM alpine:latest layer. Querying against the specific layer hash (as alpine:latest isn't a fixed layer), I can see that the working environment has a distro recorded for that layer, while the other environments have no distro listed for the same layer hash.

This seems to support the theory that an error during distro recognition on initial scan could incorrectly mark a layer as having no distro.
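On the question of how we'd know when this happens, one rough option is to flag index reports that completed successfully but list no distribution. The Go sketch below does that for a report's JSON. The field names `manifest_hash`, `success`, and `distributions` are my assumptions about the index report layout, the sample payloads are made up, and `indexReport` and `missingDistro` are hypothetical helpers. Images that legitimately have no distro (e.g. built FROM scratch) would also be flagged, so this only narrows down candidates for a rescan.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexReport mirrors only the fields this check needs. The JSON names are
// assumptions about the index report layout, not a verified schema.
type indexReport struct {
	ManifestHash  string                     `json:"manifest_hash"`
	Success       bool                       `json:"success"`
	Distributions map[string]json.RawMessage `json:"distributions"`
}

// missingDistro reports whether a successfully indexed report carries no
// distribution at all, which makes it a candidate for a rescan.
func missingDistro(raw []byte) (bool, error) {
	var r indexReport
	if err := json.Unmarshal(raw, &r); err != nil {
		return false, err
	}
	return r.Success && len(r.Distributions) == 0, nil
}

// Made-up sample payloads for illustration only.
var (
	sampleNoDistro   = []byte(`{"manifest_hash":"sha256:example","success":true,"distributions":{}}`)
	sampleWithDistro = []byte(`{"manifest_hash":"sha256:example","success":true,"distributions":{"1":{"did":"alpine"}}}`)
)

func main() {
	for _, raw := range [][]byte{sampleNoDistro, sampleWithDistro} {
		suspect, err := missingDistro(raw)
		fmt.Println(suspect, err)
	}
}
```

Running the same check against each environment would show which ones disagree about a given manifest.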
