Grype scan command appears to hang when downloading db or listing file #1731

Closed
githala-deepak opened this issue Feb 28, 2024 · 29 comments
Labels: changelog-ignore, database, performance

Comments

@githala-deepak

What happened:
The grype command gets stuck, and after 3 hours it fails with this error:
failed to load vulnerability db: unable to update vulnerability database: unable to download db: stream error: stream ID 1; INTERNAL_ERROR; received from peer
What you expected to happen:
Grype scan should get completed in under a minute
How to reproduce it (as minimally and precisely as possible):
Occurs randomly, can't reproduce
Anything else we need to know?:

Environment:

  • Output of grype version:
    Application: grype
    Version: 0.74.5
    BuildDate: 2024-02-07T21:34:47Z
    GitCommit: 7478090
    GitDescription: v0.74.5
    Platform: linux/amd64
    GoVersion: go1.21.6
    Compiler: gc
    Syft Version: v0.104.0
    Supported DB Schema: 5

  • OS (e.g: cat /etc/os-release or similar):
    PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
    NAME="Debian GNU/Linux"
    VERSION_ID="11"
    VERSION="11 (bullseye)"
    VERSION_CODENAME=bullseye
    ID=debian
    HOME_URL="https://www.debian.org/"
    SUPPORT_URL="https://www.debian.org/support"
    BUG_REPORT_URL="https://bugs.debian.org/"

@githala-deepak githala-deepak added the bug label Feb 28, 2024
@willmurphyscode
Contributor

Hi @githala-deepak,

Thanks for the report.

It sounds like grype is having trouble downloading its updated vulnerability DB, which it will try to do about once per day.

If you run grype db update -vvv, do you see any errors?

If you download the db directly, with a command like this:

curl -vvv -o /tmp/db.tar.gz 'https://toolbox-data.anchore.io/grype/databases/vulnerability-db_v5_2024-02-28T01:23:28Z_ea5efb77a61bf939917f.tar.gz'

Do you see any errors? Does the download succeed? I think you probably need to troubleshoot a network issue, and that curl command will start you in the right direction.
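A bounded variant of that download can make a stall fail fast instead of hanging. This is only a sketch: the function name `fetch_with_limit`, the 10-second connect timeout, and the 300-second default are illustrative choices, not values taken from grype.

```shell
# Download a URL with hard time limits so a stalled transfer fails fast.
# Arguments: URL, output path, optional overall limit in seconds.
fetch_with_limit() {
    url=$1
    out=$2
    max_seconds=${3:-300}   # illustrative default, not a grype setting
    curl --fail --silent --show-error --location \
        --connect-timeout 10 \
        --max-time "$max_seconds" \
        --output "$out" \
        --write-out 'http=%{http_code} bytes=%{size_download} seconds=%{time_total}\n' \
        "$url"
}
```

For example, `fetch_with_limit "$DB_URL" /tmp/db.tar.gz 120` gives up after two minutes and prints the status code, bytes received, and elapsed time either way.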

@hkadakia

hkadakia commented Mar 8, 2024

I am having a similar issue.

Syft: Summary of packages by <count> <type>
00:03:13 See mediaimage.syft.json for full package details
00:03:13     122 "go-module"
00:03:13       3 "python"
00:03:13     159 "rpm"
00:03:13 
00:03:13 Grype: scanning for vulnerabilities 
00:03:13 /root/.local/bin/grype -q -o json --config=default-ignore-rules.yaml  --only-fixed  sbom:mediaimage
00:08:06 Killed
SYFT_VER=0.92.0
GRYPE_VER=0.69.1

@mathrock

I have recently noticed that occasionally requests to fetch the listing.json file are super slow, like there's a bad/slow backend in rotation. I suspect the same thing is happening fetching the larger tar.gz DB sqlite files, causing the hang that users are reporting.

Additionally it seems as though there is no retry/timeout logic on the db update process, so that may also be an area to look into improving.

Are the DB files located in S3 or in an S3 bucket fronted by Cloudflare? Or just in Cloudflare R2 directly?

Some examples from earlier today if it's helpful for you to look into logs on toolbox-data.anchore.io and diagnose the issue. The initial requests to download the ~ 156KB listing.json file took over 30s!

The following requests were made around Tue, 12 Mar 2024 15:42:00 GMT

[mathrock ~]$ time curl https://toolbox-data.anchore.io/grype/databases/listing.json -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  159k  100  159k    0     0  1039k      0 --:--:-- --:--:-- --:--:-- 10597

real    0m32.164s
user    0m0.060s
sys     0m0.071s

And then some requests are quick, like we're hitting a bad/slow backend in the rotation:

[mathrock ~]$ time curl https://toolbox-data.anchore.io/grype/databases/listing.json -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  159k  100  159k    0     0  1039k      0 --:--:-- --:--:-- --:--:-- 1044k

real    0m0.160s
user    0m0.061s
sys     0m0.056s
[mathrock ~]$ time curl https://toolbox-data.anchore.io/grype/databases/listing.json -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  159k  100  159k    0     0   939k      0 --:--:-- --:--:-- --:--:--  940k

real    0m0.177s
user    0m0.062s
sys     0m0.055s
[mathrock ~]$ time curl https://toolbox-data.anchore.io/grype/databases/listing.json -o /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  159k  100  159k    0     0  1004k      0 --:--:-- --:--:-- --:--:-- 1011k

real    0m0.166s
user    0m0.049s
sys     0m0.070s
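The manual timing runs above could be automated roughly as follows. This is a sketch under assumptions: the function names, the 60-second cap per request, and the 5-second "slow" threshold are all made up for illustration.

```shell
# Print the total transfer time in seconds for one GET of the given URL.
time_one_fetch() {
    curl --silent --output /dev/null --max-time 60 \
        --write-out '%{time_total}\n' "$1"
}

# Fetch the URL COUNT times and flag samples slower than THRESHOLD seconds.
sample_fetch_times() {
    url=$1
    count=${2:-5}
    threshold=${3:-5}   # seconds; illustrative
    i=1
    while [ "$i" -le "$count" ]; do
        t=$(time_one_fetch "$url")
        # awk does the floating-point comparison that sh arithmetic cannot
        if awk -v t="$t" -v max="$threshold" 'BEGIN { exit !(t > max) }'; then
            echo "request $i: ${t}s SLOW"
        else
            echo "request $i: ${t}s"
        fi
        i=$((i + 1))
    done
}
```

Running `sample_fetch_times https://toolbox-data.anchore.io/grype/databases/listing.json 10` would show whether slow responses are occasional outliers, as in the timings above, or consistent.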

@willmurphyscode
Contributor

Thanks for the detailed info @mathrock! I've also seen grype db updates be slow, but haven't yet figured out why. We're investigating on our end.

@willmurphyscode
Contributor

Hi all! Thanks for reporting this.

We've changed some configs with our CDN to try to fix the issue. Since it's only intermittent, it's hard to know for sure that it's fixed, so please let us know if you continue to see any slowness or hangs with grype database downloads.

We'll also look into putting in some timeouts in grype, since that should prevent the client from hanging regardless of the behavior of the CDN / database download.

I'll leave this issue open while we continue to monitor, and until we have client side timeouts merged.
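Until client-side timeouts are available in the installed version, a wrapper along these lines could bound each attempt from the outside. It is a sketch, not grype's implementation: `retry_with_timeout` is a hypothetical helper, the backoff is arbitrary, and it assumes `timeout` from GNU coreutils is installed.

```shell
# Run a command up to MAX_ATTEMPTS times, killing any attempt that runs
# longer than PER_TRY_SECONDS. The command must be an external program,
# because timeout(1) cannot run shell functions.
retry_with_timeout() {
    max_attempts=$1
    per_try_seconds=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if timeout "$per_try_seconds" "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep $((attempt * 2))   # simple linear backoff
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

A CI job might call it as `retry_with_timeout 3 300 grype db update` before scanning, so a stalled download costs at most a few minutes rather than hours.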

@willmurphyscode willmurphyscode self-assigned this Apr 2, 2024
@willmurphyscode willmurphyscode moved this to In Progress in OSS Apr 2, 2024
@jcote-tc

jcote-tc commented Apr 3, 2024

I'm having the issue today:

[0000] DEBUG checking for available database updates
[0000] DEBUG found database update candidate: Listing(url=https://toolbox-data.anchore.io/grype/databases/vulnerability-db_v5_2024-04-03T01:24:31Z_1712118027.tar.gz)
[0000] DEBUG cannot find existing metadata, using update...
[0000] DEBUG database update available: Listing(url=https://toolbox-data.anchore.io/grype/databases/vulnerability-db_v5_2024-04-03T01:24:31Z_1712118027.tar.gz)
[0000]  INFO downloading new vulnerability DB

It's stuck on the last line ^ : "[0000] INFO downloading new vulnerability DB"

FYI: It fixed itself after a few hours.

@spiffcs
Contributor

spiffcs commented Apr 4, 2024

Hey everyone! Check out the latest release of grype where we now have default timeouts included (user configurable as well).

PR that was merged: #1777

We're currently looking into why the CDN that hosts the listing and db files ever gets into the state where it connects, but fails to transfer the bytes.
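The timeouts can presumably be tuned through the environment as well as the config file. The sketch below assumes the variable names follow the GRYPE_ prefix convention applied to the `db.update-available-timeout` and `db.update-download-timeout` keys that appear in the config dump later in this thread; check `grype config` for the names your version actually accepts.

```shell
# Assumed names: GRYPE_ prefix + upper-cased config path
# (db.update-available-timeout, db.update-download-timeout).
# The values shown match the defaults printed by grype 0.79.6 in this thread.
export GRYPE_DB_UPDATE_AVAILABLE_TIMEOUT=30s   # limit for fetching listing.json
export GRYPE_DB_UPDATE_DOWNLOAD_TIMEOUT=5m     # limit for fetching the DB archive
```

With these exported, a subsequent `grype db update -vvv` should give up once the limits are reached rather than hanging.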

@spiffcs spiffcs closed this as completed Apr 4, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in OSS Apr 4, 2024
@Fajkowsky

@spiffcs Any update on why CDN is acting so slow?

@willmurphyscode
Contributor

Hi @Fajkowsky, can you tell us a bit about when you're seeing this slowness?

The only deterministic slowness we've found comes shortly after a new Grype DB is published: every Grype invocation in that window downloads the new DB. After this initial burst of traffic, a large percentage of Grype clients have the new DB cached and download traffic drops sharply. We're looking at ways to put some jitter in there.

So when you see the slow downloads, is it shortly after 5 AM UTC or so? If so, we expect this situation to improve when we introduce some jitter/staggering in when different Grype installs download the new DB.

If it's at a different time, we would really appreciate some more details if you don't mind sharing them, like what time the slow runs were at and what geographic region they're in. (Feel free to join the community slack and DM one of us if you'd rather not post that information publicly.)
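For scheduled jobs, a similar staggering effect can be approximated on the caller's side. This is a client-side sketch only and says nothing about how Grype itself will implement jitter; `jitter_seconds` and `run_with_jitter` are hypothetical helpers, and the maximum must be between 1 and 65536 because only two random bytes are read.

```shell
# Print a random integer in [0, MAX_SECONDS). Reads /dev/urandom so that
# machines started at the same instant still get different delays.
jitter_seconds() {
    max_seconds=$1
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $((n % max_seconds))
}

# Sleep for a random delay below MAX_SECONDS, then run the given command.
run_with_jitter() {
    max_seconds=$1
    shift
    sleep "$(jitter_seconds "$max_seconds")"
    "$@"
}
```

A nightly job could use `run_with_jitter 1800 grype db update` to spread its download over a half-hour window instead of hitting the CDN at a fixed minute.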

@Fajkowsky

Hi @willmurphyscode,

Today is the day.
curl -o listing.json https://toolbox-data.anchore.io/grype/databases/listing.json

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  152k  100  152k    0     0   4974      0  0:00:31  0:00:31 --:--:--  8563

The transfer rate is so low that downloading the listing JSON file took 31 seconds.

@wagoodman wagoodman reopened this Jul 9, 2024
@wagoodman wagoodman moved this from Done to Ready in OSS Jul 9, 2024
@willmurphyscode
Contributor

We also have a similar complaint over on scan-action: anchore/scan-action#306

@willmurphyscode
Contributor

Related issue at #1939

@willmurphyscode
Contributor

Another related issue at #1885

It seems like a number of users are still having CDN problems after the last round of attempted fixes. We will investigate and see what can be improved on the CDN side.

@willmurphyscode willmurphyscode changed the title from "Grype scan command gets stuck" to "Grype scan command appears to hang when downloading db or listing file" Jul 17, 2024
@willmurphyscode willmurphyscode moved this from Ready to In Progress in OSS Jul 18, 2024
@willmurphyscode
Contributor

Hi all!

After some discussion on our Discourse instance, we are going to try to reduce how often Grype checks for an updated DB by building in a delay: if Grype's local database was built less than N hours ago, Grype will not check whether a new database is available, thus saving a network call. I think N will be configurable, and grype db update and grype db check will always check for a new database.

I'll post an update when this is rolled out and we'll see whether there's some improvement here. Thanks for your patience!
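The age check described above boils down to one comparison. The sketch below is an illustration of that rule, not Grype's code: `update_check_due` is a hypothetical name, times are Unix epoch seconds, and treating a database that is exactly N hours old as due is an assumption.

```shell
# Succeed (exit 0) when an update check is due, i.e. the local DB was
# built at least MIN_AGE_HOURS before NOW. Fail (exit 1) to skip the check.
update_check_due() {
    built_epoch=$1
    now_epoch=$2
    min_age_hours=$3
    age_seconds=$((now_epoch - built_epoch))
    [ "$age_seconds" -ge $((min_age_hours * 3600)) ]
}
```

With N = 12, a database built 3 hours ago would skip the network call entirely, while one built 13 hours ago would trigger the usual check.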

@willmurphyscode
Contributor

Hi all,

We have rolled out a change to the DB hosting infrastructure on Grype to reduce the number of bytes Grype needs to download when checking for a new database by about 95%. This change is server-side only, so you don't need to upgrade grype to benefit. We have also set up some metrics on this. So far, the fix seems to have helped. You can read more here.

Please let us know if you're still impacted by slow checks for new grype databases. If the metrics improvements hold for the next week or so, and there aren't new complaints, we'll close this issue.

Thanks for your patience on this one.

@willmurphyscode
Contributor

Hi all! Our metrics indicate that the reduced size of the listing file has fixed this problem. There are more details on the measurements we did on the community Discourse.

If we've missed something, please let us know on Discourse or by opening a new issue. Thanks!

@github-project-automation github-project-automation bot moved this from In Review to Done in OSS Aug 9, 2024
@willmurphyscode willmurphyscode added the changelog-ignore label and removed the bug label Aug 9, 2024
@sparrowt
Contributor

We've been having issues with this again (see also #846), e.g. this Tuesday 13th grype tried for nearly 2 hours before giving up:

[2024-08-13T15:26:43.327Z] grype -o json myimage
[2024-08-13T17:16:50.899Z] failed to load vulnerability db: unable to update vulnerability database: unable to download db: stream error: stream ID 1; INTERNAL_ERROR; received from peer

and again today I've got a current invocation which has been stuck for 2h35 and counting without any progress...

[2024-08-15T09:26:07.379Z] grype -o json myimage

Are there ongoing infrastructure issues?

@kzantow
Contributor

kzantow commented Aug 15, 2024

There seem to be continued issues downloading the database. See also: anchore/scan-action#306.

As noted earlier, we believe that shrinking the listing file solved the issues with downloading the listing. It's not possible to shrink the database itself in a similar manner, and the database download is where the failures have now moved.

@sparrowt we have not been able to identify any specific issues that are within our power to fix with the current CDN hosting setup we have, unfortunately. We do have a number of options to pursue. But are you using the latest version of Grype? There should be a significantly shorter timeout than 2 hours.

@kzantow kzantow reopened this Aug 15, 2024
@vica-atlassian

I was also experiencing this issue, with the download of today's DB not completing.

Workaround: I was able to manually download yesterday's vulnerability DB and import it.

I did the following:

to obtain links to dbs: grype db list
to import: grype db import <my_dl>
to check status: grype db status

Hope this helps until the root issue is resolved.
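The steps above can be sketched as a small helper. Assumptions: the archive URL is copied by hand from the output of `grype db list`, and the function name, default archive path, and 600-second limit are illustrative.

```shell
# Download an older DB archive and import it in place of today's.
# Arguments: archive URL (from `grype db list`), optional local path.
import_older_db() {
    db_url=$1
    archive=${2:-/tmp/grype-db.tar.gz}   # illustrative path
    curl --fail --location --max-time 600 --output "$archive" "$db_url" || return 1
    grype db import "$archive" || return 1
    grype db status
}
```

After importing, later scans would need GRYPE_DB_AUTO_UPDATE=false in the environment, otherwise grype tries to fetch the newest database again.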

@jdvorak001

Many thanks @vica-atlassian for the workaround, it saved me a lot of time.
Just noting that grype skips the vulnerability db auto-update on a regular run if one has GRYPE_DB_AUTO_UPDATE=false in the environment.

Funny: I was just able to wget yesterday's vulnerability database in 18 secs (~ 10.6 MiB/sec).
At the same time my wget download of today's vulnerability database is "running" with speeds only sometimes reaching 16 KiB/s.

@jerry-brimacombe-talogy

We are also having the same issue. It started yesterday morning. We had been using an old version of Grype, so we updated it to the latest version. The problem seemed to be intermittent and resolved itself. However, it is now happening consistently again.

Starting: Grype txt
==============================================================================
Task         : Bash
Description  : Run a Bash script on macOS, Linux, or Windows
Version      : 3.241.1
Author       : Microsoft Corporation
Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/bash
==============================================================================
Generating script.
Script contents:
grype -vv sbom:/home/vsts/work/1/s/Syft/sbom.syft.json -o table --file /home/vsts/work/1/s/Grype/Grype.txt
========================== Starting Command Output ===========================
/usr/bin/bash /home/vsts/work/_temp/2f1ddfbf-89c5-49e4-87f5-f75b87924a98.sh
[0000]  INFO grype version: 0.79.6
[0000] DEBUG config:
  log:
      quiet: false
      level: debug
      file: ""
  dev:
      profile: none
  output:
      - table
  file: /home/vsts/work/1/s/Grype/Grype.txt
  distro: ""
  add-cpes-if-none: false
  output-template-file: ""
  check-for-app-update: true
  only-fixed: false
  only-notfixed: false
  ignore-states: ""
  platform: ""
  search:
      scope: squashed
      unindexed-archives: false
      indexed-archives: true
  ignore: []
  exclude: []
  db:
      cache-dir: /home/vsts/.cache/grype/db
      update-url: https://toolbox-data.anchore.io/grype/databases/listing.json
      ca-cert: ""
      auto-update: true
      validate-by-hash-on-start: false
      validate-age: true
      max-allowed-built-age: 120h0m0s
      update-available-timeout: 30s
      update-download-timeout: 5m0s
  external-sources:
      enable: false
      maven:
          search-upstream: true
          base-url: https://search.maven.org/solrsearch/select
  match:
      java:
          using-cpes: false
      dotnet:
          using-cpes: false
      golang:
          using-cpes: false
          always-use-cpe-for-stdlib: true
          allow-main-module-pseudo-version-comparison: false
      javascript:
          using-cpes: false
      python:
          using-cpes: false
      ruby:
          using-cpes: false
      rust:
          using-cpes: false
      stock:
          using-cpes: true
  fail-on-severity: ""
  registry:
      insecure-skip-tls-verify: false
      insecure-use-http: false
      auth: []
      ca-cert: ""
  show-suppressed: false
  by-cve: false
  name: ""
  default-image-pull-source: ""
  vex-documents: []
  vex-add: []
  match-upstream-kernel-headers: false
[0000] DEBUG gathering packages
[0000] DEBUG loading DB
[0000] DEBUG looking for updates on vulnerability database
[0000] DEBUG checking for available database updates
[0000] DEBUG found database update candidate: Listing(url=https://toolbox-data.anchore.io/grype/databases/vulnerability-db_v5_2024-08-16T01:31:16Z_1723782141.tar.gz)
[0000] DEBUG cannot find existing metadata, using update...
[0000] DEBUG database update available: Listing(url=https://toolbox-data.anchore.io/grype/databases/vulnerability-db_v5_2024-08-16T01:31:16Z_1723782141.tar.gz)
[0000]  INFO downloading new vulnerability DB
[0000]  WARN unknown package metadata type="" for packageID="e6be0d4f844469d7"
[0000]  WARN unknown package metadata type="" for packageID="11cf22f38884a9f6"
[0000]  WARN unknown package metadata type="" for packageID="4c065ad0e08c491d"
[0000]  WARN unknown package metadata type="" for packageID="b0ef2d2f58efbbf2"
[0000]  WARN unknown package metadata type="" for packageID="6d3e17d18015d4e5"
[0000]  WARN unknown package metadata type="" for packageID="4f2335411f9a94ed"
[0000]  WARN unknown package metadata type="" for packageID="6ce4c9e99bd67541"
[0000]  WARN unknown package metadata type="" for packageID="acd3097a7b0561ce"
[0000]  WARN unknown package metadata type="" for packageID="d597c9a1945c3418"
[0000]  WARN unknown package metadata type="" for packageID="c75f2825cf7e5f3d"
[0000]  WARN unknown package metadata type="" for packageID="e48a9ab765a6199a"
[0000]  WARN unknown package metadata type="" for packageID="3733bd2d1ce41916"
[0000]  WARN unknown package metadata type="" for packageID="9e2dfa7c8112b0c7"
[0000]  WARN unknown package metadata type="" for packageID="422bfa9c24bf5633"
[0000]  WARN unknown package metadata type="" for packageID="f78abd837e737350"
[0000]  WARN unknown package metadata type="" for packageID="496d62ab2bc12063"
[0000]  WARN unknown package metadata type="" for packageID="44408cae1116a7b2"
[0000]  WARN unknown package metadata type="" for packageID="82306554a557c33e"
[0000]  WARN unknown package metadata type="" for packageID="c7dc4f9a7ba95622"
[0000]  WARN unknown package metadata type="" for packageID="02c6a1a99e13c60e"
[0000]  WARN unknown package metadata type="" for packageID="5293e357cbe89b83"
[0000]  WARN unknown package metadata type="" for packageID="dc4fb70ff578156b"
[0000]  WARN unknown package metadata type="" for packageID="c2ec9d1ac081abfc"
[0000]  WARN unknown package metadata type="" for packageID="84a055a16b5290c8"
[0000]  WARN unknown package metadata type="" for packageID="fb11a92f25850efe"
[0000]  WARN unknown package metadata type="" for packageID="31ff2337a538b070"
[0000]  WARN "relationship mapping to key 5dc118795491eefb is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key 7c6b421e95d44a3f is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key a741299b9444f760 is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key ad80ce239dae3cf7 is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key eea3fdf9f969ce47 is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "unknown relationship type: described-by" occurred 26 time(s)
[0000]  WARN "relationship mapping to key 59e0554aa6ff0ea9 is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key ab5dbb2608786c3f is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key ad152d106ad5e365 is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key dec5e791fb2e6143 is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key e9e6568c68f68749 is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000]  WARN "relationship mapping to key 3410ecaef637e6f1 is not a valid artifact.Identifiable type: <nil>" occurred 1 time(s)
[0000] DEBUG no new grype update available
failed to load vulnerability db: unable to update vulnerability database: unable to download db: context deadline exceeded (Client.Timeout or context cancellation while reading body)

##[error]Bash exited with code '1'.
Finishing: Grype txt

@danb-csms

Our team is also experiencing this issue with current day DB not completing the download (with previous day DB working fine).

We are looking at workarounds :(

@jonathanbro

Hey Everyone. Can GRYPE_DB_AUTO_UPDATE=false be used as an input in GH actions? I don't see it listed? https://github.com/anchore/scan-action

@kzantow
Contributor

kzantow commented Aug 16, 2024

> Hey Everyone. Can GRYPE_DB_AUTO_UPDATE=false be used as an input in GH actions? I don't see it listed? https://github.com/anchore/scan-action

@jonathanbro yes, the action supports all grype settings via GRYPE_-prefixed environment variables. For example:

      - uses: anchore/scan-action@v4
        id: grype-scan
        with:
          image: alpine:3.15
        env:
          GRYPE_CONFIG: ./my-config.yml
          GRYPE_ONLY_FIXED: true

Run grype config to get a full list of the configuration options. scan-action should be using the latest Grype.

@popey
Contributor

popey commented Aug 17, 2024

For those affected by this issue, the team deployed the changes to how the grype vulnerability database is served late last night (UK time). So, runs should now no longer exhibit the same network stalling.

Please report if you see any further issues.

@jdvorak001

Confirming it works now for me.

@jerry-brimacombe-talogy

It's also working for us now. Thank you for your speedy action.

@kzantow
Contributor

kzantow commented Aug 19, 2024

Hi all, we've made a change to our database hosting that we believe should fix these issues. There is some more information on Discourse.

@kzantow kzantow closed this as completed Aug 19, 2024