
Binary-Artifacts false positives #1256

Closed
evverx opened this issue Nov 13, 2021 · 21 comments
Labels
kind/bug Something isn't working

Comments

@evverx
Contributor

evverx commented Nov 13, 2021

As far as I can tell, the Binary-Artifacts check was supposed to catch executables people can run unintentionally after cloning repositories, but in reality it seems to just look for extensions and, for example, flags projects using binary files for testing purposes. (The check would be even noisier if it ignored extensions and searched for magic numbers by analogy with file, because it would effectively penalize projects for keeping, for example, regression tests generated by fuzz targets, a lot of which look like executables, or may even be actual executables if ELF files are in seed corpora.) As an example, below is what Binary-Artifacts says about systemd:

{
  "date": "2021-11-11",
  "repo": {
    "name": "github.com/systemd/systemd",
    "commit": "9cc615460830afdb51ad23e594906bbe60a3b25a"
  },
  "scorecard": {
    "version": "3.1.1-57-g8da30e6",
    "commit": "8da30e63afbc62e25c2c1252003aed9d34c3d04d"
  },
  "score": 0.0,
  "checks": [
    {
      "details": [
        "Warn: binary detected: test/dmidecode-dumps/HP-Z600.bin",
        "Warn: binary detected: test/dmidecode-dumps/Lenovo-ThinkPad-X280.bin",
        "Warn: binary detected: test/dmidecode-dumps/Lenovo-Thinkcentre-m720s.bin"
      ],
      "score": 0,
      "reason": "binaries present in source code",
      "name": "Binary-Artifacts",
      "documentation": {
        "url": "https://github.com/ossf/scorecard/blob/8da30e63afbc62e25c2c1252003aed9d34c3d04d/docs/checks.md#binary-artifacts",
        "short": "Determines if the project has generated executable (binary) artifacts in the source repository."
      }
    }
  ],
  "metadata": null
}
@laurentsimon
Contributor

laurentsimon commented Nov 15, 2021

Thanks for the feedback. Scorecard looks for both magic numbers and extensions.
We check for binaries because the code cannot be reviewed and may contain arbitrary code run by the user, as you say.

We discard binaries in folders named testdata, which you may use for unit tests, regression tests, etc. This is not documented, so I'll add a note to the documentation. Do you think this is acceptable?
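
For illustration, a minimal sketch of that logic (not the actual scorecard code; the extension list here is hypothetical, and h2non/filetype stands in for whatever magic-number detection the check uses):

```go
// Minimal sketch, not the actual scorecard code: sniff magic numbers with
// h2non/filetype, fall back to the extension, and skip "testdata" folders.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"

	"github.com/h2non/filetype"
)

// Hypothetical extension list for the fallback path.
var binaryExtensions = map[string]bool{
	".exe": true, ".dll": true, ".so": true, ".jar": true, ".bin": true,
}

func isBinaryArtifact(relPath string) bool {
	// Ignore anything under a directory named "testdata".
	for _, seg := range strings.Split(filepath.ToSlash(relPath), "/") {
		if seg == "testdata" {
			return false
		}
	}
	if buf, err := os.ReadFile(relPath); err == nil {
		if kind, _ := filetype.Match(buf); kind != filetype.Unknown {
			return true // recognized by magic number
		}
	}
	// Fall back to the file extension when the magic number says nothing.
	return binaryExtensions[strings.ToLower(filepath.Ext(relPath))]
}

func main() {
	fmt.Println(isBinaryArtifact(os.Args[1]))
}
```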

@evverx
Contributor Author

evverx commented Nov 15, 2021

Scorecard looks for both magic numbers and extensions

Judging by https://github.com/h2non/filetype#supported-types, filetype.Get isn't exactly useful when it comes to detecting executables that can be run by accident, so effectively scorecard almost always falls back to file extensions.

the code cannot be reviewed

If files like that are part of a test suite, they're supposed to be reviewed automatically by the tests. I'm not exactly sure how else to review test cases provided by OSS-Fuzz, for example.

We discard binaries in folders named testdata

I think it's mostly a Go convention, and I don't think it would be feasible to change codebases written in other languages to make scorecard happy.

Do you think this is acceptable?

Given that, for example, LGTM (https://lgtm.com/help/lgtm/file-classification) and CodeQL get it right somehow, I don't think it is.

Anyway, I think it's just another check that penalizes projects with extensive test suites.

@laurentsimon
Contributor

laurentsimon commented Nov 15, 2021

Scorecard looks for both magic numbers and extensions

Judging by https://github.com/h2non/filetype#supported-types, filetype.Get isn't exactly useful when it comes to detecting executables that can be run by accident, so effectively scorecard almost always falls back to file extensions.

the code cannot be reviewed

If files like that are part of a test suite, they're supposed to be reviewed automatically by the tests. I'm not exactly sure how else to review test cases provided by OSS-Fuzz, for example.

agreed. We need a way to filter them out, which is why we ignore testdata today.

We discard binaries in folders named testdata

I think it's mostly a Go convention, and I don't think it would be feasible to change codebases written in other languages to make scorecard happy.

We could add more folder/file names to filter out binaries, but there will always be some false positives this way. Would you like the list hardcoded, or given as an option to scorecard? An option would allow users to run it with the convention for the language they use, if there is one, and with better knowledge of the repo.
What would be a good starting point for a list of folder/file names to discard?
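
For example (a sketch only; the directory names and the idea of a CLI flag are hypothetical, not an existing scorecard option), such a list could be matched against path segments:

```go
// Sketch of a user-configurable ignore list, matched per path segment.
// The default names below are hypothetical, not actual scorecard behavior.
package main

import (
	"fmt"
	"strings"
)

var ignoreDirs = []string{"testdata", "test", "fuzz"} // e.g. from a CLI flag

func ignored(relPath string) bool {
	for _, seg := range strings.Split(relPath, "/") {
		for _, dir := range ignoreDirs {
			if seg == dir {
				return true
			}
		}
	}
	return false
}

func main() {
	// systemd's flagged files would be skipped by a "test" entry.
	fmt.Println(ignored("test/dmidecode-dumps/HP-Z600.bin")) // true
}
```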

Do you think this is acceptable?

Given that, for example, LGTM (https://lgtm.com/help/lgtm/file-classification) and CodeQL get it right somehow, I don't think it is.

Anyway, I think it's just another check that penalizes projects with extensive test suites.

@evverx
Contributor Author

evverx commented Nov 15, 2021

We could add more folder/file names to filter out binaries, but there will always be some false positives this way

I think ideally the check should start utilizing the libmagic library used by the file utility to really look for binary executables. If it were used, systemd wouldn't be flagged by scorecard and it wouldn't even be necessary to filter out the tests:

   1653 C source, ASCII text
   1516 ASCII text
    421 C source, UTF-8 Unicode text
    268 XML 1.0 document, ASCII text
    170 Bourne-Again shell script, ASCII text executable
    124 XML 1.0 document, UTF-8 Unicode text
     92 data
     75 UTF-8 Unicode text
     34 GNU gettext message catalogue, UTF-8 Unicode text
     24 Python script, ASCII text executable
     21 POSIX shell script, ASCII text executable
     17 ASCII text, with very long lines
     14 JSON data
     11 XML 1.0 document, UTF-8 Unicode text, with very long lines
     10 Python script, UTF-8 Unicode text executable
      9 awk or perl script, ASCII text
      8 empty
      8 Bourne-Again shell script, UTF-8 Unicode text executable
      7 PDP-11 UNIX/RT ldp
      7 HTML document, ASCII text
      6 ASCII text, with no line terminators
      5 UTF-8 Unicode text, with very long lines
      5 C source, UTF-8 Unicode text, with very long lines
      3 UTF-8 Unicode text, with CRLF line terminators
      3 POSIX shell script, UTF-8 Unicode text executable
      3 MIPSEB Ucode
      3 exported SGML document, ASCII text
      2 XML 1.0 document, ASCII text, with very long lines
      2 very short file (no magic)
      2 SVG Scalable Vector Graphics image
      2 makefile script, ASCII text
      2 ISO-8859 text
      2 HTML document, UTF-8 Unicode text
      2 GNU gettext message catalogue, ASCII text
      2 C source, ASCII text, with very long lines
      1 Web Open Font Format, TrueType, length 42844, version 2.0
      1 Web Open Font Format, TrueType, length 42672, version 0.0
      1 unified diff output, UTF-8 Unicode text
      1 Python script, UTF-8 Unicode text executable, with very long lines
      1 Python script, ASCII text executable, with very long lines
      1 PNG image data, 16 x 16, 8-bit/color RGBA, non-interlaced
      1 PGP Secret Sub-key -
      1 PGP/GPG key public ring (v4) created Wed Jul  9 03
      1 Perl script text executable
      1 PDP-11 old overlay
      1 PDP-11 kernel overlay
      1 PC bitmap, Windows 98/2000 and newer format, 295 x 245 x 32
      1 Non-ISO extended-ASCII text, with very long lines
      1 Non-ISO extended-ASCII text
      1 ISO-8859 text, with very long lines
      1 ISO-8859 text, with no line terminators
      1 exported SGML document, UTF-8 Unicode text
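
For reference, here's roughly how the check could consult libmagic via file(1). This is only a sketch: shelling out to file is just the simplest route (a cgo libmagic binding would do equally well), and the executable MIME list below is illustrative, not exhaustive.

```go
// Sketch: classify a file with file(1) (a libmagic frontend) and flag only
// genuinely executable MIME types.
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// Common MIME types libmagic reports for native executables (not exhaustive).
var executableMIMEs = map[string]bool{
	"application/x-executable":     true,
	"application/x-pie-executable": true,
	"application/x-sharedlib":      true,
	"application/x-mach-binary":    true,
	"application/x-dosexec":        true,
}

func isExecutable(path string) (bool, error) {
	out, err := exec.Command("file", "--brief", "--mime-type", path).Output()
	if err != nil {
		return false, err
	}
	return executableMIMEs[strings.TrimSpace(string(out))], nil
}

func main() {
	ok, err := isExecutable(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(ok)
}
```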

What would be a good starting point for a list of folder/file names to discard?

I'm not sure. It would be great if there were a way to figure out how LGTM and CodeQL filter out tests. Unfortunately, it seems their file-classification scripts aren't open source (or at least I haven't been able to find them yet).

@evverx
Contributor Author

evverx commented Nov 16, 2021

Would you like the list hardcoded, or given as an option to scorecard? An option would allow users to run it with the convention for the language they use, if there is one, and with better knowledge of the repo.

I think that by default scorecard should produce meaningful metrics without any additional options, so that people can run it against different repositories written in different languages. Options would probably help if it were necessary to relax or tighten the check on a case-by-case basis, should the need arise.

@evverx
Contributor Author

evverx commented Nov 16, 2021

One last thing: as long as scorecard can be tricked into thinking there are no binary artifacts by simply removing file extensions, I don't think this check should have the "HIGH" priority and affect the global score that much.

@laurentsimon
Contributor

Judging by https://github.com/h2non/filetype#supported-types, filetype.Get isn't exactly useful when it comes to detecting executables that can be run by accident, so effectively scorecard almost always falls back to file extensions.

interesting. Our intention was to use the libmagic equivalent in Go-land. @naveensrinivasan knows more about h2non/filetype. I thought it detected ELF files. Right now we always fall back to using the extension if filetype returns nothing. You're saying maybe we should not do that...?

I'm not sure. It would be great if there were a way to figure out how LGTM and CodeQL filter out tests. Unfortunately, it seems their file-classification scripts aren't open source (or at least I haven't been able to find them yet).

Do they try to detect ELF files? If they are actually running things, I suppose they can strace and see whether a file is opened by test files and so on. For non-C code, CodeQL can figure this out accurately using the CFG, I believe.

can be tricked into thinking there are no binary artifacts by simply removing file extensions, I don't think this check should have the "HIGH" priority and affect the global score that much.

@naveensrinivasan is working on unit tests in general. Let's investigate this further.

@evverx
Contributor Author

evverx commented Nov 16, 2021

I thought it detected ELF files

As far as I can tell it does. I meant those ".bin" files that would just disappear from the scorecard radar if I dropped the ".bin" extension.

Right now we always fall back to using extension if filetype returns nothing. You're saying maybe we should not do that...?

I think it depends on how clever the library is. If it were libmagic, I think it would make sense to ignore extensions, but I've never used h2non/filetype.

@evverx
Contributor Author

evverx commented Nov 16, 2021

I'm not sure if it helps, but here are all the types the library was able to recognize in the systemd repository:

   4542 unknown
      2 woff
      1 png
      1 bmp
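
(A minimal sketch of how that tally can be produced, assuming a plain walk over a local checkout:)

```go
// Minimal sketch: tally h2non/filetype's verdict for every file in a checkout.
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"

	"github.com/h2non/filetype"
)

func main() {
	counts := map[string]int{}
	filepath.WalkDir(os.Args[1], func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if d.Name() == ".git" {
				return filepath.SkipDir // don't classify git internals
			}
			return nil
		}
		buf, err := os.ReadFile(p)
		if err != nil {
			return nil // skip unreadable files
		}
		kind, _ := filetype.Match(buf)
		counts[kind.Extension]++ // "unknown" for unrecognized files
		return nil
	})
	for ext, n := range counts {
		fmt.Println(n, ext)
	}
}
```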

@laurentsimon
Contributor

good catch, thanks for reporting this. I'll let @naveensrinivasan comment.

@evverx
Contributor Author

evverx commented Nov 16, 2021

I've just opened #1279 based on my understanding of how the library works.

@evverx
Contributor Author

evverx commented Nov 16, 2021

FWIW, looking at h2non/filetype#79, I don't think the library is fuzzed on a regular basis, so it isn't clear to me how safe it is to run scorecard against random repositories :-)

@evverx
Contributor Author

evverx commented Nov 16, 2021

Do they try to detect ELF files? If they are actually running things, I suppose they can strace and see if it's opened by test files and so on.

I'm not sure about CodeQL, but LGTM doesn't run any tests. As far as I know, LGTM had a script that kept track of the files build scripts tried to open, to install missing dependencies, though. Anyway, I'm 50% sure that whatever is kept in directories named *test* is considered tests there.

@evverx
Contributor Author

evverx commented Nov 16, 2021

I closed that PR, mostly due to h2non/filetype#73. I'm not sure what to do about it. It looks like, in its current form, the library can't be used to get deterministic results.

@laurentsimon
Contributor

mhhh that's not good. We'll investigate whether there's another lib we could use.
I have an idea to reduce false positives. My takeaway from your issue is that .bin is too generic in general and does not map directly to a "binary that contains code and may be run by users". What if we removed .bin from the list of extensions we look for? The other extensions are more specific, so it's unlikely users would use them to store arbitrary files like fuzzing input, etc. wdyt?

There are exceptions, e.g. the binutils repo may still do poorly.

@evverx
Contributor Author

evverx commented Nov 17, 2021

What if we removed the bin from the list of extension we look for?

Sounds good to me. I've just opened #1288.

laurentsimon pushed a commit that referenced this issue Nov 17, 2021
Those files most likely contain binary data used by tests, for
example. It should be safe to remove this because executables
disguised as ".bin" files will still be caught and flagged by scorecard
before it even has a chance to look at extensions.

It should address #1256
@laurentsimon
Contributor

shall we close this issue?

@evverx
Contributor Author

evverx commented Dec 3, 2021

I don't think scorecard should complain about tests, so I think I'd keep the issue open.

@westonsteimel

westonsteimel commented Dec 10, 2021

Judging by https://github.com/h2non/filetype#supported-types, filetype.Get isn't exactly useful when it comes to detecting executables that can be run by accident, so effectively scorecard almost always falls back to file extensions.

gabriel-vasile/mimetype seems to have a decent list of supported types, and I know it is used in anchore/stereoscope, and in turn by syft and grype.
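
For example, detecting ELF files with it could look roughly like this (a sketch; per its supported-types list, ELF subtypes hang off application/x-elf):

```go
// Sketch: flag ELF binaries with gabriel-vasile/mimetype. Executables,
// shared libs, objects and coredumps are children of application/x-elf.
package main

import (
	"fmt"
	"os"

	"github.com/gabriel-vasile/mimetype"
)

func isELF(path string) (bool, error) {
	mtype, err := mimetype.DetectFile(path)
	if err != nil {
		return false, err
	}
	// Walk up the MIME hierarchy to catch the ELF subtypes too.
	for m := mtype; m != nil; m = m.Parent() {
		if m.Is("application/x-elf") {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	ok, err := isELF(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(ok)
}
```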

@evverx
Contributor Author

evverx commented Dec 23, 2021

Judging from #1408 (comment), this should be addressed by allowing paths to be specified in config files. While I don't think that's ideal, it's one way to do it and it has its advantages. To level the playing field, though, I think scorecard shouldn't filter out the "testdata" directories, because that affects the "raw" results.

@evverx evverx closed this as completed Dec 23, 2021
@laurentsimon
Contributor

Judging from #1408 (comment), this should be addressed by allowing paths to be specified in config files. While I don't think that's ideal, it's one way to do it and it has its advantages. To level the playing field, though, I think scorecard shouldn't filter out the "testdata" directories, because that affects the "raw" results.

good point. We'll remove it once we have the raw results ready.
