Windows Docker image analysis techniques #238

Closed
JonoYang opened this issue Jul 15, 2021 · 3 comments

@JonoYang
Member

The Docker pipeline in scanpipe should be extended to analyze Windows Docker images in the same way we analyze Debian Docker images. Because Windows has no central package manager or comparable system for tracking installed packages, we must combine several techniques to find them:

  1. Report MSI installer files as packages.
  2. Report installed programs found in Windows registry files.
  3. Report Microsoft Update Manifest (.mum) files.
  4. Ignore uninteresting files.
    • Files such as desktop.ini, Thumbs.db, and registry files should be tagged as uninteresting and ignored, because they carry no information about third-party packages.
  5. Check common file locations for third-party software.
    • For example, Python installs itself at the root of the drive in a Python27 or Python<version> directory, and OpenJDK does something similar. We can maintain a collection of these paths.
  6. Use the NSRL (https://www.nist.gov/itl/ssd/software-quality-group/national-software-reference-library-nsrl) to match known files from software packages.
    • We can check file hashes against this database.
  7. Parse SWID files and SWID tags.
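
To make techniques 4 and 5 more concrete, here is a rough sketch of how they could work. It is only an illustration, not the scanpipe implementation: the names `UNINTERESTING_NAMES`, `KNOWN_SOFTWARE_PATTERNS`, `is_uninteresting`, and `find_known_software` are hypothetical, the patterns are guesses at what such install directories look like, and paths are assumed to be relative to the drive root (with an optional drive letter).

```python
import re

# Hypothetical set of file names that carry no third-party package information.
UNINTERESTING_NAMES = {"desktop.ini", "thumbs.db"}

# Hypothetical patterns for well-known install directories at the drive root.
# Each pattern captures the version embedded in the directory name, if any.
KNOWN_SOFTWARE_PATTERNS = {
    "python": re.compile(
        r"^(?:[a-z]:)?/?python(?P<version>\d+(?:\.\d+)*)?/", re.IGNORECASE
    ),
    "openjdk": re.compile(
        r"^(?:[a-z]:)?/?openjdk-?(?P<version>[\d.]+)?/", re.IGNORECASE
    ),
}


def normalize(path):
    """Use forward slashes so Windows and POSIX style paths compare equally."""
    return path.replace("\\", "/")


def is_uninteresting(path):
    """Return True if the file name is in the set of ignorable names."""
    name = normalize(path).rsplit("/", 1)[-1]
    return name.lower() in UNINTERESTING_NAMES


def find_known_software(path):
    """Return (name, version) if the path sits under a well-known install
    directory, else None. The version is None when the directory name
    does not embed one."""
    normalized = normalize(path)
    for name, pattern in KNOWN_SOFTWARE_PATTERNS.items():
        match = pattern.match(normalized)
        if match:
            return name, match.group("version")
    return None
```

With these assumed patterns, `find_known_software("Python27/python.exe")` returns `("python", "27")`, and a path such as `Windows/System32/desktop.ini` is flagged by `is_uninteresting`. Anchoring the patterns at the drive root keeps a nested directory that merely happens to be named `Python27` from being reported.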
@JonoYang JonoYang self-assigned this Jul 15, 2021
JonoYang added a commit that referenced this issue Jul 16, 2021
    * Update docstrings
    * Pin fetchcode dep

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Jul 16, 2021
JonoYang added a commit that referenced this issue Jul 17, 2021
JonoYang added a commit that referenced this issue Jul 26, 2021
@JonoYang
Member Author

Regarding technique 6 above: the NSRL does not appear to index the container versions of Windows. The latest Windows release on record in the collection is the ISO contents of version 1903, and most of the indexed Windows system files come from versions released before 2019.
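
For reference, the kind of hash lookup I have in mind could look roughly like the sketch below. It assumes the known SHA-1 values have already been exported from the NSRL data into a set of hex strings; how that export is done is not covered here, and the function names `sha1_of_file` and `split_by_known_hashes` are hypothetical.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the uppercase hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_by_known_hashes(paths, known_sha1s):
    """Split paths into (known, unknown) depending on whether each file's
    SHA-1 appears in known_sha1s (an assumed set of hex strings)."""
    # Normalize case so lowercase and uppercase hex strings both match.
    known_set = {value.upper() for value in known_sha1s}
    known, unknown = [], []
    for path in paths:
        if sha1_of_file(path) in known_set:
            known.append(path)
        else:
            unknown.append(path)
    return known, unknown
```

Given the coverage gap described above, most files from a recent Windows container image would likely land in the `unknown` list, so this lookup could only complement the other techniques.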

JonoYang added a commit that referenced this issue Jul 28, 2021
Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Jul 29, 2021
    * Modify regex used for Windows container analysis so it can be used outside the context of a Windows Docker image
    * Update tests

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Jul 29, 2021
    * Create pipes that ignore media files and data files with no clues
    * Update test results

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Jul 30, 2021
Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Aug 2, 2021
    * Update docstrings
    * Pin fetchcode dep

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Aug 2, 2021
    * Modify regex used for Windows container analysis so it can be used outside the context of a Windows Docker image
    * Update tests

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Aug 2, 2021
    * Create pipes that ignore media files and data files with no clues
    * Update test results

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Aug 2, 2021
    * Use InstalledWindowsProgram object instead of Package

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Aug 2, 2021
    * Update tests with more paths to test regex patterns

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Aug 3, 2021
    * Use re.match instead of re.split
    * Rename WindowsDocker pipeline to DockerWindows
    * Set the default value of the q_objects argument for tag_installed_package_files to be a tuple

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Aug 3, 2021
    * Update test results

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Aug 3, 2021
    * Update test

Signed-off-by: Jono Yang <[email protected]>
tdruez added a commit that referenced this issue Aug 4, 2021
Signed-off-by: Thomas Druez <[email protected]>
tdruez added a commit that referenced this issue Aug 4, 2021
* Use newer version of container libraries

Signed-off-by: Philippe Ombredanne <[email protected]>

* Use new container-inspector structures

Signed-off-by: Philippe Ombredanne <[email protected]>

* Add minimal support for Windows containers

Signed-off-by: Philippe Ombredanne <[email protected]>

* Update Windows package getter

    * The windows_helper module from scancode is not available on pypi

Signed-off-by: Jono Yang <[email protected]>

* Use newer version of container libraries

Signed-off-by: Philippe Ombredanne <[email protected]>

* Update call to windows_helper to win_reg

Signed-off-by: Jono Yang <[email protected]>

* Create new pipeline for Windows Docker images

    * Create Windows specific tag_uninteresting_windows_codebase_resources function

Signed-off-by: Jono Yang <[email protected]>

* Add function to find packages at well-known paths

    * Update tests

Signed-off-by: Jono Yang <[email protected]>

* Add step to tag known software in pipeline

    * Change name of Docker step from "find_images_linux_distro" to "find_images_os_and_distro"

Signed-off-by: Jono Yang <[email protected]>

* Get version from path in tag_known_software #238

    * Update docstrings
    * Pin fetchcode dep

Signed-off-by: Jono Yang <[email protected]>

* Troubleshoot regex patterns #238

Signed-off-by: Jono Yang <[email protected]>

* Report Program File contents as packages #238

Signed-off-by: Jono Yang <[email protected]>

* Update Windows-specific regex

    * Add more file names and file extensions to be ignored
    * Update expected test results

Signed-off-by: Jono Yang <[email protected]>

* Do not ignore .mui files #238

Signed-off-by: Jono Yang <[email protected]>

* Filter using extension field rather than path #238

Signed-off-by: Jono Yang <[email protected]>

* Update scanpipe/pipes/docker.py

Create issue to track extraction issue

See #251

Signed-off-by: Philippe Ombredanne <[email protected]>

* Fix scancode-toolkit pinned version in base.txt #238

Signed-off-by: Jono Yang <[email protected]>

* Create pipeline step to tag ignorable files #252

Signed-off-by: Jono Yang <[email protected]>

* Update formatting #238

Signed-off-by: Jono Yang <[email protected]>

* Generalize regex expressions #238

    * Modify regex used for Windows container analysis so it can be used outside the context of a Windows Docker image
    * Update tests

Signed-off-by: Jono Yang <[email protected]>

* Create new pipes for ignoring files #238

    * Create pipes that ignore media files and data files with no clues
    * Update test results

Signed-off-by: Jono Yang <[email protected]>

* Add more file extensions to ignore #238

Signed-off-by: Jono Yang <[email protected]>

* Bump dep versions #238

Signed-off-by: Jono Yang <[email protected]>

* Update docstring #238

    * Use InstalledWindowsProgram object instead of Package

Signed-off-by: Jono Yang <[email protected]>

* Improve regex used in tag_known_software #238

    * Update tests with more paths to test regex patterns

Signed-off-by: Jono Yang <[email protected]>

* Adjust code for consistency across the codebase #181

Signed-off-by: Thomas Druez <[email protected]>

* Address PR comments #238

    * Use re.match instead of re.split
    * Rename WindowsDocker pipeline to DockerWindows
    * Set the default value of the q_objects argument for tag_installed_package_files to be a tuple

Signed-off-by: Jono Yang <[email protected]>

* Add is_media field to CodebaseResource #238

    * Update test results

Signed-off-by: Jono Yang <[email protected]>

* Simplify tag_media_files_as_unintersting() #238

    * Update test

Signed-off-by: Jono Yang <[email protected]>

* Refine windows pipes #238

Signed-off-by: Thomas Druez <[email protected]>

Co-authored-by: Jono Yang <[email protected]>
Co-authored-by: Thomas Druez <[email protected]>
@tdruez
Contributor

tdruez commented Aug 4, 2021

@JonoYang can we close this one?

@JonoYang
Member Author

JonoYang commented Aug 5, 2021

@tdruez Yep. These features have been merged in.

@JonoYang JonoYang closed this as completed Aug 5, 2021