chore: refactor provenance level 3 check into analysis #817

Merged
merged 23 commits into from
Mar 11, 2025
119cf3b
chore: refactor level 3 provenance check
benmss Jan 13, 2025
6f1f4d9
chore: add docs
benmss Jan 13, 2025
e446b9b
chore: minor fix
benmss Jan 13, 2025
08eb999
chore: add SLSA version value to Provenance table
benmss Jan 14, 2025
f054ef8
chore: minor fix
benmss Jan 14, 2025
ce5dc16
chore: account for witness provenance
benmss Jan 15, 2025
a2d54d4
chore: pylint
benmss Jan 15, 2025
6a114b4
chore: store provenance as String type
benmss Jan 16, 2025
a1e56a6
chore: use build type function; remove old slsa verifier check
benmss Jan 16, 2025
9ff8a50
chore: update comments
benmss Feb 3, 2025
c1d47d1
chore: silence bugged pylint check with comment
benmss Feb 17, 2025
9af137a
chore: remove pylint check suppression
benmss Feb 17, 2025
8cc0348
chore: rename inferred provenance
benmss Feb 18, 2025
29b4d70
chore: make provenance verification flag a command line option and di…
benmss Mar 5, 2025
6d00430
chore: minor fix
benmss Mar 5, 2025
dc35afc
chore: add new verify provenance command to integration tests
benmss Mar 5, 2025
a72ac05
chore: prevent l3 verify with command option, and update relevant tests
benmss Mar 7, 2025
54c1c20
chore: update docs and tutorial
benmss Mar 7, 2025
5ae92f0
chore: replace l3 check with provenance verify and slsa level 3 in po…
benmss Mar 7, 2025
a7cbda8
chore: finish removing provenance level 3 check
benmss Mar 7, 2025
963b783
chore: remove references to removed check
benmss Mar 7, 2025
e4890ce
chore: remove check from docs
benmss Mar 8, 2025
864dfac
chore: address PR feedback
benmss Mar 10, 2025
5 changes: 5 additions & 0 deletions docs/source/pages/cli_usage/command_analyze.rst
@@ -80,6 +80,11 @@ Options

The path to the local .m2 directory. If this option is not used, Macaron will use the default location at $HOME/.m2.

.. option:: --verify-provenance

Enable provenance verification during the analysis. Verification is disabled by default because it can be slow in some cases due to I/O-bound operations.


-----------
Environment
-----------
@@ -0,0 +1,34 @@
macaron.provenance package
==========================

.. automodule:: macaron.provenance
:members:
:undoc-members:
:show-inheritance:

Submodules
----------

macaron.provenance.provenance\_extractor module
-----------------------------------------------

.. automodule:: macaron.provenance.provenance_extractor
:members:
:undoc-members:
:show-inheritance:

macaron.provenance.provenance\_finder module
--------------------------------------------

.. automodule:: macaron.provenance.provenance_finder
:members:
:undoc-members:
:show-inheritance:

macaron.provenance.provenance\_verifier module
----------------------------------------------

.. automodule:: macaron.provenance.provenance_verifier
:members:
:undoc-members:
:show-inheritance:
@@ -17,22 +17,6 @@ macaron.repo\_finder.commit\_finder module
:undoc-members:
:show-inheritance:

macaron.repo\_finder.provenance\_extractor module
-------------------------------------------------

.. automodule:: macaron.repo_finder.provenance_extractor
:members:
:undoc-members:
:show-inheritance:

macaron.repo\_finder.provenance\_finder module
----------------------------------------------

.. automodule:: macaron.repo_finder.provenance_finder
:members:
:undoc-members:
:show-inheritance:

macaron.repo\_finder.repo\_finder module
----------------------------------------

1 change: 1 addition & 0 deletions docs/source/pages/developers_guide/apidoc/macaron.rst
@@ -20,6 +20,7 @@ Subpackages
macaron.output_reporter
macaron.parsers
macaron.policy_engine
macaron.provenance
macaron.repo_finder
macaron.repo_verifier
macaron.slsa_analyzer
@@ -89,14 +89,6 @@ macaron.slsa\_analyzer.checks.provenance\_commit\_check module
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.checks.provenance\_l3\_check module
----------------------------------------------------------

.. automodule:: macaron.slsa_analyzer.checks.provenance_l3_check
:members:
:undoc-members:
:show-inheritance:

macaron.slsa\_analyzer.checks.provenance\_l3\_content\_check module
-------------------------------------------------------------------

4 changes: 2 additions & 2 deletions docs/source/pages/tutorials/npm_provenance.rst
@@ -42,7 +42,7 @@ To perform an analysis on the latest version of semver (when this tutorial was w

.. code-block:: shell

./run_macaron.sh analyze -purl pkg:npm/semver@7.6.2
./run_macaron.sh analyze -purl pkg:npm/semver@7.6.2 --verify-provenance

The analysis involves Macaron downloading the contents of the target repository to the configured, or default, ``output`` folder. Results from the analysis, including checks, are stored in the database found at ``output/macaron.db`` (See :ref:`Output Files Guide <output_files_guide>`). Once the analysis is complete, Macaron will also produce a report in the form of a HTML file.

@@ -52,7 +52,7 @@ During this analysis, Macaron will retrieve two provenance files from the npm re

.. note:: Most of the details from the two provenance files can be found through the links provided on the artifacts page on the npm website. In particular: `Sigstore Rekor <https://search.sigstore.dev/?logIndex=92391688>`_. The provenance file itself can be found at: `npm registry <https://registry.npmjs.org/-/npm/v1/attestations/semver@7.6.2>`_.

Of course to reliably say the above does what is claimed here, proof is needed. For this we can rely on the check results produced from the analysis run. In particular, we want to know the results of three checks: ``mcn_provenance_derived_repo_1``, ``mcn_provenance_derived_commit_1``, and ``mcn_provenance_verified_1``. The first two to ensure that the commit and the repository being analyzed match those found in the provenance file, and the last check to ensure that the provenance file has been verified.
To confirm that the analysis does what is claimed here, we can rely on the check results it produces. In particular, we want the results of three checks: ``mcn_provenance_derived_repo_1``, ``mcn_provenance_derived_commit_1``, and ``mcn_provenance_verified_1``. The first two ensure that the repository and the commit being analyzed match those found in the provenance file, and the last ensures that the provenance file has been verified. For the third check to pass, provenance verification must be enabled with the ``--verify-provenance`` command-line argument, as demonstrated above. Verification is disabled by default because it can be slow in some cases due to I/O-bound operations.

.. _fig_semver_7.6.2_report:

17 changes: 12 additions & 5 deletions src/macaron/__main__.py
@@ -1,4 +1,4 @@
# Copyright (c) 2022 - 2024, Oracle and/or its affiliates. All rights reserved.
# Copyright (c) 2022 - 2025, Oracle and/or its affiliates. All rights reserved.
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.

"""This is the main entrypoint to run Macaron."""
@@ -32,7 +32,6 @@

def analyze_slsa_levels_single(analyzer_single_args: argparse.Namespace) -> None:
"""Run the SLSA checks against a single target repository."""
deps_depth = None
if analyzer_single_args.deps_depth == "inf":
deps_depth = -1
else:
@@ -173,7 +172,8 @@ def analyze_slsa_levels_single(analyzer_single_args: argparse.Namespace) -> None
analyzer_single_args.sbom_path,
deps_depth,
provenance_payload=prov_payload,
validate_malware_switch=analyzer_single_args.validate_malware_switch,
validate_malware=analyzer_single_args.validate_malware,
verify_provenance=analyzer_single_args.verify_provenance,
)
sys.exit(status_code)

@@ -360,7 +360,7 @@ def main(argv: list[str] | None = None) -> None:
help="The directory where Macaron looks for already cloned repositories.",
)

# Add sub parsers for each action
# Add sub parsers for each action.
sub_parser = main_parser.add_subparsers(dest="action", help="Run macaron <action> --help for help")

# Use Macaron to analyze one single repository.
@@ -470,12 +470,19 @@ def main(argv: list[str] | None = None) -> None:
)

single_analyze_parser.add_argument(
"--validate-malware-switch",
"--validate-malware",
required=False,
action="store_true",
help=("Enable malware validation."),
)

single_analyze_parser.add_argument(
"--verify-provenance",
required=False,
action="store_true",
help=("Allow the analysis to attempt to verify provenance files as part of its normal operations."),
)

# Dump the default values.
sub_parser.add_parser(name="dump-defaults", description="Dumps the defaults.ini file to the output directory.")
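As a rough, self-contained sketch of how the two boolean options above behave, the following builds a stand-alone parser with the same `store_true` flags. The names `build_parser` and `macaron-sketch` are illustrative assumptions and not part of Macaron; only the flag names and their `action` come from the diff. Because `store_true` defaults to `False`, provenance verification stays off unless the flag is passed explicitly.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser mirroring the two flags (hypothetical helper, not Macaron code)."""
    parser = argparse.ArgumentParser(prog="macaron-sketch")
    sub_parser = parser.add_subparsers(dest="action")
    analyze = sub_parser.add_parser("analyze")
    # Both options are plain switches: absent means False, present means True.
    analyze.add_argument("--validate-malware", required=False, action="store_true")
    analyze.add_argument("--verify-provenance", required=False, action="store_true")
    return parser


parser = build_parser()
default_args = parser.parse_args(["analyze"])
enabled_args = parser.parse_args(["analyze", "--verify-provenance"])

# argparse converts dashes in option names to underscores in attribute names.
print(default_args.verify_provenance)  # False
print(enabled_args.verify_provenance)  # True
```

The attribute name `verify_provenance` is what the call site in `analyze_slsa_levels_single` reads from the parsed namespace.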

7 changes: 1 addition & 6 deletions src/macaron/config/defaults.ini
@@ -46,11 +46,6 @@ validate = True
# The CycloneDX schema version used for validation.
schema = 1.6

# This is the Analyzer section used as part of Macaron's analysis.
[analyzer]
# This enables or disables attempts at verification of provenance.
verify_provenance = True

# This is the repo finder script.
[repofinder]
find_repos = True
@@ -569,7 +564,7 @@ purl_endpoint = v3alpha/purl
# [analysis.checks]
# exclude =
# mcn_build_as_code_1
# mcn_provenance_level_three_1
# mcn_provenance_verified_1
# include = *
# ```
# 3. Exclude multiple checks that start with `mcn_provenance`:
75 changes: 69 additions & 6 deletions src/macaron/database/db_custom_types.py
@@ -1,13 +1,21 @@
# Copyright (c) 2023 - 2024, Oracle and/or its affiliates. All rights reserved.
# Copyright (c) 2023 - 2025, Oracle and/or its affiliates. All rights reserved.
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.

"""This module implements SQLAlchemy type for converting date format to RFC3339 string representation."""
"""This module implements SQLAlchemy types for Python data types that cannot be automatically stored."""

import datetime
import json
from typing import Any

from sqlalchemy import JSON, String, TypeDecorator

from macaron.slsa_analyzer.provenance.intoto import (
InTotoPayload,
InTotoV01Payload,
InTotoV1Payload,
validate_intoto_payload,
)


class RFC3339DateTime(TypeDecorator): # pylint: disable=W0223
"""
@@ -36,7 +44,7 @@ def process_bind_param(self, value: None | Any, dialect: Any) -> None | str:
if the provided ``datetime`` is a naive ``datetime`` object then UTC is added.

value: None | datetime.datetime
The value being stored
The value being stored.
"""
if value is None:
return None
@@ -52,7 +60,7 @@ def process_result_value(self, value: None | str, dialect: Any) -> None | dateti
If the deserialized ``datetime`` has a timezone then return it, otherwise add UTC as its timezone.

value: None | str
The value being loaded
The value being loaded.
"""
if value is None:
return None
@@ -76,7 +84,7 @@ def process_bind_param(self, value: None | dict, dialect: Any) -> None | dict:
"""Process when storing a dict object to the SQLite db.

value: None | dict
The value being stored
The value being stored.
"""
if not isinstance(value, dict):
raise TypeError("DBJsonDict type expects a dict.")
@@ -87,8 +95,63 @@ def process_result_value(self, value: None | dict, dialect: Any) -> None | dict:
"""Process when loading a dict object from the SQLite db.

value: None | dict
The value being loaded
The value being loaded.
"""
if not isinstance(value, dict):
raise TypeError("DBJsonDict type expects a dict.")
return value


class ProvenancePayload(TypeDecorator): # pylint: disable=W0223
"""SQLAlchemy column type to serialize InTotoProvenance."""

# It is stored in the database as a String value.
impl = String

# To prevent Sphinx from rendering the docstrings for `cache_ok`, make this docstring private.
#: :meta private:
cache_ok = True

def process_bind_param(self, value: InTotoPayload | None, dialect: Any) -> str | None:
"""Process when storing an InTotoPayload object to the SQLite db.

value: InTotoPayload | None
The value being stored.
"""
if value is None:
return None

if not isinstance(value, InTotoPayload):
raise TypeError("ProvenancePayload type expects an InTotoPayload.")

payload_type = value.__class__.__name__
payload_dict = {"payload_type": payload_type, "payload": value.statement}
return json.dumps(payload_dict)

def process_result_value(self, value: str | None, dialect: Any) -> InTotoPayload | None:
"""Process when loading an InTotoPayload object from the SQLite db.

value: str | None
The value being loaded.
"""
if value is None:
return None

try:
payload_dict = json.loads(value)
except ValueError as error:
raise TypeError(f"Error parsing str as JSON: {error}") from error

if not isinstance(payload_dict, dict):
raise TypeError("Parsed data is not a dict.")

if "payload_type" not in payload_dict or "payload" not in payload_dict:
raise TypeError("Missing keys in dict for ProvenancePayload type.")

payload = payload_dict["payload"]
if payload_dict["payload_type"] == "InTotoV01Payload":
return InTotoV01Payload(statement=payload)
if payload_dict["payload_type"] == "InTotoV1Payload":
return InTotoV1Payload(statement=payload)

return validate_intoto_payload(payload)
20 changes: 16 additions & 4 deletions src/macaron/database/table_definitions.py
@@ -34,7 +34,7 @@

from macaron.artifact.maven import MavenSubjectPURLMatcher
from macaron.database.database_manager import ORMBase
from macaron.database.db_custom_types import RFC3339DateTime
from macaron.database.db_custom_types import ProvenancePayload, RFC3339DateTime
from macaron.errors import InvalidPURLError
from macaron.repo_finder.repo_finder_enums import CommitFinderInfo, RepoFinderInfo
from macaron.slsa_analyzer.provenance.intoto import InTotoPayload, ProvenanceSubjectPURLMatcher
@@ -491,16 +491,28 @@ class Provenance(ORMBase):
component: Mapped["Component"] = relationship(back_populates="provenance")

#: The SLSA version.
version: Mapped[str] = mapped_column(String, nullable=False)
slsa_version: Mapped[str] = mapped_column(String, nullable=True)

#: The SLSA level.
slsa_level: Mapped[int] = mapped_column(Integer, default=0)

#: The release tag commit sha.
release_commit_sha: Mapped[str] = mapped_column(String, nullable=True)

#: The release tag.
release_tag: Mapped[str] = mapped_column(String, nullable=True)

#: The provenance payload content in JSON format.
provenance_json: Mapped[str] = mapped_column(String, nullable=False)
#: The repository URL from the provenance.
repository_url: Mapped[str] = mapped_column(String, nullable=True)

#: The commit sha from the provenance.
commit_sha: Mapped[str] = mapped_column(String, nullable=True)

#: The provenance payload.
provenance_payload: Mapped[InTotoPayload] = mapped_column(ProvenancePayload, nullable=False)

#: The verified status of the provenance.
verified: Mapped[bool] = mapped_column(Boolean, nullable=False, default=False)

#: A one-to-many relationship with the release artifacts.
artifact: Mapped[list["ReleaseArtifact"]] = relationship(back_populates="provenance")
4 changes: 4 additions & 0 deletions src/macaron/provenance/__init__.py
@@ -0,0 +1,4 @@
# Copyright (c) 2024 - 2025, Oracle and/or its affiliates. All rights reserved.
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/.

"""This package contains the provenance tools for software components."""