Skip to content

Commit

Permalink
feat: replace lockfile detection with phylum status
Browse files Browse the repository at this point in the history
The changes here are centered around the use of `phylum status` for
project and dependency file information. The `phylum status` command for
printing project and lockfile details was added in CLI v5.1.0 but the
fix to search for manifests' lockfiles in parent, rather than child
directories was added in CLI v5.6.0, making this the new minimum CLI
version required for new and existing installs.

Now that both lockfiles and manifests are supported, the "lockfile"
language has been changed to the more general term "dependency file"
where it is externally visible (e.g., log and help output) but kept as
"lockfile" internally (e.g., code/variable names).

It is still not clear what to do for the situations where dependency
files require more context for lockfile generation than just the
previous/base version of that same file.

Actions taken include:

* Use `phylum status` for common tasks instead of custom code to
  * look for dependency files in `.phylum_project`
  * detect the dependency files present
  * acquire the initial project file backup
* Downgrade the minimum CLI version required for new/existing installs
* Remove the straddling code for `phylum analyze` command usage
  * No longer needed now that the minimum CLI version is > v5.3.1-rc1
* Remove the CINone pre-requisite
  * Needing to run from the root of a git repository is no longer true
* Provide warning message when attempting to create a Phylum project
  outside the top-level of a `git` repository
* Update externally visible `lockfile` language throughout
  * Use `dependency file` instead, to represent lockfiles and manifests
* Format and refactor throughout
  * Improve log messages

Closes #244
  • Loading branch information
maxrake committed Oct 4, 2023
1 parent 79a5fbe commit bef93fa
Show file tree
Hide file tree
Showing 9 changed files with 119 additions and 187 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/phylum_analyze_pr.yml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# This is a workflow for analyzing dependency lockfiles
# This is a workflow for analyzing dependency files
# in this repository with Phylum during pull requests.
---
name: Phylum_analyze
Expand Down
4 changes: 2 additions & 2 deletions .pre-commit-hooks.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,8 @@
# See https://pre-commit.com for more information
---
- id: phylum
name: analyze lockfiles with phylum
description: Run `phylum` on dependency lockfiles
name: analyze dependencies with phylum
description: Run `phylum` on dependency files
entry: phylum-ci
language: python
stages: [commit]
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@ The options for `phylum-init`, automatically updated to be current for the lates

#### `phylum-ci` Script Entry Point

The `phylum-ci` script is for analyzing lockfile changes.
The `phylum-ci` script is for analyzing dependency file (lockfiles and manifests) changes.
The script can be used locally or from within a Continuous Integration (CI) environment.
It will attempt to detect the CI platform based on the environment from which it is run and act accordingly.
The current CI platforms/environments supported are:
Expand All @@ -150,8 +150,8 @@ The current CI platforms/environments supported are:
|Git `pre-commit` Hooks|[Documentation][precommit_docs]|

There is also support for local use. This is the "fall-through" case used when no other environment is detected.
This can be useful to analyze lockfiles locally, prior to or after submitting a pull/merge request (PR/MR) to a CI
system. It can also help in establishing a successful submission prior to submitting a PR/MR to a CI system.
This can be useful to analyze dependency files locally, prior to or after submitting a pull/merge request (PR/MR) to a
CI system. It can also help in establishing a successful submission prior to submitting a PR/MR to a CI system.
Additionally, local use can aid troubleshooting after submitting a PR/MR to a CI system and getting unexpected results.

The options for `phylum-ci`, automatically updated to be current for the latest release:
Expand Down
178 changes: 72 additions & 106 deletions src/phylum/ci/ci_base.py

Large diffs are not rendered by default.

10 changes: 1 addition & 9 deletions src/phylum/ci/ci_none.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@
"""
import argparse
from functools import cached_property
from pathlib import Path
import re
import subprocess
from typing import Optional
Expand All @@ -28,17 +27,10 @@ def _check_prerequisites(self) -> None:
"""Ensure the necessary pre-requisites are met and bail when they aren't.
These are the current pre-requisites for when no CI environments/platforms is detected:
* Run the script from the root of a git repository
* (None at this time)
"""
super()._check_prerequisites()

git_dir = Path.cwd() / ".git"
if git_dir.is_dir():
LOG.debug("Existing [code].git[/] directory found at the current working directory", extra={"markup": True})
else:
msg = "This script expects to be run from the top level of a `git` repository"
raise SystemExit(msg)

@cached_property
def phylum_label(self) -> str:
"""Get a custom label for use when submitting jobs for analysis."""
Expand Down
16 changes: 8 additions & 8 deletions src/phylum/ci/ci_precommit.py
Original file line number Diff line number Diff line change
Expand Up @@ -54,31 +54,31 @@ def _check_prerequisites(self) -> None:
if sorted(staged_files) == sorted(self.extra_args):
LOG.debug("The extra args provided exactly match the list of staged files")
if any(lockfile.path in extra_arg_paths for lockfile in self.lockfiles):
LOG.info("Valid pre-commit scenario found: lockfile(s) found in extra arguments")
LOG.info("Valid pre-commit scenario found: dependency file(s) found in extra arguments")
return
LOG.warning("A lockfile is not included in extra args. Nothing to do. Exiting ...")
LOG.warning("A dependency file is not included in extra args. Nothing to do. Exiting ...")
sys.exit(0)

# Allow for a pre-commit config set up to filter the files sent to the hook
if all(extra_arg in staged_files for extra_arg in self.extra_args):
LOG.debug("All the extra args are staged files")
if any(lockfile.path in extra_arg_paths for lockfile in self.lockfiles):
LOG.info("Valid pre-commit scenario found: lockfile(s) found in extra arguments")
LOG.info("Valid pre-commit scenario found: dependency file(s) found in extra arguments")
return
LOG.warning("A lockfile is not included in extra args. Nothing to do. Exiting ...")
LOG.warning("A dependency file is not included in extra args. Nothing to do. Exiting ...")
sys.exit(0)

# Allow for cases where the lockfile is included or explicitly specified (e.g., `pre-commit run --all-files`)
if any(lockfile.path in extra_arg_paths for lockfile in self.lockfiles):
LOG.info("A lockfile was included in the extra args")
LOG.info("A dependency file was included in the extra args")
# NOTE: There is still the case where a lockfile is "accidentally" included as an extra argument. For example,
# `phylum-ci poetry.lock` was used instead of `phylum-ci --lockfile poetry.lock`, which is bad syntax but
# nonetheless results in the `CIPreCommit` environment used instead of `CINone`. This is not terrible; it
# just might be a slightly confusing corner case. It might be possible to use a library like `psutil` to
# acquire the command line from the parent process and inspect it for `pre-commit` usage. That is a
# heavyweight solution and one that will not be pursued until the need for it is more clear.
else:
LOG.warning("A lockfile was not included in the extra args...possible invalid pre-commit scenario")
LOG.warning("A dependency file was not included in the extra args...possible invalid pre-commit scenario")
LOG.error("Unrecognized arguments: [code]%s[/]", " ".join(self.extra_args), extra={"markup": True})
sys.exit(0)

Expand Down Expand Up @@ -112,10 +112,10 @@ def is_any_lockfile_changed(self) -> bool:
staged_files = [Path(staged_file).resolve() for staged_file in self.extra_args]
for lockfile in self.lockfiles:
if lockfile.path in staged_files:
LOG.debug("The lockfile [code]%r[/] has changed", lockfile, extra={"markup": True})
LOG.debug("The dependency file [code]%r[/] has changed", lockfile, extra={"markup": True})
lockfile.is_lockfile_changed = True
else:
LOG.debug("The lockfile [code]%r[/] has [b]NOT[/] changed", lockfile, extra={"markup": True})
LOG.debug("The dependency file [code]%r[/] has [b]NOT[/] changed", lockfile, extra={"markup": True})
lockfile.is_lockfile_changed = False
return any(lockfile.is_lockfile_changed for lockfile in self.lockfiles)

Expand Down
18 changes: 9 additions & 9 deletions src/phylum/ci/cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -100,17 +100,17 @@ def get_args(args: Optional[Sequence[str]] = None) -> Tuple[argparse.Namespace,
help="Decrease output verbosity (the maximum is -qq)",
)

analysis_group = parser.add_argument_group(title="Lockfile Analysis Options")
analysis_group = parser.add_argument_group(title="Dependency File Analysis Options")
analysis_group.add_argument(
"-l",
"--lockfile",
type=pathlib.Path,
action="append",
nargs="*",
help="""Path to the package lockfile(s) to analyze. If not specified here or in the `.phylum_project` file, an
attempt will be made to automatically detect the lockfile(s). Some lockfile types (e.g., Python/pip
`requirements.txt`) are ambiguous in that they can be named differently and may or may not contain strict
dependencies. In these cases, it is best to specify an explicit lockfile path.""",
help="""Path to package dependency file(s) (lockfiles or manifests) to analyze. If not specified here or in the
`.phylum_project` file, an attempt will be made to automatically detect the file(s). Some lockfile types
(e.g., Python/pip `requirements.txt`) are ambiguous in that they can be named differently and may or may
not contain strict dependencies. In these cases, it is best to specify an explicit dependency file path.""",
)
analysis_group.add_argument(
"-a",
Expand All @@ -122,7 +122,7 @@ def get_args(args: Optional[Sequence[str]] = None) -> Tuple[argparse.Namespace,
"-f",
"--force-analysis",
action="store_true",
help="Specify this flag to force analysis, even when no lockfile has changed.",
help="Specify this flag to force analysis, even when no dependency file has changed.",
)
analysis_group.add_argument(
"-k",
Expand Down Expand Up @@ -192,15 +192,15 @@ def main(args: Optional[Sequence[str]] = None) -> int:
ci_env = detect_ci_platform(parsed_args, remainder_args)

# Bail early when possible
LOG.debug("Lockfiles in use: %s", ci_env.lockfiles)
LOG.debug("Dependency files in use: %s", ci_env.lockfiles)
if ci_env.force_analysis:
LOG.info("Forced analysis specified with flag or otherwise set. Proceeding with analysis ...")
elif ci_env.is_any_lockfile_changed:
LOG.info("A lockfile has changed. Proceeding with analysis ...")
LOG.info("A dependency file has changed. Proceeding with analysis ...")
elif ci_env.phylum_comment_exists:
LOG.info("Existing Phylum comment found. Proceeding with analysis ...")
else:
LOG.warning("No lockfile has changed. Nothing to do.")
LOG.warning("No dependency file has changed. Nothing to do.")
return 0

# Generate a label to use for analysis and report it
Expand Down
27 changes: 17 additions & 10 deletions src/phylum/ci/lockfile.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,16 @@
"""Define a lockfile implementation.
A Phylum project can consist of multiple lockfiles.
This class represents a single lockfile.
This module/class represents a single lockfile.
For historical purposes, this module uses "lockfile" language instead of the
more general term "dependency file." Both lockfiles and manifests are supported
and the language has been changed where it is externally visible (e.g., log and
help output) but kept as "lockfile" internally (e.g., code/variable names).
"""
from functools import cached_property, lru_cache
import json
import os
from pathlib import Path
import shutil
import subprocess
Expand Down Expand Up @@ -37,10 +43,11 @@ def __init__(self, provided_lockfile: Path, cli_path: Path, common_ancestor_comm

def __repr__(self) -> str:
"""Return a debug printable string representation of the `Lockfile` object."""
# `PurePath.relative_to()` requires `self` to be the subpath of the argument, but `os.path.relpath()` does not.
# NOTE: Any change from this format should be made carefully as caller's
# may be relying on `repr(lockfile)` to provide the relative path.
# Example: print(f"Relative path to lockfile: `{lockfile!r}`") # noqa: ERA001 ; commented code intended
return str(self.path.relative_to(Path.cwd()))
return os.path.relpath(self.path)

def __str__(self) -> str:
"""Return the nicely printable string representation of the `Lockfile` object."""
Expand Down Expand Up @@ -85,7 +92,7 @@ def base_deps(self) -> Packages:
"""
prev_lockfile_object = self.previous_lockfile_object()
if not prev_lockfile_object:
LOG.info("No previous lockfile object found for `%r`. Assuming no base dependencies.", self)
LOG.info("No previous dependency file object found for `%r`. Assuming no base dependencies.", self)
return []
prev_lockfile_packages = sorted(set(self.get_previous_lockfile_packages(prev_lockfile_object)))
return prev_lockfile_packages
Expand All @@ -103,7 +110,7 @@ def new_deps(self) -> Packages:

prev_lockfile_object = self.previous_lockfile_object()
if not prev_lockfile_object:
LOG.debug("No previous lockfile object found for `%r`. Assuming all current packages are new.", self)
LOG.debug("No previous dependency file object found for `%r`. Assuming all current packages are new.", self)
return curr_lockfile_packages

prev_lockfile_packages = self.get_previous_lockfile_packages(prev_lockfile_object)
Expand All @@ -125,7 +132,7 @@ def current_lockfile_packages(self) -> Packages:
cmd = [str(self.cli_path), "parse", str(self.path)]
parse_result = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout.strip() # noqa: S603
except subprocess.CalledProcessError as err:
msg = f"Is [reverse]{self!r}[/] a valid lockfile? If so, please report this as a bug."
msg = f"Is [reverse]{self!r}[/] a valid dependency file? If so, please report this as a bug."
raise PhylumCalledProcessError(err, msg) from err
parsed_pkgs = json.loads(parse_result)
curr_lockfile_packages = [PackageDescriptor(**pkg) for pkg in parsed_pkgs]
Expand All @@ -147,8 +154,8 @@ def previous_lockfile_object(self) -> Optional[str]:
except subprocess.CalledProcessError as err:
# There could be a true error, but the working assumption when here is a previous version does not exist
msg = """\
There [italic]may[/] be an issue with the attempt to get the previous lockfile object.
Continuing with the assumption a previous lockfile version does not exist ..."""
There [italic]may[/] be an issue with the attempt to get the previous dependency file object.
Continuing with the assumption a previous dependency file version does not exist ..."""
pprint_subprocess_error(err)
LOG.warning(textwrap.dedent(msg), extra={"markup": True})
prev_lockfile_object = None
Expand All @@ -170,7 +177,7 @@ def get_previous_lockfile_packages(self, prev_lockfile_object: str) -> Packages:
prev_lockfile_fd.flush()
except subprocess.CalledProcessError as err:
pprint_subprocess_error(err)
LOG.error("Due to error, assuming no previous lockfile packages. Please report this as a bug.")
LOG.error("Due to error, assuming no previous dependency file packages. Please report this as a bug.")
return []
try:
cmd = [str(self.cli_path), "parse", prev_lockfile_fd.name]
Expand All @@ -183,9 +190,9 @@ def get_previous_lockfile_packages(self, prev_lockfile_object: str) -> Packages:
except subprocess.CalledProcessError as err:
pprint_subprocess_error(err)
msg = f"""\
Due to error, assuming no previous lockfile packages.
Due to error, assuming no previous dependency file packages.
Please report this as a bug if you believe [code]{self!r}[/]
is a valid lockfile at revision [code]{self.common_ancestor_commit}[/]."""
is a valid dependency file at revision [code]{self.common_ancestor_commit}[/]."""
LOG.warning(textwrap.dedent(msg), extra={"markup": True})
return []

Expand Down
45 changes: 6 additions & 39 deletions src/phylum/constants.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,14 @@
from phylum import __version__

# This is the minimum CLI version supported for new installs.
# Support for CycloneDX lockfiles was added in v5.7.0
MIN_CLI_VER_FOR_INSTALL = "v5.7.0"
# The `phylum status` command for printing project and lockfile details was added in v5.1.0
# but the fix to search for manifests' lockfiles in parent, rather than child directories was added in v5.6.0
MIN_CLI_VER_FOR_INSTALL = "v5.6.0"

# This is the minimum CLI version supported for existing installs.
# Support for CycloneDX lockfiles was added in v5.7.0
MIN_CLI_VER_INSTALLED = "v5.7.0"
# The `phylum status` command for printing project and lockfile details was added in v5.1.0
# but the fix to search for manifests' lockfiles in parent, rather than child directories was added in v5.6.0.
MIN_CLI_VER_INSTALLED = "v5.6.0"

# Keys are lowercase machine hardware names as returned from `uname -m`.
# Values are the mapped rustc architecture.
Expand All @@ -25,41 +27,6 @@
"darwin": "apple-darwin",
}

# These are the currently supported lockfiles.
# Keys are the standard lockfile filename, optionally specified with glob syntax.
# Values are the name of the tool that generates the lockfile.
SUPPORTED_LOCKFILES = {
# Javascript/Typescript
"package-lock.json": "npm",
"npm-shrinkwrap.json": "npm",
"yarn.lock": "yarn",
"pnpm-lock.yaml": "pnpm",
# Ruby
"Gemfile.lock": "gem",
# Python
"requirements*.txt": "pip",
"Pipfile.lock": "pipenv",
"poetry.lock": "poetry",
# C#
"*.csproj": "msbuild",
"packages.lock.json": "nuget",
"packages.*.lock.json": "nuget",
# Java
"effective-pom.xml": "mvn",
"gradle.lockfile": "gradle",
# Go
"go.sum": "go",
# Rust
"Cargo.lock": "cargo",
# SBOM
"*.spdx.json": "spdx",
"*.spdx.yaml": "spdx",
"*.spdx.yml": "spdx",
"*.spdx": "spdx",
"*bom.json": "cyclonedx",
"*bom.xml": "cyclonedx",
}

# Timeout value, in seconds, to tell the Python Requests package to stop waiting for a response.
# Reference: https://requests.readthedocs.io/en/latest/user/quickstart/#timeouts
REQ_TIMEOUT: float = 10.0
Expand Down

0 comments on commit bef93fa

Please sign in to comment.