Improve postinstall script resilience (#69)
* Allow postinstall scripts to be executed with
  any Python interpreter (not just the deployed
  base runtime interpreter)
* Generate layer config file as part of layers
* Use a common postinstall script in all layers
* Generate the deployed `sitecustomize.py` file
  from the layer config in the postinstall script
* Add unit tests for the common postinstall script
* Add build env creation test cases (separate from
  the slow lock-and-publish/export test cases)

Closes #66. Implements initial steps towards #19.
ncoghlan authored Nov 7, 2024
1 parent bd6e88e commit b960f32
Showing 36 changed files with 700 additions and 290 deletions.
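
The first bullet of the commit message is easiest to see by contrast with the removed layered-environment script, which derived `pyvenv.cfg` from `sys.executable`. The following is a minimal sketch of the config-driven alternative, not the project's code: `sketch_pyvenv_cfg` is an illustrative name that only mirrors the `generate_pyvenv_cfg` function added in this commit, the deployment path and version are hypothetical, and `PurePosixPath` is used solely so the sketch behaves the same on any platform.

```python
from pathlib import PurePosixPath


def sketch_pyvenv_cfg(base_python_path: PurePosixPath, py_version: str) -> str:
    """Build pyvenv.cfg text purely from recorded layer metadata."""
    if not base_python_path.is_absolute():
        raise RuntimeError("Post-installation must use absolute environment paths")
    lines = [
        f"home = {base_python_path.parent}",
        "include-system-site-packages = false",
        f"version = {py_version}",
        f"executable = {base_python_path}",
        "",  # ensures the file ends with a newline
    ]
    return "\n".join(lines)


# Hypothetical deployed base runtime location and recorded version
base_python = PurePosixPath("/opt/app/cpython-3.12/bin/python")
cfg_text = sketch_pyvenv_cfg(base_python, "3.12.7")
print(cfg_text)
```

Because every value comes from the arguments, the result is the same whichever interpreter happens to run the script.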
3 changes: 3 additions & 0 deletions .github/workflows/update-expected-output.yml
@@ -10,6 +10,9 @@ on:
 paths:
 # Run for changes to *this* workflow file, but not for other workflows
 - ".github/workflows/update-expected-output.yml"
+# Check PRs that update the files injected into deployed environments
+# (the layer config metadata format is also specified in these files)
+- "src/venvstacks/_injected/**/*.py"
 # Check PRs that update the expected test suite output results
 - "tests/expected-output-config.toml"
 - "tests/sample_project/venvstacks.toml"
@@ -0,0 +1,7 @@
Fixed
-----

- Post-installation scripts for layered environments now work
  correctly even when run with a Python installation other
  than the expected base runtime (resolved in :issue:`66`)

11 changes: 11 additions & 0 deletions src/venvstacks/_injected/README.md
@@ -0,0 +1,11 @@
Files injected into deployed environments
=========================================

Files in this folder are injected into the deployed
environments when publishing artifacts or locally
exporting environments.

They are also designed to be importable so that
the build process can access their functionality
without needing to duplicate the implementation,
and to make them more amenable to unit testing.
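
As a rough illustration of the unit-testing benefit, the sketch below checks a generator function directly, without deploying an environment. It is a simplified stand-in rather than the injected module itself: `sketch_sitecustomize` is a hypothetical name, it handles only the `addsitedir` case, and the layer path is invented for the example.

```python
from pathlib import PurePosixPath
from typing import Optional, Sequence


def sketch_sitecustomize(pylib_paths: Sequence[PurePosixPath]) -> Optional[str]:
    """Simplified stand-in: emit one addsitedir() call per linked layer."""
    if not pylib_paths:
        return None  # nothing to customize for this layer
    lines = ["from site import addsitedir"]
    for entry in pylib_paths:
        if not entry.is_absolute():
            raise RuntimeError("Post-installation must use absolute environment paths")
        lines.append(f"addsitedir({str(entry)!r})")
    lines.append("")
    return "\n".join(lines)


# Hypothetical linked layer location
generated = sketch_sitecustomize(
    [PurePosixPath("/opt/app/framework-x/lib/python3.12/site-packages")]
)
# The generated text should at least be valid Python source
compile(generated, "sitecustomize.py", "exec")
print(generated)
```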
180 changes: 180 additions & 0 deletions src/venvstacks/_injected/postinstall.py
@@ -0,0 +1,180 @@
"""venvstacks layer post-installation script
* Loads `./share/venv/metadata/venvstacks_layer.json`
* Generates `pyvenv.cfg` for layered environments
* Generates `sitecustomize.py` for layered environments
* Precompiles all Python files in the library folder
This post-installation script is automatically injected when packing environments.
"""

import json
import os

from compileall import compile_dir
from os.path import abspath
from pathlib import Path
from typing import cast, NotRequired, Sequence, TypedDict

DEPLOYED_LAYER_CONFIG = "share/venv/metadata/venvstacks_layer.json"


class LayerConfig(TypedDict):
    """Additional details needed to fully configure deployed environments"""

    # fmt: off
    python: str  # Relative path to this layer's Python executable
    py_version: str  # Expected X.Y.Z Python version for this environment
    base_python: str  # Relative path from layer dir to base Python executable
    site_dir: str  # Relative path to site-packages within this layer
    pylib_dirs: Sequence[str]  # Relative paths to additional sys.path entries
    dynlib_dirs: Sequence[str]  # Relative paths to additional Windows DLL directories
    launch_module: NotRequired[str]  # Module to run with `-m` to launch the application
    # fmt: on

    # All relative paths are relative to the layer folder (and may refer to peer folders)
    # Base runtime layers will have "python" and "base_python" set to the same value
    # Application layers will have "launch_module" set


class ResolvedLayerConfig(TypedDict):
    """LayerConfig with relative paths resolved for a specific layer location"""

    # fmt: off
    layer_path: Path  # Absolute path to layer environment
    python_path: Path  # Absolute path to this layer's Python executable
    py_version: str  # Expected X.Y.Z Python version for this environment
    base_python_path: Path  # Absolute path from layer dir to base Python executable
    site_path: Path  # Absolute path to site-packages within this layer
    pylib_paths: Sequence[Path]  # Absolute paths to additional sys.path entries
    dynlib_paths: Sequence[Path]  # Absolute paths to additional Windows DLL directories
    launch_module: str|None  # Module to run with `-m` to launch the application
    # fmt: on


def load_layer_config(layer_path: Path) -> ResolvedLayerConfig:
    """Read and resolve config for the specified layer environment"""

    def deployed_path(relative_path: str) -> Path:
        """Normalize path and make it absolute, *without* resolving symlinks"""
        return Path(abspath(layer_path / relative_path))

    config_path = layer_path / DEPLOYED_LAYER_CONFIG
    config_text = config_path.read_text(encoding="utf-8")
    # Tolerate runtime errors for incorrectly generated config files
    config = cast(LayerConfig, json.loads(config_text))
    return ResolvedLayerConfig(
        layer_path=layer_path,
        python_path=deployed_path(config["python"]),
        py_version=config["py_version"],
        base_python_path=deployed_path(config["base_python"]),
        site_path=deployed_path(config["site_dir"]),
        pylib_paths=[deployed_path(d) for d in config["pylib_dirs"]],
        dynlib_paths=[deployed_path(d) for d in config["dynlib_dirs"]],
        launch_module=config.get("launch_module", None),
    )


def generate_pyvenv_cfg(base_python_path: Path, py_version: str) -> str:
    """Generate `pyvenv.cfg` contents for given base Python path and version"""
    if not base_python_path.is_absolute():
        raise RuntimeError("Post-installation must use absolute environment paths")
    venv_config_lines = [
        f"home = {base_python_path.parent}",
        "include-system-site-packages = false",
        f"version = {py_version}",
        f"executable = {base_python_path}",
        "",
    ]
    return "\n".join(venv_config_lines)


_SITE_CUSTOMIZE_HEADER = '''\
"""venvstacks layered environment site customization script
* Calls `site.addsitedir` for any configured Python path entries
* Calls `os.add_dll_directory` for any configured Windows dynlib paths
This venv module is automatically generated by the post-installation script.
"""
'''


def generate_sitecustomize(
    pylib_paths: Sequence[Path],
    dynlib_paths: Sequence[Path],
    *,
    skip_missing_dynlib_paths: bool = True,
) -> str | None:
    """Generate `sitecustomize.py` contents for given linked environment directories"""
    sc_contents = [_SITE_CUSTOMIZE_HEADER]
    if pylib_paths:
        pylib_contents = [
            "# Allow loading modules and packages from linked environments",
            "from site import addsitedir",
        ]
        for path_entry in pylib_paths:
            if not path_entry.is_absolute():
                raise RuntimeError(
                    "Post-installation must use absolute environment paths"
                )
            pylib_contents.append(f"addsitedir({str(path_entry)!r})")
        pylib_contents.append("")
        sc_contents.extend(pylib_contents)
    if dynlib_paths and hasattr(os, "add_dll_directory"):
        dynlib_contents = [
            "# Allow loading misplaced DLLs on Windows",
            "from os import add_dll_directory",
        ]
        for dynlib_path in dynlib_paths:
            if not dynlib_path.is_absolute():
                raise RuntimeError(
                    "Post-installation must use absolute environment paths"
                )
            if skip_missing_dynlib_paths and not dynlib_path.exists():
                # Nothing added DLLs to this folder at build time, so skip it
                # (add_dll_directory fails if the specified folder doesn't exist)
                dynlib_entry = f"# Skipping {str(dynlib_path)!r} (no such directory)"
            else:
                dynlib_entry = f"add_dll_directory({str(dynlib_path)!r})"
            dynlib_contents.append(dynlib_entry)
        dynlib_contents.append("")
        sc_contents.extend(dynlib_contents)
    if len(sc_contents) == 1:
        # Environment layer doesn't actually need customizing
        return None
    return "\n".join(sc_contents)


def _run_postinstall(layer_path: Path) -> None:
    """Run the required post-installation steps in a deployed environment"""

    # Read the layer config file
    config = load_layer_config(layer_path)

    base_python_path = config["base_python_path"]
    if base_python_path != config["python_path"]:
        # Generate `pyvenv.cfg` for layered environments
        venv_config = generate_pyvenv_cfg(base_python_path, config["py_version"])
        venv_config_path = layer_path / "pyvenv.cfg"
        venv_config_path.write_text(venv_config, encoding="utf-8")

    # Generate `sitecustomize.py` for layered environments
    sc_contents = generate_sitecustomize(
        config["pylib_paths"], config["dynlib_paths"]
    )
    if sc_contents is not None:
        sc_path = config["site_path"] / "sitecustomize.py"
        sc_path.write_text(sc_contents, encoding="utf-8")

    # Precompile Python library modules
    pylib_path = (
        layer_path / "lib"
    )  # "Lib" on Windows, but Windows is not case sensitive
    compile_dir(pylib_path, optimize=0, quiet=True)


if __name__ == "__main__":
    # Actually executing the post-installation step in a deployed environment
    _run_postinstall(Path(__file__).parent)
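
To make the path resolution in `load_layer_config` concrete, here is a small sketch that writes a sample layer config into a temporary folder and resolves one entry the same way `deployed_path` does. All layer names, paths, and the `demo_app` module are hypothetical, and the JSON keys simply follow the `LayerConfig` fields listed in the script; the real metadata file is generated by the build process, which this sketch does not attempt to reproduce.

```python
import json
import tempfile
from os.path import abspath
from pathlib import Path

# Hypothetical metadata for an application layer that borrows a peer runtime
sample_config = {
    "python": "bin/python",
    "py_version": "3.12.7",
    "base_python": "../cpython-3.12/bin/python",
    "site_dir": "lib/python3.12/site-packages",
    "pylib_dirs": ["../framework-x/lib/python3.12/site-packages"],
    "dynlib_dirs": [],
    "launch_module": "demo_app",
}

with tempfile.TemporaryDirectory() as tmp:
    layer_path = Path(abspath(tmp)) / "app-demo"
    config_path = layer_path / "share/venv/metadata/venvstacks_layer.json"
    config_path.parent.mkdir(parents=True)
    config_path.write_text(json.dumps(sample_config), encoding="utf-8")

    loaded = json.loads(config_path.read_text(encoding="utf-8"))
    # abspath() collapses ".." lexically and never follows symlinks
    base_python_path = Path(abspath(layer_path / loaded["base_python"]))
    expected_path = layer_path.parent / "cpython-3.12" / "bin" / "python"
    paths_match = base_python_path == expected_path

print(base_python_path)
```

Resolving lexically rather than with `Path.resolve()` keeps the result inside the deployed folder layout even when the base runtime's executable is itself a symlink.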
4 changes: 4 additions & 0 deletions src/venvstacks/_util.py
@@ -125,3 +125,7 @@ def run_python_command(
     result = run_python_command_unchecked(command, text=True, **kwds)
     result.check_returncode()
     return result
+
+
+def capture_python_output(command: list[str]) -> subprocess.CompletedProcess[str]:
+    return run_python_command(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
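
For readers without the rest of `_util.py` at hand, the new helper behaves roughly like the standalone sketch below. This is an approximation built directly on `subprocess.run`: `capture_output_sketch` is a hypothetical name, and whatever additional setup `run_python_command_unchecked` performs is not shown in this diff and is not modelled here.

```python
import subprocess
import sys


def capture_output_sketch(command: list[str]) -> subprocess.CompletedProcess[str]:
    """Run a command, capture both output streams as text, raise on failure."""
    return subprocess.run(
        command,
        text=True,
        check=True,  # comparable to calling result.check_returncode()
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


result = capture_output_sketch([sys.executable, "-c", "print('layer ok')"])
print(result.stdout.strip())
```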
80 changes: 18 additions & 62 deletions src/venvstacks/pack_venv.py
@@ -43,61 +43,9 @@
 from pathlib import Path
 from typing import cast, Any, Callable, TextIO

+from ._injected import postinstall as _default_postinstall
 from ._util import as_normalized_path, StrPath, WINDOWS_BUILD as _WINDOWS_BUILD

-_PRECOMPILATION_COMMANDS = """\
-# Precompile Python library modules
-from compileall import compile_dir
-venv_pylib_path = venv_path / "lib" # "Lib" on Windows, but Windows is not case sensitive
-compile_dir(venv_pylib_path, optimize=0, quiet=True)
-"""
-
-_BASE_RUNTIME_POST_INSTALL_SCRIPT = (
-    '''\
-"""Base runtime post-installation script
-* Precompiles all Python files in the library folder
-This post-installation script is automatically injected when packing environments that
-do NOT include a `pyvenv.cfg` file (i.e. base runtime environments)
-"""
-from pathlib import Path
-venv_path = Path(__file__).parent
-'''
-    + _PRECOMPILATION_COMMANDS
-)
-
-_LAYERED_ENV_POST_INSTALL_SCRIPT = (
-    '''\
-"""Layered environment post-installation script
-* Generates pyvenv.cfg based on the Python runtime executing this script
-* Precompiles all Python files in the library folder
-This post-installation script is automatically injected when packing environments that
-would otherwise include a `pyvenv.cfg` file (as `pyvenv.cfg` files are not relocatable)
-"""
-from pathlib import Path
-venv_path = Path(__file__).parent
-# Generate `pyvenv.cfg` based on the deployed runtime location
-import sys
-venv_config_path = venv_path / "pyvenv.cfg"
-runtime_executable_path = Path(sys.executable).resolve()
-runtime_version = ".".join(map(str, sys.version_info[:3]))
-venv_config = f"""\
-home = {runtime_executable_path.parent}
-include-system-site-packages = false
-version = {runtime_version}
-executable = {runtime_executable_path}
-"""
-venv_config_path.write_text(venv_config, encoding="utf-8")
-'''
-    + _PRECOMPILATION_COMMANDS
-)
-
 SymlinkInfo = tuple[Path, Path]


@@ -165,18 +113,20 @@ def get_archive_path(archive_base_name: StrPath) -> Path:


 def _inject_postinstall_script(
-    env_path: Path, script_name: str = "postinstall.py"
+    env_path: Path,
+    script_name: str = "postinstall.py",
+    script_source: StrPath | None = None,
 ) -> Path:
     venv_config_path = env_path / "pyvenv.cfg"
     if venv_config_path.exists():
         # The venv config contains absolute paths referencing the base runtime environment
         # Remove it here, let the post-install script recreate it
         venv_config_path.unlink()
-        script_contents = _LAYERED_ENV_POST_INSTALL_SCRIPT
-    else:
-        script_contents = _BASE_RUNTIME_POST_INSTALL_SCRIPT
+    if script_source is None:
+        # Nothing specified, inject the default postinstall script
+        script_source = _default_postinstall.__file__
     script_path = env_path / script_name
-    script_path.write_text(script_contents, encoding="utf-8")
+    shutil.copy2(script_source, script_path)
     return script_path


@@ -201,7 +151,9 @@ def export_venv(
     """Export the given build environment, skipping archive creation and unpacking
     * injects a suitable `postinstall.py` script for the environment being exported
-    * excludes __pycache__ folders and package metadata RECORD files
+    * excludes __pycache__ folders (for consistency with archive publication)
+    * excludes package metadata RECORD files (for consistency with archive publication)
+    * excludes `sitecustomize.py` files (generated by the post-installation script)
     * replaces symlinks with copies on Windows or if the target doesn't support symlinks
     If supplied, *run_postinstall* is called with the path to the environment's Python
@@ -213,7 +165,7 @@ def export_venv(
     """
     source_path = as_normalized_path(source_dir)
     target_path = as_normalized_path(target_dir)
-    excluded = shutil.ignore_patterns("__pycache__", "RECORD")
+    excluded = shutil.ignore_patterns("__pycache__", "RECORD", "sitecustomize.py")
     # Avoid symlinks on Windows, as they need elevated privileges to create
     # Also avoid them if the target folder doesn't support symlink creation
     # (that way exports to FAT/FAT32/VFAT file systems should work, even if
@@ -247,9 +199,13 @@ def create_archive(
     * injects a suitable `postinstall.py` script for the environment being archived
     * always creates zipfile archives on Windows and xztar archives elsewhere
-    * excludes __pycache__ folders and package metadata RECORD files
+    * excludes __pycache__ folders (to reduce archive size and improve reproducibility)
+    * excludes package metadata RECORD files (to improve reproducibility)
+    * excludes `sitecustomize.py` files (generated by the post-installation script)
     * replaces symlinks with copies on Windows and allows external symlinks elsewhere
-    * discards owner and group information for tar archives
+    * discards tar entry owner and group information
+    * clears tar entry high mode bits (setuid, setgid, sticky)
+    * clears tar entry group/other write mode bits
     * clamps mtime of archived files to the given clamp mtime at the latest
     * shows progress reporting by default (archiving built ML/AI libs is *slooooow*)
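
The effect of the extended exclusion list can be checked in isolation. The sketch below assumes the export is performed with `shutil.copytree` (the call site is not visible in these hunks) and uses an invented directory layout; only the `shutil.ignore_patterns` arguments are taken from the diff.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical build environment layout
    source_dir = Path(tmp) / "build-env"
    site_dir = source_dir / "lib" / "site-packages"
    dist_info = site_dir / "demo-1.0.dist-info"
    (site_dir / "__pycache__").mkdir(parents=True)
    dist_info.mkdir()
    (site_dir / "__pycache__" / "demo.cpython-312.pyc").write_bytes(b"")
    (site_dir / "demo.py").write_text("VALUE = 1\n", encoding="utf-8")
    (site_dir / "sitecustomize.py").write_text("# build-time paths\n", encoding="utf-8")
    (dist_info / "RECORD").write_text("", encoding="utf-8")
    (dist_info / "METADATA").write_text("Name: demo\n", encoding="utf-8")

    target_dir = Path(tmp) / "exported-env"
    excluded = shutil.ignore_patterns("__pycache__", "RECORD", "sitecustomize.py")
    shutil.copytree(source_dir, target_dir, ignore=excluded)

    # Collect the files that survived the export, relative to the target
    exported = sorted(
        p.relative_to(target_dir).as_posix()
        for p in target_dir.rglob("*")
        if p.is_file()
    )

print(exported)
```

Dropping the build-time `sitecustomize.py` matters because it embeds absolute paths from the build machine; the post-installation script regenerates it for the deployed location.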