docs: tidy up dev docs (#1080)
jnussbaum authored Aug 7, 2024
1 parent 231a6b7 commit eeaccc6
Showing 6 changed files with 32 additions and 161 deletions.
79 changes: 0 additions & 79 deletions docs/developers/github-actions.md

This file was deleted.

59 changes: 20 additions & 39 deletions docs/developers/packaging.md
@@ -18,52 +18,33 @@ these processes. The [Python Packaging User Guide](https://packaging.python.org)
DSP-TOOLS uses [poetry](https://python-poetry.org) for all of these tasks. This allows us to use one single tool
for all processes, and to keep the number of configuration files at a minimum.

There are many configuration and metadata files that can be found on the top level of a Python repository. The ones
used in the DSP-TOOLS repository are:

| File | Purpose |
| -------------- | ------------------------------------------------------------------------------ |
| README.md | Markdown-formatted info for developers |
| pyproject.toml | Modern configuration/metadata file replacing the deprecated files listed below |
| .gitignore | List of files not under version control (won't be uploaded to GitHub) |
| CHANGELOG.md | Markdown-formatted release notes (must not be edited by hand) |
| LICENSE | Text file with the license how to use the source code of DSP-TOOLS |
| poetry.lock | Pinned versions of all (sub-)dependencies, allows a deterministic installation |
| mkdocs.yml | Configuration of `mkdocs`, used to build the documentation webpages |

In earlier times, there were some more configuration files, but thanks to poetry, they are not necessary anymore:

| Deprecated file | Purpose | Replaced by |
| -------------------- | --------------------------------------------------- | ------------------------------------------------------ |
| MANIFEST.in | files to include into distribution | pyproject.toml: `[tool.poetry.include]` |
| setup.py | project metadata, dependencies | pyproject.toml |
| setup.cfg | configuration for setuptools | pyproject.toml |
| requirements.txt | all (sub-)dependencies | pyproject.toml: `[tool.poetry.dependencies]` |
| dev-requirements.txt | additional dependencies for development | pyproject.toml: `[tool.poetry.group.dev.dependencies]` |
| Pipfile | direct dependencies | pyproject.toml: `[tool.poetry.dependencies]` |
| Pipfile.lock | pinned dependencies | poetry.lock |
| Makefile | commands that can be executed with `make [command]` | pyproject.toml: `[tool.poetry-exec-plugin.commands]` |
Poetry needs only 2 config files:

- `pyproject.toml`: Manifest file, a modern configuration/metadata file replacing the deprecated files listed below
- `poetry.lock`: Lock file, containing the pinned versions of all (sub-)dependencies, allowing a deterministic installation


## Dependency Management

The classic way to manage the dependencies was to write the required packages by hand into a `requirements.txt` and
into a `setup.py` file.
into a `setup.py` file. But this is cumbersome and error-prone.
Moreover, `setup.py` is problematic and not recommended anymore, especially
[calling `setup.py sdist bdist_wheel`](https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html#summary).
Python projects should define their dependencies and metadata in the modern `pyproject.toml` file.

But this is cumbersome and error-prone, so there was a time when [pipenv](https://pipenv.pypa.io/en/latest/) was the
way to go: Pipenv introduced the important distinction between (a) dependencies necessary to run the application,
(b) dependencies necessary for development, and (c) sub-dependencies, i.e. dependencies of your dependencies. Another
useful concept of pipenv is the distinction between a human-friendly list of (mostly unpinned) direct dependencies and
a machine-friendly definition of exact (pinned) versions of all dependencies.
But since pipenv has no packaging functionality, it was necessary to sync the dependency definitions from `Pipfile` to
`requirements.txt` and `setup.py`.
So it is necessary to dynamically manage the dependencies in `pyproject.toml`.
And poetry seems to be the only tool capable of doing this.

`setup.py`, too, is problematic, especially
[calling `setup.py sdist bdist_wheel`](https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html#summary).
Python projects should define their dependencies and metadata in the modern `pyproject.toml` file. So it is
necessary to dynamically manage the dependencies in `pyproject.toml`. And poetry seems to be the only tool capable
of doing this.
Poetry is one of the few tools that cleanly distinguishes

- dependencies necessary to run the application,
- dependencies necessary for development, and
- sub-dependencies, i.e. dependencies of your dependencies.

It is also one of the few tools that makes the distinction between

- the manifest file, i.e. a human-friendly list of (mostly unpinned) direct dependencies and
- the lock file, i.e. a machine-friendly definition of exact (pinned) versions of all dependencies.


## Packaging
@@ -93,7 +74,7 @@ package. Since `site-packages` is on `sys.path`, the user can then import the pa
Putting all packages into a `src` folder has an important consequence: It forces the developer to work with an
editable installation of his package. Why? Without an editable installation, it is impossible to write correct import
statements. `from src.package import module` will not work, because the user has `package` installed, not `src`. And
relative imports like `import module` will not work either, because when the tests code (situated in a separate
relative imports like `import module` will not work either, because when the test code (situated in a separate
`test` folder) imports the actual code, the relative imports in the actual code fail. This is because relative imports
depend on the location of the file that is run, not on the file that contains the import statement.
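
The import behaviour described in this paragraph can be reproduced with a small experiment. The following is a minimal sketch, not part of DSP-TOOLS: the names `demo_pkg`, `demo_helper`, `main.py` and `run_test.py` are hypothetical, and the manual `sys.path.insert` merely stands in for an editable installation. It builds a throwaway `src` layout, then runs the module once directly and once through a script in a separate `test` folder.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    pkg = root / "src" / "demo_pkg"  # hypothetical package in a src layout
    pkg.mkdir(parents=True)
    (root / "test").mkdir()

    (pkg / "__init__.py").write_text("")
    (pkg / "demo_helper.py").write_text("VALUE = 42\n")
    # main.py imports its sibling by bare name; Python resolves this via sys.path
    (pkg / "main.py").write_text("import demo_helper\nprint(demo_helper.VALUE)\n")
    # The test script makes src/ importable by hand (assumption: stands in for an editable install)
    (root / "test" / "run_test.py").write_text(
        f"import sys\nsys.path.insert(0, {str(root / 'src')!r})\nimport demo_pkg.main\n"
    )

    def run(script: Path) -> subprocess.CompletedProcess:
        """Run a script in a fresh interpreter and capture its output."""
        return subprocess.run([sys.executable, str(script)], capture_output=True, text=True)

    # Run directly: the script's own directory (src/demo_pkg) is put on sys.path
    direct = run(pkg / "main.py")
    # Run via test/: the script directory is test/, so demo_helper cannot be found
    via_test = run(root / "test" / "run_test.py")

print("run directly:", direct.returncode, direct.stdout.strip())
print("imported from test/:", via_test.returncode, via_test.stderr.strip().splitlines()[-1])
```

With an editable installation, `main.py` could instead use `from demo_pkg import demo_helper`, which does not depend on which file is run.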

23 changes: 0 additions & 23 deletions docs/developers/start-stack-command.md

This file was deleted.

23 changes: 11 additions & 12 deletions docs/developers/user-data.md
@@ -6,17 +6,16 @@ DSP-TOOLS saves user data in the user's home directory,
in the folder `.dsp-tools`.
Here is an overview of its structure:

| file/folder | command using it | description |
| :------------------------- | :--------------- | :----------------------------------------------------------------------------------------- |
| xmluploads | `xmlupload` | saves id2iri mappings and error reports |
| docker | `start-stack` | files necessary to startup Docker containers |
| rosetta | `rosetta` | a clone of [the rosetta test project](https://github.com/dasch-swiss/082e-rosetta-scripts) |
| logging.log, logging.log.1 | several ones | These two grow up to 3 MB, then the oldest entries are deleted |
| file/folder | command using it | description |
| :---------------------------------- | :--------------- | :----------------------------------------------------------------------------------------- |
| xmluploads/(server)/resumable/*.pkl | `xmlupload` | Upload state of interrupted xmluploads |
| start-stack | `start-stack` | files necessary to startup Docker containers (*) |
| rosetta | `rosetta` | a clone of [the rosetta test project](https://github.com/dasch-swiss/082e-rosetta-scripts) |
| logging.log, logging.log.1, ... | several ones | These two grow up to a predefined size, then the oldest log files are deleted |


Remark: Docker is normally not able to access files
stored in the `site-packages` of a Python installation.
Therefore, it's necessary to copy the "docker" folder
(*) Docker is normally not able to access files stored in the `site-packages` of a Python installation.
Therefore, it's necessary to copy the distributed `src/dsp_tools/resources/start-stack/` folder
to the user's home directory.
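
As a rough illustration of such a copy step, here is a minimal sketch under stated assumptions; it is not the actual DSP-TOOLS implementation. The helper `copy_packaged_folder`, the package `demo_pkg`, the file `docker-compose.yml` and the temporary "home" directory are all hypothetical stand-ins. It locates a folder shipped inside an installed package and copies it to a writable location.

```python
import importlib
import importlib.resources
import shutil
import sys
import tempfile
from pathlib import Path


def copy_packaged_folder(package: str, folder: str, target: Path) -> Path:
    """Copy a folder shipped inside `package` to the writable `target` directory."""
    source = importlib.resources.files(package).joinpath(folder)
    # as_file() provides a real filesystem path; for a normally installed
    # package this is simply the folder inside site-packages
    with importlib.resources.as_file(source) as source_path:
        shutil.copytree(source_path, target, dirs_exist_ok=True)
    return target


# Demo: a throwaway package stands in for the installed distribution (hypothetical names)
with tempfile.TemporaryDirectory() as tmp:
    site_packages = Path(tmp) / "site-packages"
    resources = site_packages / "demo_pkg" / "resources" / "start-stack"
    resources.mkdir(parents=True)
    (site_packages / "demo_pkg" / "__init__.py").write_text("")
    (resources / "docker-compose.yml").write_text("services: {}\n")

    sys.path.insert(0, str(site_packages))
    importlib.invalidate_caches()
    try:
        fake_home = Path(tmp) / "home"  # stands in for the user's home directory
        copied = copy_packaged_folder(
            "demo_pkg", "resources/start-stack", fake_home / ".dsp-tools" / "start-stack"
        )
        copied_names = sorted(p.name for p in copied.iterdir())
    finally:
        sys.path.remove(str(site_packages))

print(copied_names)
```

Copying with `dirs_exist_ok=True` lets the step be repeated safely, so files in the home directory are refreshed whenever the packaged folder changes.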


@@ -27,7 +26,7 @@ Accessing non-Python files (aka resources, aka data files)
in the code needs special attention.

Firstly, the build tool must be told to include this folder/files in the distribution.
In our case, this happens in `[tool.poetry.include]` in the `pyproject.toml` file.
In our case, this happens in `[tool.poetry] > include` in the `pyproject.toml` file.

Secondly, when accessing the files on the customer's machine,
the files inside `site-packages` should be read-only
@@ -69,7 +68,7 @@ DSP-TOOLS is called from -
not the directory where the distribution files are situated in.

To circumvent this problem,
it was once common to manipulate a packages `__file__` attribute
it was once common to manipulate a package's `__file__` attribute
in order to find the location of data files:

```python
@@ -79,7 +78,7 @@ with open(data_path) as data_file:
...
```

However, this manipulation isnt compatible with PEP 302-based import hooks,
However, this manipulation isn't compatible with PEP 302-based import hooks,
including importing from zip files and Python Eggs.

**The canonical way is to use [importlib.resources](https://docs.python.org/3/library/importlib.resources.html):**
2 changes: 0 additions & 2 deletions mkdocs.yml
@@ -23,9 +23,7 @@ nav:
- Information for developers:
- Developers documentation: developers/index.md
- Dependencies, packaging & distribution: developers/packaging.md
- Maintaining the start-stack command: developers/start-stack-command.md
- User data: developers/user-data.md
- GitHub actions: developers/github-actions.md
- MkDocs and markdown-link-validator: developers/mkdocs.md
- Code quality tools:
- Overview: developers/code-quality-tools/code-quality-tools.md
7 changes: 1 addition & 6 deletions pyproject.toml
@@ -79,12 +79,7 @@ dsp-tools = "dsp_tools.cli.entry_point:main" # definition of the CLI entry poin
[tool.poetry-exec-plugin.commands]
# plugin (https://github.com/keattang/poetry-exec-plugin) to define commands available for the developers,
# e.g. `poetry exec check-links`
check-links = """
markdown-link-validator ./docs \
-i \\.\\/assets\\/.+ \
-i .+github\\.com\\/dasch\\-swiss\\/dsp-tools\\/settings \
-i .+github\\.com\\/dasch\\-swiss\\/ops-deploy\\/.+\
"""
check-links = "markdown-link-validator ./docs -i \\.\\/assets\\/.+"
darglint = """
find . -name "*.py" \
-not -path "./src/dsp_tools/commands/project/models/*" \