Merge branch 'develop' into feature/init-command
nheeb committed Sep 3, 2024
2 parents bddbb27 + 0e73ac9 commit ec62b74
Showing 19 changed files with 381 additions and 111 deletions.
31 changes: 0 additions & 31 deletions .github/workflows/hermes-zenodo-sandbox.yml

This file was deleted.

151 changes: 151 additions & 0 deletions .github/workflows/hermes-zenodo.yml
@@ -0,0 +1,151 @@
# SPDX-FileCopyrightText: 2023 German Aerospace Center (DLR), Helmholtz-Zentrum Dresden-Rossendorf, Forschungszentrum Jülich
#
# SPDX-License-Identifier: CC0-1.0

# SPDX-FileContributor: Stephan Druskat

name: Software publication on Zenodo

on:
  push:
    tags:
      - "*"

  # NOTE: Do not delete the trigger on closed pull requests, the HERMES workflow needs this.
  pull_request:
    types:
      - closed

jobs:
  hermes-prepare:
    name: Prepare Metadata for Curation
    runs-on: ubuntu-latest
    # This condition becomes much easier when we only react to push events on the release branch.
    # We still need to exclude the merge commit push of the post processing PR

    # ADAPT
    # Depending on the event you react to in the 'on:' section above, you will need to adapt this
    # to react on the specific events.
    # NOTE: You will probably still need to keep the exclusion check for commit messages provided by the workflow ('hermes/'/'hermes/post').
    if: >
      github.event_name == 'push' && ! (
        startsWith(github.ref_name, 'hermes/') ||
        contains(github.event.head_commit.message, 'hermes/post')
      )
    permissions:
      contents: write # Allow creation of new branches
      pull-requests: write # Postprocessing should be able to create a pull request with changes

    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - run: pip install hermes hermes-plugin-python
      - run: hermes harvest
      - run: hermes process
      - run: hermes curate

      - run: |
          # Cache current branch for PR close job
          git branch --show-current > .hermes/curate/target_branch
          # Shorten the SHA for the PR title
          SHORT_SHA=$(echo "$GITHUB_SHA" | cut -c -8)
          echo "SHORT_SHA=$SHORT_SHA" >> "$GITHUB_ENV"
          # Create a curation branch
          git switch -c "hermes/curate-$SHORT_SHA"
          git push origin "hermes/curate-$SHORT_SHA"
          # Explicitly add to-be-curated metadata (which is ignored via .gitignore!)
          git add -f .hermes/curate
      - uses: peter-evans/create-pull-request@v5
        with:
          base: hermes/curate-${{ env.SHORT_SHA }}
          branch: hermes/curate-result-${{ env.SHORT_SHA }}
          title: Metadata Curation for Commit ${{ env.SHORT_SHA }}
          body: |
            Please carefully review the attached metadata.
            If you are satisfied with the result, you may merge this PR, which will trigger publication.
            (Any temporary branches will be cleaned up.)
          delete-branch: true

  hermes-curate:
    name: Publish Software with Curated Metadata
    if: github.event.pull_request.merged == true && startsWith( github.base_ref , 'hermes/curate-')

    runs-on: ubuntu-latest
    permissions:
      contents: write # Allow creation of new branches
      pull-requests: write # Postprocessing should be able to create a pull request with changes

    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - run: pip install hermes

      # ADAPT
      # If you want to publish artifacts (e.g., a zipped snapshot of your repository),
      # you can prepare this here.
      - run: git archive --format zip HEAD src > hermes.zip

      # Run the HERMES deposition and postprocessing steps.
      # ADAPT
      # 1. You need to have an authentication token for your target publication platform
      #    as a GitHub secret in your repository (in this example, this is called ZENODO).
      # 2. Adapt the files you want to deposit. In the example, hermes.zip and README.md are deposited alongside the metadata.
      # 3. Check if you want to run with '--initial', as this may potentially create a completely new record (collection),
      #    rather than a new version of the same collection!
      - run: hermes deposit --initial -O invenio_rdm.auth_token ${{ secrets.ZENODO }} --file hermes.zip --file README.md

      # ADAPT
      # Remove this command if you don't want to do any postprocessing
      - run: hermes postprocess

      # ADAPT
      # If you don't want to run postprocessing, remove this complete section (next '-run' and 'uses: peter-evans/...' bullets).
      #
      # Note 1: We change the base branch here for the PR. This flow runs so far within the "curated-metadata-*" branch,
      #         but now we want to add the changes done by deposit and post processing to the branch that was initially
      #         meant to be published using HERMES.
      # Note 2: The create-pull-request action will NOT inherit the commits we did in the previous job. It will only look at the
      #         changes within this local workspace we did *now*.
      - run: echo "TARGET_BRANCH=$(cat .hermes/curate/target_branch)" >> "$GITHUB_ENV"
      - uses: peter-evans/create-pull-request@v5
        with:
          branch: hermes/post-${{ github.run_id }}
          base: ${{ env.TARGET_BRANCH }}
          title: Review hermes post-processing results
          body: |
            This is an automated pull request created by HERMES post-processing.
            Please carefully review the changes and finally merge them into your
      # Delete all the curation branches
      - run: |
          for BRANCH in $(git ls-remote origin 'refs/heads/hermes/curate-*' | cut -f2 | cut -d'/' -f'3-'); do
            git push origin --delete "$BRANCH"
          done
      # TODO: if: failure() --- delete the curation branches when the deposition failed


  hermes-cleanup:
    name: Cleanup aborted curation branches
    if: github.event.pull_request.merged == false && startsWith( github.base_ref , 'hermes/curate-')

    runs-on: ubuntu-latest
    permissions:
      contents: write # Allow creation of new branches
      pull-requests: write # Postprocessing should be able to create a pull request with changes

    steps:
      - uses: actions/checkout@v3
      # Delete all the curation branches
      - run: |
          for BRANCH in $(git ls-remote origin 'refs/heads/hermes/curate-*' | cut -f2 | cut -d'/' -f'3-'); do
            git push origin --delete "$BRANCH"
          done
1 change: 1 addition & 0 deletions .gitignore
@@ -23,3 +23,4 @@ dist/

# HERMES workflow specifics
.hermes/
+hermes.log
13 changes: 11 additions & 2 deletions CITATION.cff
@@ -14,9 +14,9 @@ title: hermes
message: >-
If you use this software, please cite it using the
metadata from this file.
-version: "proof-of-concept"
+version: 0.8.1
license: "Apache-2.0"
-abstract: "Proof-of-concept implementation of the HERMES workflow."
+abstract: "Tool to automate software publication. Not stable yet."
type: software
authors:
- given-names: Michael
@@ -49,3 +49,12 @@ authors:
email: [email protected]
affiliation: Helmholtz-Zentrum Dresden-Rossendorf (HZDR)
orcid: 'https://orcid.org/0000-0002-3145-9880'
+- given-names: Sophie
+family-names: Kernchen
+email: [email protected]
+affiliation: German Aerospace Center (DLR)
+orcid: 'https://orcid.org/0009-0005-4430-6743'
+identifiers:
+- type: doi
+value: 10.5281/zenodo.13221384
+description: Version 0.8.1b1
158 changes: 158 additions & 0 deletions docs/source/tutorials/writing-a-plugin-for-hermes.md
@@ -0,0 +1,158 @@
<!--
SPDX-FileCopyrightText: 2024 German Aerospace Center (DLR)
SPDX-License-Identifier: CC-BY-SA-4.0
-->

<!--
SPDX-FileContributor: Michael Meinel
SPDX-FileContributor: Sophie Kernchen
-->

# Write a plugin for HERMES


This tutorial will present the basic steps for writing an additional harvester.
At the moment only the architecture for harvester plugins is stable.
The full code and structure are available at [hermes-plugin-git](https://github.com/softwarepub/hermes-plugin-git).
This plugin extracts information from the local git history.
It helps to gather contributor and branch metadata.
```{note}
For this tutorial you should be familiar with HERMES.
If you have never used HERMES before, you might want to check the tutorial: [Automated Publication with HERMES](https://docs.software-metadata.pub/en/latest/tutorials/automated-publication-with-ci.html).
```

## Plugin Architecture

HERMES uses a plugin architecture, so users are invited to contribute their own features.
Every plugin follows the same schema.
There is a top-level base class for every plugin. This `HermesPlugin` class has one abstract method, `__call__`, which needs to be overridden.
Furthermore, the `HermesCommand` class provides everything needed to write a plugin that is used in a HERMES command.
The `__call__` method of a `HermesPlugin` receives the instance of the `HermesCommand` that triggered the plugin.
In our case, this will be the `HermesHarvestCommand`, which calls all harvest plugins.
The plugin class also uses a derivative of `HermesSettings` to add parameters that can be adapted in the configuration file.
`HermesSettings` is the base class for command-specific settings.
It uses [pydantic](https://docs.pydantic.dev/latest/) [settings](https://docs.pydantic.dev/latest/api/pydantic_settings/) to specify and validate the parameters.
The user can either set the parameters in the `hermes.toml` or override them on the command line.
To override a parameter from the command line, use the `-O` option followed by the dotted parameter name and the value.
E.g., you can set your authentication token for InvenioRDM by adding the following options to your call to `hermes deposit`:
```shell
hermes deposit -O invenio_rdm.auth_token YourSecretAuthToken
```

## Set Up Plugin
To write a new plugin, it is important to follow the given structure.
This means your plugin's source code contains a pydantic class with the settings and a plugin class that inherits from one base class.
For our specific case, we want to write a git harvest plugin.
Our class structure should look like this:


```{code-block} python
from hermes.commands.harvest.base import HermesHarvestPlugin
from pydantic import BaseModel


class GitHarvestSettings(BaseModel):
    # Plugin-specific parameter; can be overridden in hermes.toml.
    from_branch: str = 'main'


class GitHarvestPlugin(HermesHarvestPlugin):
    settings_class = GitHarvestSettings

    def __call__(self, command):
        print("Hello World!")
        # Harvested data (CodeMeta) and additional information (e.g., provenance).
        return {}, {}
```

The code uses `HermesHarvestPlugin` as the base class and pydantic's `BaseModel` for the settings. In `GitHarvestSettings` you
can see that an additional parameter is defined. The parameter `from_branch` is specific to this plugin and can be accessed inside the plugin using `self.settings.harvest.git.from_branch`, as long as the plugin is named `git`.
In the `hermes.toml`, the corresponding section is `[harvest.{plugin_name}]`.
The `GitHarvestSettings` are associated with the `GitHarvestPlugin`. In the plugin you need to override the `__call__` method.
For now, a simple Hello World will do. The method returns two dictionaries. These will contain the harvested data in CodeMeta (JSON-LD) and additional information, e.g., provenance information.
That is the basic structure of the plugin's source code.
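
To make the two return values more concrete, here is a minimal sketch of how contributor lines could be turned into a CodeMeta-style dictionary and a second dictionary with additional information. The function name `contributors_to_codemeta`, the sample input and the keys of the second dictionary are assumptions made for illustration; they are not taken from hermes-plugin-git.

```python
from collections import Counter


def contributors_to_codemeta(log_lines):
    """Turn 'Name <email>' lines into a CodeMeta-style dict plus extra information.

    Hypothetical helper for illustration only, not part of HERMES.
    """
    # Count how often each author line occurs (one line per commit).
    counts = Counter(line.strip() for line in log_lines if line.strip())
    contributors = []
    for entry, _ in counts.most_common():
        name, _, rest = entry.partition(" <")
        contributors.append({
            "@type": "Person",
            "name": name,
            "email": rest.rstrip(">"),
        })
    data = {"contributor": contributors}
    # Assumed shape of the additional information; adapt to your needs.
    provenance = {"source": "git log", "commits_seen": sum(counts.values())}
    return data, provenance


# Hypothetical input, e.g. the output of: git log --format='%an <%ae>' <branch>
sample = [
    "Ada Example <ada@example.org>",
    "Bob Sample <bob@example.org>",
    "Ada Example <ada@example.org>",
]
data, provenance = contributors_to_codemeta(sample)
```

A `__call__` implementation could return such a pair instead of the two empty dictionaries.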

To integrate this code, you have to register it as a plugin in the `pyproject.toml`. To learn more about the `pyproject.toml`, check https://python-poetry.org/docs/pyproject/ or refer to [PEP621](https://peps.python.org/pep-0621/).
We will only look at the parts that are important for this plugin. There are two ways to integrate it. First, we show how to use the plugin's own environment as the base, with HERMES as a dependency.
Then we show how to integrate the plugin into HERMES itself.

### Include HERMES as Dependency
This is probably the more common way, in which HERMES acts as a framework.
The idea is that your project is the main part. You create the `pyproject.toml` as usual.
In the dependencies block you need to include `hermes`. Then you just have to declare your plugin.
HERMES will look for installed plugins and use them.
The code below shows the important parts of the `pyproject.toml`.
```{code-block} toml
...
[tool.poetry.dependencies]
python = "^3.10"
hermes = "^0.8.0"
...
...
[tool.poetry.plugins."hermes.harvest"]
git = "hermes_plugin_git.harvest:GitHarvestPlugin"
...
```
As you can see, the plugin class from `hermes_plugin_git` is declared as `git` for the `hermes.harvest` entry point.
To use the plugin, you have to adapt the harvest settings in the `hermes.toml`.
We will discuss the exact steps after showing the other `pyproject.toml` configuration.
```{note}
You have to run `poetry install` to add and install all entry points declared in the `pyproject.toml`.
```

### Write Plugin to be included in HERMES
This variant is used to contribute to the HERMES community or to adapt the HERMES workflow for your own purposes.
If you want to contribute, see the [Contribution Guidelines](https://docs.software-metadata.pub/en/latest/dev/contribute.html).
After cloning the HERMES workflow repository, you can adapt the `pyproject.toml`.
The code below shows the important lines.
```{code-block} toml
...
[tool.poetry.dependencies]
...
pydantic-settings = "^2.1.0"
hermes-plugin-git = { git = "https://github.com/softwarepub/hermes-plugin-git.git", branch = "main" }
...
...
[tool.poetry.plugins."hermes.harvest"]
cff = "hermes.commands.harvest.cff:CffHarvestPlugin"
codemeta = "hermes.commands.harvest.codemeta:CodeMetaHarvestPlugin"
git = "hermes_plugin_git.harvest:GitHarvestPlugin"
...
```
In the dependencies, you have to add your plugin. If your plugin is pip-installable, then you can just give its name and version.
If your plugin is in a buildable git repository, you can install it with the expression shown above.
Note that the right form depends on where your plugin is hosted and on your preferences; check [Explicit Package Sources](https://python-poetry.org/docs/repositories/#explicit-package-sources).
The second thing to adapt is the declaration of the entry point for the plugin.
You can do that with `git = "hermes_plugin_git.harvest:GitHarvestPlugin"`.
This expression makes the `GitHarvestPlugin` from the `hermes_plugin_git` package a `hermes.harvest` plugin named `git`.
So you need to adapt this line to the properties of your plugin.
Now you just need to add the plugin to the `hermes.toml` and reinstall the adapted poetry package.

### Configure hermes.toml
To use the plugin, you have to activate it in the `hermes.toml`.
The settings for the plugins are also set there.
For the harvest plugin the `hermes.toml` could look like this:
```{code-block} toml
[harvest]
sources = [ "cff", "git" ] # ordered priority (first one is most important)
[harvest.cff]
enable_validation = false
[harvest.git]
from_branch = "develop"
...
```
In the `[harvest]` section you define that this plugin is used with lower priority than the built-in `cff` plugin.
In the `[harvest.git]` section you set the configuration for the plugin.
At the beginning of this tutorial, we defined the parameter `from_branch` in the git settings with the default `main`. Now we change `from_branch` to `develop`.
With this configuration the plugin will be used. If you run `hermes harvest`, you should see the "Hello World" message.
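
The comment "ordered priority (first one is most important)" can be illustrated with a small sketch. It assumes that each source yields a flat dictionary and that values from earlier sources win on conflict; the function `merge_by_priority` and the sample data are hypothetical, and the actual HERMES processing step may merge metadata differently.

```python
def merge_by_priority(sources, harvested):
    """Merge per-source dicts so that sources listed first take precedence."""
    merged = {}
    # Walk from lowest to highest priority so that earlier sources overwrite later ones.
    for name in reversed(sources):
        merged.update(harvested.get(name, {}))
    return merged


sources = ["cff", "git"]  # same order as in the [harvest] section
harvested = {  # made-up harvest results for illustration
    "cff": {"name": "hermes", "version": "0.8.1"},
    "git": {"version": "unknown", "contributor": ["Ada Example"]},
}
result = merge_by_priority(sources, harvested)
```

Here `result` keeps the `version` from `cff` and only takes `contributor` from `git`, because `cff` is listed first.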
```{admonition} Congratulations!
You can now write plugins for HERMES.
```
To fill the plugin with code, you can check our [hermes-plugin-git](https://github.com/softwarepub/hermes-plugin-git) repository.
It contains the code that inspects the local git history and extracts the contributors of the given branch.
If you have any questions, wishes or requests, feel free to contact us.
8 changes: 3 additions & 5 deletions hermes.toml
@@ -3,18 +3,16 @@
# SPDX-License-Identifier: CC0-1.0

[harvest]
-sources = [ "cff" ] # ordered priority (first one is most important)
-
-[harvest.cff]
-enable_validation = false
+sources = [ "cff", "toml" ] # ordered priority (first one is most important)

[deposit]
target = "invenio_rdm"

[deposit.invenio_rdm]
-site_url = "https://sandbox.zenodo.org"
+site_url = "https://zenodo.org"
communities = []
access_right = "open"
+record_id = 13221384

[deposit.invenio_rdm.api_paths]
depositions = "api/deposit/depositions"