Merge pull request #327 from BU-ISCIII/develop
Release 1.2.0
Shettland authored Oct 11, 2024
2 parents 9ec59c7 + 6ef3a10 commit 9acb6cf
Showing 26 changed files with 1,190 additions and 426 deletions.
39 changes: 0 additions & 39 deletions .github/workflows/pypi_publish.yml
@@ -47,42 +47,3 @@ jobs:
path: dist/
- name: Publish to PyPI
uses: pypa/gh-action-pypi-publish@release/v1

github-release:
name: Sign dist with Sigstore and upload to GitHub Release
needs:
- publish-to-pypi
runs-on: ubuntu-latest
permissions:
contents: write
id-token: write
steps:
- name: Download all the dists
uses: actions/download-artifact@v4
with:
name: python-package-distributions
path: dist/
- name: Sign the dists with Sigstore
uses: sigstore/[email protected]
with:
inputs: >-
./dist/*.tar.gz
./dist/*.whl
- name: Create GitHub Release
env:
GITHUB_TOKEN: ${{ github.token }}
run: >-
gh release create
'${{ github.ref_name }}'
--repo '${{ github.repository }}'
--notes ""
- name: Upload artifact signatures to GitHub Release
env:
GITHUB_TOKEN: ${{ github.token }}
# Upload to GitHub Release using the `gh` CLI.
# `dist/` contains the built packages, and the
# sigstore-produced signatures and certificates.
run: >-
gh release upload
'${{ github.ref_name }}' dist/**
--repo '${{ github.repository }}'
21 changes: 21 additions & 0 deletions .github/workflows/python_lint.yml
@@ -21,8 +21,19 @@ jobs:
uses: actions/checkout@master
- name: Install flake8
run: pip install flake8
- name: Check for Python file changes
id: file_check
uses: tj-actions/changed-files@v44
with:
sha: ${{ github.event.pull_request.head.sha }}
files: |
**.py
- name: Run flake8
if: steps.file_check.outputs.any_changed == 'true'
run: flake8 --ignore E501,W503,E203,W605
- name: No Python files changed
if: steps.file_check.outputs.any_changed != 'true'
run: echo "No Python files have been changed."

black_lint:
runs-on: ubuntu-latest
@@ -31,5 +42,15 @@ jobs:
uses: actions/checkout@v2
- name: Install black in jupyter
run: pip install black[jupyter]
- name: Check for Python file changes
id: file_check
uses: tj-actions/changed-files@v44
with:
sha: ${{ github.event.pull_request.head.sha }}
files: '**.py'
- name: Check code lints with Black
if: steps.file_check.outputs.any_changed == 'true'
uses: psf/black@stable
- name: No Python files changed
if: steps.file_check.outputs.any_changed != 'true'
run: echo "No Python files have been changed."
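As a rough illustration of the gating logic in the lint jobs above (not of how `tj-actions/changed-files` is implemented), the sketch below decides from a list of changed paths whether linting should run. The function name `any_python_changed` and the use of `fnmatch` are assumptions made for this example only.

```python
from fnmatch import fnmatch


def any_python_changed(changed_files):
    """Return True when at least one changed path ends in .py.

    fnmatch's "*" also matches path separators, so nested paths
    such as "pkg/module.py" match the "*.py" pattern.
    """
    return any(fnmatch(path, "*.py") for path in changed_files)


if __name__ == "__main__":
    changed = ["README.md", "relecov_tools/__main__.py"]
    if any_python_changed(changed):
        print("Python files changed: run flake8 and black")
    else:
        print("No Python files have been changed.")
```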
23 changes: 7 additions & 16 deletions .github/workflows/test_sftp_handle.yml
@@ -1,12 +1,14 @@
name: test_sftp_handle

on:
push:
branches: "**"
pull_request_target:
types: [opened, reopened, synchronize, closed]
types: [opened, reopened, synchronize]
branches: "**"


concurrency:
group: ${{ github.repository }}-test_sftp_handle
cancel-in-progress: false

jobs:
security_check:
runs-on: ubuntu-latest
@@ -24,21 +26,10 @@ jobs:
echo "Current permission level is ${{ steps.checkAccess.outputs.user-permission }}"
echo "Job originally triggered by ${{ github.actor }}"
exit 1
sleep_to_ensure_concurrency:
needs: security_check
runs-on: ubuntu-latest
steps:
- name:
run: sleep 10s
shell: bash
test_sftp_handle:
needs: [security_check, sleep_to_ensure_concurrency]
needs: security_check
if: github.repository_owner == 'BU-ISCIII'
concurrency:
group: ${{ github.repository }}-test_sftp_handle
cancel-in-progress: false
runs-on: ubuntu-latest
strategy:
max-parallel: 1
48 changes: 45 additions & 3 deletions CHANGELOG.md
@@ -4,29 +4,71 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.X.Xdev] - 2024-XX-XX : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.X.X
## [1.2.0] - 2024-10-11 : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.2.0

### Credits

Code contributions to the hotfix:
Code contributions to the release:

- [Juan Ledesma](https://github.com/juanledesma78)
- [Pablo Mata](https://github.com/Shettland)
- [Sergio Olmos](https://github.com/OPSergio)

### Modules

- Included wrapper module to launch download, read-lab-metadata and validate processes sequentially [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
- Renamed launch-pipeline to pipeline-manager when tools are used via CLI [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)

#### Added enhancements

- Now also check for gzip file integrity after download. Moved cleaning process to end of workflow [#313](https://github.com/BU-ISCIII/relecov-tools/pull/313)
- Introduced a decorator in sftp_client.py to reconnect when connection is lost [#313](https://github.com/BU-ISCIII/relecov-tools/pull/313)
- Add Hospital Universitari Doctor Josep Trueta to laboratory_address.json [#316](https://github.com/BU-ISCIII/relecov-tools/pull/316)
- samples_data json file is no longer mandatory as input in read-lab-metadata [#314](https://github.com/BU-ISCIII/relecov-tools/pull/314)
- Included handling of alternative column names to support two distinct headers using the same schema in read-lab-metadata [#314](https://github.com/BU-ISCIII/relecov-tools/pull/314)
- Included a new hospital (Hospital Universitario Araba) to laboratory_address.json [#315](https://github.com/BU-ISCIII/relecov-tools/pull/315)
- More accurate cleaning process, skipping only sequencing files instead of whole folder [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
- Now single logs summaries are also created for each folder during download [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
- Introduced handling for missing/dup files and more accurate information in prompt for pipeline_manager [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
- Included excel resize, brackets removal in messages and handled exceptions in log_summary.py [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
- Included processed batches and samples in read-bioinfo-metadata log summary [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
- When no samples_data.json is given, read-lab-metadata now creates a new one [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
- Handling for missing sample ids in read-lab-metadata [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
- Better logging for download, read-lab-metadata and wrapper [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
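The gzip integrity check mentioned in the list above could look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual code; the function name `gzip_is_intact` and the chunk size are assumptions.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` decompresses to the end.

    The file is read in chunks so large sequencing files are never
    held in memory at once.
    """
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile (bad header or CRC mismatch),
        # EOFError covers truncated files, zlib.error covers corrupt data.
        return False
    return True
```

Reading the whole stream is what triggers the CRC and length checks stored in the gzip trailer, so a file that was cut short during transfer is reported as not intact.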

#### Fixes

- Fixed wrong city name in relecov_tools/conf/laboratory_address.json [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
- Fixed wrong single-paired layout detection in metadata caused by capital letters [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
- Error handling in merge_logs() and create_logs_excel() methods for log_summary.py [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
- Included handling of multiple empty rows in metadata xlsx file [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)

#### Changed

- Renamed and refactored "bioinfo_lab_heading" to "alt_header_equivalences" in configuration.json [#314](https://github.com/BU-ISCIII/relecov-tools/pull/314)
- Included a few schema fields that were missing or outdated, related to bioinformatics results [#314](https://github.com/BU-ISCIII/relecov-tools/pull/314)
- Updated metadata excel template, moved to relecov_tools/assets [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
- Now python lint only triggers when PR includes python files [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
- Moved concurrency to whole workflow instead of each step in test_sftp-handle.yml [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
- Updated test_sftp-handle.yml testing datasets [#320](https://github.com/BU-ISCIII/relecov-tools/pull/320)
- Now download skips folders containing "invalid_samples" in their names [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
- read-lab-metadata: Some warnings now include label. Also removed trailing spaces [#322](https://github.com/BU-ISCIII/relecov-tools/pull/322)
- Renamed launch-pipeline to pipeline-manager and updated keys in configuration.json [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
- Pipeline manager now splits data based on enrichment_panel and version. One folder for each group [#324](https://github.com/BU-ISCIII/relecov-tools/pull/324)
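The splitting of data by enrichment panel and version described in the list above can be pictured with a small grouping sketch. The field names `enrichment_panel`, `enrichment_panel_version` and `sample_id`, and the function `group_by_panel`, are assumptions for illustration and may not match the keys used by pipeline_manager.

```python
from collections import defaultdict


def group_by_panel(samples):
    """Group sample ids by (enrichment panel, panel version).

    Each resulting group would correspond to one output folder.
    """
    groups = defaultdict(list)
    for sample in samples:
        key = (sample["enrichment_panel"], sample["enrichment_panel_version"])
        groups[key].append(sample["sample_id"])
    return dict(groups)
```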

#### Removed

- Removed duplicated tests with pushes after PR was merged in test_sftp-handle [#312](https://github.com/BU-ISCIII/relecov-tools/pull/312)
- Deleted deprecated auto-release in pypi_publish as it does not work with tag pushes anymore [#312](https://github.com/BU-ISCIII/relecov-tools/pull/312)
- Removed first sleep time for reconnection decorator in sftp_client.py, sleep time now increases in the second attempt [#321](https://github.com/BU-ISCIII/relecov-tools/pull/321)
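A reconnection decorator with the behaviour described in these entries (no wait before the first retry, longer waits afterwards) might be sketched as follows. This is not the code in sftp_client.py: the decorator name, its parameters and the `reconnect()` method on the client are all hypothetical.

```python
import functools
import time


def reconnect_on_failure(max_attempts=3, base_sleep=1.0, exceptions=(OSError,)):
    """Retry a client method after reconnecting when the connection drops."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(self, *args, **kwargs)
                except exceptions:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: surface the original error
                    if attempt:
                        # No sleep before the first retry; later retries
                        # wait base_sleep * attempt seconds.
                        time.sleep(base_sleep * attempt)
                    self.reconnect()  # hypothetical method on the client

        return wrapper

    return decorator
```

Any class exposing a `reconnect()` method can apply it to its methods, for example `@reconnect_on_failure(max_attempts=3)` on a method that lists a remote directory.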

### Requirements

## [1.1.0] - 2024-09-13 : https://github.com/BU-ISCIII/relecov-tools/releases/tag/1.1.0

### Credits

Code contributions to the hotfix:
Code contributions to the release:

- [Pablo Mata](https://github.com/Shettland)
- [Sara Monzón](https://github.com/saramonzon)
45 changes: 31 additions & 14 deletions README.md
@@ -24,7 +24,8 @@ relecov-tools is a set of helper tools for the assembly of the different element
- [upload-to-ena](#upload-to-ena)
- [upload-to-gisaid](#upload-to-gisaid)
- [update-db](#update-db)
- [launch-pipeline](#launch-pipeline)
- [pipeline-manager](#pipeline-manager)
- [wrapper](#wrapper)
- [logs-to-excel](#logs-to-excel)
- [build-schema](#build-schema)
- [Mandatory Fields](#mandatory-fields)
@@ -63,7 +64,7 @@ $ relecov-tools --help
\ \ / |__ / |__ | |___ | | | \ /
/ / \ | \ | | | | | | \ /
/ |--| | \ |___ |___ |___ |___ |___| \/
RELECOV-tools version 1.1.0
RELECOV-tools version 1.2.0
Usage: relecov-tools [OPTIONS] COMMAND [ARGS]...
Options:
@@ -73,16 +74,19 @@ Options:
--help Show this message and exit.
Commands:
download Download files located in sftp server.
read-lab-metadata Create the json compliant to the relecov schema from...
read-bioinfo-metadata Create the json compliant to the relecov schema with Bioinfo Metadata.
validate Validate json file against schema.
map Convert data between phage plus schema to ENA,...
upload-to-ena parsed data to create xml files to upload to ena
upload-to-gisaid parsed data to create files to upload to gisaid
update-db feed database with metadata jsons
build-schema Generates and updates JSON Schema files from...
launch-pipeline Create the symbolic links for the samples which...
download Download files located in sftp server.
read-lab-metadata Create the json compliant to the relecov schema...
validate Validate json file against schema.
map Convert data between phage plus schema to ENA,...
upload-to-ena parse data to create xml files to upload to ena
upload-to-gisaid parsed data to create files to upload to gisaid
update-db upload the information included in json file to...
read-bioinfo-metadata Create the json compliant from the Bioinfo...
metadata-homogeneizer Parse institution metadata lab to the one used...
pipeline-manager Create the symbolic links for the samples which...
wrapper Execute download, read-lab-metadata and validate...
build-schema Generates and updates JSON Schema files from...
logs-to-excel Creates a merged xlsx report from all the log...
```
#### download
The `download` command connects over a transfer protocol (currently sftp) and downloads all files from the folders available to the given credentials. It also checks that the files in the current folder match the files listed in the metadata file and that each file has an md5sum. If an md5sum is missing, it creates one before storing the file in the final repository.
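The md5sum step can be illustrated with the standard library. This is a minimal sketch under the assumption that checksums are plain hexadecimal MD5 digests; `md5sum` and `checksum_matches` are hypothetical helper names, not functions of relecov-tools.

```python
import hashlib


def md5sum(path, chunk_size=1024 * 1024):
    """Compute the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected):
    """Compare a file against an expected digest, ignoring case and whitespace."""
    return md5sum(path) == expected.strip().lower()
```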
@@ -247,10 +251,10 @@ Usage: relecov-tools upload-to-gisaid [OPTIONS]
-t, --type Select the type of information to upload to database [sample,bioinfodata,variantdata]
-d, --databaseServer Name of the database server receiving the data [iskylims,relecov]

#### launch-pipeline
#### pipeline-manager
Create the folder structure to execute the given pipeline for the latest sample batches after executing download, read-lab-metadata and validate modules. This module will create symbolic links for each sample and generate the necessary files for pipeline execution using the information from validated_BATCH-NAME_DATE.json.
```
Usage: relecov-tools launch-pipeline [OPTIONS]
Usage: relecov-tools pipeline-manager [OPTIONS]
Create the symbolic links for the samples which are validated to prepare for
bioinformatics pipeline execution.
@@ -263,6 +267,19 @@ Options:
--help Show this message and exit.
```

#### wrapper
Execute download, read-lab-metadata and validate sequentially using a config file to fill the arguments for each one. It also creates a global report with all the logs for the three processes in a user-friendly .xlsx format. The config file should include the name of each module that is executed, along with the necessary parameters in YAML format.
```
Usage: relecov-tools wrapper [OPTIONS]
Executes the modules in config file sequentially
Options:
-c, --config_file PATH Path to config file in yaml format [required]
-o, --output_folder PATH Path to folder where global results are saved [required]
--help Show this message and exit.
```
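To make the sequential behaviour concrete, here is a small sketch of how a parsed config could drive the three modules in order. It assumes the YAML file has already been loaded into a dictionary keyed by module name; `run_wrapper`, the `runners` mapping and any parameter names are hypothetical and do not describe the real `relecov_tools.dataprocess_wrapper` API.

```python
MODULE_ORDER = ("download", "read-lab-metadata", "validate")


def run_wrapper(config, runners):
    """Run each module in a fixed order with its parameters from the config.

    `config` maps module names to parameter dictionaries (or None).
    `runners` maps module names to callables accepting those parameters.
    """
    missing = [name for name in MODULE_ORDER if name not in config]
    if missing:
        raise ValueError("Config is missing modules: " + ", ".join(missing))
    results = {}
    for name in MODULE_ORDER:
        params = config[name] or {}
        results[name] = runners[name](**params)
    return results
```

Validating the config before running anything avoids starting a long download only to fail later on a missing section.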

#### logs-to-excel
Creates an xlsx file with all the entries found for a specified laboratory in a given set of log_summary.json files (from log-summary module). The laboratory name must match the name of one of the keys in the provided logs to work.
```
14 changes: 8 additions & 6 deletions relecov_tools/__main__.py
@@ -24,6 +24,7 @@
import relecov_tools.upload_ena_protocol
import relecov_tools.pipeline_manager
import relecov_tools.build_schema
import relecov_tools.dataprocess_wrapper

log = logging.getLogger()

@@ -61,7 +62,7 @@ def run_relecov_tools():
)

# stderr.print("[green] `._,._,'\n", highlight=False)
__version__ = "1.1.0"
__version__ = "1.2.0"
stderr.print(
"\n" "[grey39] RELECOV-tools version {}".format(__version__), highlight=False
)
@@ -476,12 +477,12 @@ def metadata_homogeneizer(institution, directory, output):
help="select the template config file",
)
@click.option("-o", "--output", type=click.Path(), help="select output folder")
def launch_pipeline(input, template, output, config):
def pipeline_manager(input, template, output, config):
"""
Create the symbolic links for the samples which are validated to prepare for
bioinformatics pipeline execution.
"""
new_launch = relecov_tools.pipeline_manager.LaunchPipeline(
new_launch = relecov_tools.pipeline_manager.PipelineManager(
input, template, output, config
)
new_launch.pipeline_exc()
@@ -565,22 +566,23 @@ def logs_to_excel(lab_code, output_folder, files):
logsum = relecov_tools.log_summary.LogSum(output_location=output_folder)
merged_logs = logsum.merge_logs(key_name=lab_code, logs_list=all_logs)
final_logs = logsum.prepare_final_logs(logs=merged_logs)
logsum.create_logs_excel(logs=final_logs)
excel_outpath = os.path.join(output_folder, lab_code + "_logs_report.xlsx")
logsum.create_logs_excel(logs=final_logs, excel_outpath=excel_outpath)


@relecov_tools_cli.command(help_priority=16)
@click.option(
"-c",
"--config_file",
type=click.Path(),
help="Path to config file in yaml format",
help="Path to config file in yaml format [required]",
required=True,
)
@click.option(
"-o",
"--output_folder",
type=click.Path(),
help="Path to the base schema file. This file is used as a reference to compare it with the schema generated using this module. (Default: installed schema in 'relecov-tools/relecov_tools/schema/relecov_schema.json')",
help="Path to folder where global results are saved [required]",
required=False,
)
def wrapper(config_file, output_folder):
Binary file not shown.