Conda recipe #404

Merged 4 commits on Dec 7, 2023
1 change: 1 addition & 0 deletions bin/cmat/VERSION
@@ -0,0 +1 @@
3.0.6.dev4
41 changes: 41 additions & 0 deletions bin/cmat/cmat
@@ -0,0 +1,41 @@
#!/usr/bin/env bash

# This is a wrapper around CMAT for packaging the Conda recipe, based largely on the
# ones in Andries Feder's Cladebreaker (https://github.com/andriesfeder/cladebreaker)
# and Robert A. Petit III's Bactopia (https://bactopia.github.io).

CONDA_ENV=$(which cmat | sed 's=bin/cmat==')
VERSION=$(cat "${CONDA_ENV}/bin/VERSION")
CMAT_NF="${CONDA_ENV}/share/cmat-${VERSION}/pipelines"
MAPPINGS_FILE="${CONDA_ENV}/share/cmat-${VERSION}/mappings/latest_mappings.tsv"

if [[ $# == 0 ]]; then
    echo "ClinVar Mapping and Annotation Toolkit (cmat) - v${VERSION}"
    echo ""
    echo "Available commands (use --help to print usage):"
    echo " * cmat annotate - Annotate ClinVar XML file"
    echo " * cmat generate-curation - Generate term curation spreadsheet"
    echo " * cmat export-curation - Export term curation spreadsheet"
    echo ""
    exit
fi

if [[ "$1" == "version" ]] || [[ "$1" == "--version" ]]; then
    echo "cmat ${VERSION}"
    exit
fi

# All other commands take an optional --mappings arg
# If not present, use the latest mappings file included with CMAT
MAPPINGS_ARG="--mappings ${MAPPINGS_FILE}"
if [[ "$*" == *"--mappings"* ]]; then
    MAPPINGS_ARG=""
fi

if [[ "$1" == "annotate" ]]; then
    nextflow run "${CMAT_NF}/annotation_pipeline.nf" "${@:1}" ${MAPPINGS_ARG}
elif [[ "$1" == "generate-curation" ]]; then
    nextflow run "${CMAT_NF}/generate_curation_spreadsheet.nf" "${@:1}" ${MAPPINGS_ARG}
elif [[ "$1" == "export-curation" ]]; then
    nextflow run "${CMAT_NF}/export_curation_spreadsheet.nf" "${@:1}" ${MAPPINGS_ARG}
fi
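
For illustration, the dispatch and default `--mappings` logic of the wrapper above can be sketched in Python. This is a minimal sketch and not part of the PR: `build_nextflow_command` and its parameters are hypothetical names. It mirrors the wrapper in forwarding every argument (including the subcommand itself, as `"${@:1}"` does) and in appending the bundled mappings file only when `--mappings` does not appear anywhere in the arguments. Unlike the wrapper, which silently does nothing for an unknown subcommand, the sketch raises an error.

```python
from typing import List

# Subcommand -> pipeline file name, copied from the wrapper script.
PIPELINES = {
    "annotate": "annotation_pipeline.nf",
    "generate-curation": "generate_curation_spreadsheet.nf",
    "export-curation": "export_curation_spreadsheet.nf",
}


def build_nextflow_command(args: List[str], pipelines_dir: str, default_mappings: str) -> List[str]:
    """Return the argv the wrapper would hand to nextflow (hypothetical helper)."""
    if not args or args[0] not in PIPELINES:
        # Assumption: fail loudly here; the bash wrapper just falls through.
        raise ValueError("unknown or missing subcommand")
    command = ["nextflow", "run", f"{pipelines_dir}/{PIPELINES[args[0]]}", *args]
    # Same substring test as [[ "$*" == *"--mappings"* ]] in the wrapper.
    if not any("--mappings" in arg for arg in args):
        command += ["--mappings", default_mappings]
    return command
```

Calling it with `["annotate", "--help"]` yields a command ending in `--mappings <default file>`, while any argument containing `--mappings` suppresses the default.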
2 changes: 1 addition & 1 deletion cmat/consequence_prediction/common/biomart.py
@@ -41,7 +41,7 @@ def process_biomart_request(query):
     # If there was an HTTP error, raise an exception. This will be caught by @retry.
     result.raise_for_status()
     # Some errors from BioMart come back as 200 but with an error message in the content.
-    if result.text.lower().startswith('query error'):
+    if result.text.lower().startswith('query error') or result.text.lower().startswith('<html>'):
         raise requests.exceptions.HTTPError(result.text)
     return result.text
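
The check added in this hunk treats a 200 response as an error when its body starts with `query error` or `<html>`. A minimal sketch of that test as a standalone function follows; `looks_like_biomart_error` is a hypothetical name, not something defined in the PR, and it relies on `str.startswith` accepting a tuple of prefixes.

```python
def looks_like_biomart_error(body: str) -> bool:
    """Return True if a 200 response body looks like an error (hypothetical helper).

    Mirrors the two case-insensitive prefix checks in the hunk above.
    """
    return body.lower().startswith(("query error", "<html>"))
```

Factoring the check out this way would let it be unit-tested without making an HTTP request.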

11 changes: 11 additions & 0 deletions conda/build.sh
@@ -0,0 +1,11 @@
#!/bin/bash

$PYTHON -m pip install .

CMAT="${PREFIX}/share/${PKG_NAME}-${PKG_VERSION}"
mkdir -p ${PREFIX}/bin ${CMAT}

chmod 775 bin/cmat/*
cp bin/cmat/* ${PREFIX}/bin

mv bin/ mappings/ pipelines/ ${CMAT}
61 changes: 61 additions & 0 deletions conda/meta.yaml
@@ -0,0 +1,61 @@
{% set version = "3.0.6.dev4" %}
Collaborator commented:

In case changing this manually is a hassle: now that the version lives in the VERSION file, would it be handy to have a very simple Python script, triggered by a GitHub Action, that updates the version in meta.yaml?

For example, something like the following, summarised by our friend GPT:

import re
version_file_path = 'VERSION'
yaml_file_path = 'meta.yaml'

with open(version_file_path, 'r') as file:
    version = file.read().strip()

with open(yaml_file_path, 'r') as file:
    yaml_content = file.read()

new_yaml_content = re.sub(r'{% set version = ".*" %}', f'{{% set version = "{version}" %}}', yaml_content)

with open(yaml_file_path, 'w') as file:
    file.write(new_yaml_content)

Collaborator commented:

Again, only if that is less of a hassle than remembering that the version is also in the YAML file.

Collaborator commented:

Or actually, rather than modifying this YAML file, why not have the other files read meta.yaml instead of VERSION? That way, the version exists in only one place. We could use a regex to find the version in this YAML file and extract it.
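
The regex approach suggested here could look roughly like the following. This is a minimal sketch under the assumption that meta.yaml keeps the `{% set version = "..." %}` line shown at the top of the recipe; `VERSION_PATTERN` and `extract_version` are hypothetical names, not part of the PR.

```python
import re
from typing import Optional

# Matches the Jinja assignment at the top of meta.yaml and captures the quoted value.
VERSION_PATTERN = re.compile(r'\{%\s*set\s+version\s*=\s*"([^"]+)"\s*%\}')


def extract_version(meta_yaml_text: str) -> Optional[str]:
    """Return the version string from meta.yaml content, or None if absent."""
    match = VERSION_PATTERN.search(meta_yaml_text)
    return match.group(1) if match else None
```

A caller such as setup.py would read the file and pass its text in; returning `None` instead of raising leaves the handling of a missing version line to the caller.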

apriltuesday (Contributor, Author) commented on Dec 7, 2023:

I think it's okay for now, but it's something to keep in mind for a future PR (e.g. in #406)... we can decide the direction of information flow (i.e. from VERSION to YAML or vice versa) and how it's triggered there.


package:
  name: cmat
  version: {{ version }}

source:
  url: https://github.com/EBIvariation/CMAT/archive/v{{version}}.tar.gz
  sha256: 29bdeb28674486785c5f5825afa7d1237bd5dd2c76923145d68543f3a6bb5594

build:
  number: 0
  noarch: generic

requirements:
  host:
    - nextflow >=21.10
    - python >=3.8,<3.10 # restriction from biopython
    # From requirements.txt
    - biopython==1.77
    - coverage==6.5.0
    - coveralls==3.3.1
    - jsonschema==3.2.0
    - numpy==1.24.3
    - pandas==1.5.3
    - pytest==7.2.2
    - pytest-cov==2.10.0
    - requests==2.31.0
    - requests-mock==1.8.0
    - retry==0.9.2
  run:
    - nextflow >=21.10.0
    - python >=3.8,<3.10
    - biopython==1.77
    - coverage==6.5.0
    - coveralls==3.3.1
    - jsonschema==3.2.0
    - numpy==1.24.3
    - pandas==1.5.3
    - pytest==7.2.2
    - pytest-cov==2.10.0
    - requests==2.31.0
    - requests-mock==1.8.0
    - retry==0.9.2
Contributor (Author) commented:

Not sure why these requirements have to be present in both host and run sections, but it's the only way I could get it to work...


test:
  imports:
    - cmat
  commands:
    - cmat
    - cmat annotate --help

about:
  home: https://github.com/EBIvariation/CMAT
  summary: ClinVar Mapping and Annotation Toolkit
  license: Apache-2.0
  license_file: LICENSE

extra:
  recipe-maintainers:
    - apriltuesday
4 changes: 3 additions & 1 deletion setup.py
@@ -6,6 +6,8 @@
 # allow setup.py to be run from any path
 os.chdir(os.path.normpath(os.path.join(os.path.abspath(__file__), os.pardir)))

+version = open(os.path.join(os.path.abspath(os.path.dirname(__file__)), 'bin', 'cmat', 'VERSION')).read().strip()
+

 def get_requires():
     requires = []
@@ -27,7 +29,7 @@ def get_requires():
     long_description = fh.read()

 setup(name='cmat',
-      version='3.0.5',
+      version=version,
       author_email='[email protected]',
       url='https://github.com/EBIvariation/CMAT',
       packages=find_packages(),
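
The `open(...).read()` call added to setup.py never closes the file explicitly. A minimal alternative sketch, assuming the same bin/cmat/VERSION layout as in this PR; `read_version` and `project_root` are hypothetical names, not part of the change:

```python
from pathlib import Path


def read_version(project_root: Path) -> str:
    """Read the package version from bin/cmat/VERSION under project_root."""
    version_file = project_root / "bin" / "cmat" / "VERSION"
    # read_text opens and closes the file itself; strip drops the trailing newline.
    return version_file.read_text(encoding="utf-8").strip()
```

In setup.py this could be called as `read_version(Path(__file__).resolve().parent)`.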