chore: trigger release process #4

github-actions · 2024-11-28T16:22:56Z

⚠️ This PR requires a MERGE COMMIT (Don't squash!)

* feat: enable interface with gcp secrets manager * chore: add google-cloud-secret-manager * chore: merge

* feat: credible set quality filtering * fix: purity threshold

Bumps [dbldatagen](https://github.com/databrickslabs/data-generator) from 0.3.5 to 0.4.0. - [Release notes](https://github.com/databrickslabs/data-generator/releases) - [Changelog](https://github.com/databrickslabs/dbldatagen/blob/master/CHANGELOG.md) - [Commits](databrickslabs/dbldatagen@release/v0.3.5...release/v0.4.0) --- updated-dependencies: - dependency-name: dbldatagen dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat(config): typos fix * fix(config): moved ld_index dataset to static assets --------- Co-authored-by: Szymon Szyszkowski <[email protected]>

* feat(dataset): schema mismatch issue * feat(L2GPrediction): schema unification * fix: swapped data types --------- Co-authored-by: Szymon Szyszkowski <[email protected]>

…ets#645) * feat(SusieFineMapperStep): add new function that takes boundaries as input * fix: typo in function

* refactor: remove ot_pics * refactor: gwas_catalog_sumstat_preprocess config removed * refactor: ot_finngen_studies removed * refactor: ot_finngen_studies removed * refactor: window_based_clumping cleanup

…ets#650)

* feat(ld_annotator): apply r2 threshold * feat(ld_annotator): apply r2 threshold * chore(ldannotator): change threshold to 0.5

…argets#644) * feat(studyLocus): adding new locus collection using boundaries * fix: fix in test * Update tests/gentropy/dataset/test_study_locus.py Co-authored-by: Szymon Szyszkowski <[email protected]> * chore: pre-commit auto fixes [...] --------- Co-authored-by: Szymon Szyszkowski <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…targets#651) It is often useful to spin up a Jupyter notebook on a Dataproc cluster that has access to Gentropy and its fully configured environment. I previously added the “enable_component_gateway” option; however, it has no effect unless you also specify the list of components to enable, which is what this PR does.
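A rough sketch of the point made in this commit message, assuming the cluster is described with the field names of the Dataproc `ClusterConfig` API message (`endpoint_config.enable_http_port_access` and `software_config.optional_components`). The helper name `build_cluster_config` is hypothetical and this is not the code changed in the PR; it only illustrates that the gateway flag and the component list are set together.

```python
from typing import Any


def build_cluster_config(components: list[str]) -> dict[str, Any]:
    """Return a partial cluster config (hypothetical helper, illustration only)."""
    return {
        # Component Gateway: exposes the web UIs of the enabled components.
        "endpoint_config": {"enable_http_port_access": True},
        # Without at least one optional component there is no notebook to reach.
        "software_config": {"optional_components": list(components)},
    }


config = build_cluster_config(["JUPYTER"])
print(config["software_config"]["optional_components"])
```

The resulting dictionary would still need the usual machine, network and image settings before it could be submitted as a cluster definition.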

…ets#576) * chore: checkpoint * chore: checkpoint * chore: deprecate spark evaluator * chore: checkpoint * chore: resolve conflicts with dev * chore: resolve conflicts with dev * chore(model): add parameters class property * feat: add module to export model to hub * refactor: make model agnostic of features list * chore: add wandb to gitignore * feat: download model from hub * chore(model): adapt predict method * feat(trainer): add hyperparameter tuning * chore: deprecate trainer tests * refactor: modularise step * feat: download model from hub by default * fix: convert omegaconfig defaults to python objects * fix: write serialised model to disk and then upload to gcs * fix(matrix): drop goldStandardSet when in predict mode * chore: pass token to access private model * chore: pass token to access private model * fix: pass right schema * chore: pre-commit auto fixes [...] * chore: fix mypy issues * build: remove xgboost * chore: merge * chore: pre-commit auto fixes [...] * chore: address comments

) * feat: implement UKB PPP (EUR) ingestion & harmonisation * fix: correct module name for docs * fix: definitely correct module name for docs * test: update output of neglog_pvalue_to_mantissa_and_exponent * fix: test syntax with <BLANKLINE> * Update src/gentropy/datasource/ukb_ppp_eur/summary_stats.py Co-authored-by: Szymon Szyszkowski <[email protected]> * fix: code review updates for docs and version * fix: syntax for concat_ws * style: list harmonisation steps in the docstring * style: rename freq to MAF * style: use concat_ws * style: use two distinct parameters for study index and summary stats output paths --------- Co-authored-by: Szymon Szyszkowski <[email protected]>

Code inspection shows that it is not used anymore.

…entargets#656) Bumps [python-semantic-release/python-semantic-release](https://github.com/python-semantic-release/python-semantic-release) from 9.6.0 to 9.8.3. - [Release notes](https://github.com/python-semantic-release/python-semantic-release/releases) - [Changelog](https://github.com/python-semantic-release/python-semantic-release/blob/master/CHANGELOG.md) - [Commits](python-semantic-release/python-semantic-release@v9.6.0...v9.8.3) --- updated-dependencies: - dependency-name: python-semantic-release/python-semantic-release dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Kirill Tsukanov <[email protected]>

Bumps [pydoclint](https://github.com/jsh9/pydoclint) from 0.4.1 to 0.5.1. - [Release notes](https://github.com/jsh9/pydoclint/releases) - [Changelog](https://github.com/jsh9/pydoclint/blob/main/CHANGELOG.md) - [Commits](jsh9/pydoclint@0.4.1...0.5.1) --- updated-dependencies: - dependency-name: pydoclint dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: David Ochoa <[email protected]>

* fix: remove check merge conflict * fix: remove line for now

…#654)

* feat(variant annotation): new variant annotation schema + logic to extract from VEP * fix: typehints in function * refactor(variant annotation): migrating methods to the new schema * chore: pre-commit auto fixes [...] * refactor(variant index): sorting out new variant index dataset * chore: pre-commit auto fixes [...] * feature(vep): adding predictors to vep transcript object * fix(schema): fixing schema missing fields * fix(schema): fixing schema missing fields * fix(schema): fixing schema missing fields * fix(schema): fixing schema missing fields * chore: pre-commit auto fixes [...] * fix(annotation): array union under condition * fix: merging dbxref objects * feat(variants): updating variants to make more robust * feat: migrating methods to new variant index * adjusting variant index methods * some updates * rename v2g to variant to gene * chore: pre-commit auto fixes [...] * adding test * chore: pre-commit auto fixes [...] * fix(precommit): json file needed to rename to jsonl * fix(precommit): removing steps depending on old data model * fix(conftest): fixing variant index mock generation * fix: typo in package import * fix: sorting out conftest * refactor(gwas ingest): Updating GnomAD handling * refactor(gnomad): variant annotation removed, changed to variant index, steps updated * refactor: shuffling around gnomad logic * fix: references in tests * refactor: sorting out gnomad variant dag * refactor: cleaning configs and tests * docs(vep): adding datasource description * test(vep): adding more test to the vep parser * test(vep): tests are now running * fix: removing version suffix from pyproject and airflow config * fix: reverting DAGs - removing temporary modifications I added for testing * fix: addressing reviewer comments * refactor: fiddling with variant index annotation logic * chore: addressing comments * fix: variant cross-ref snake case * fix: correcting join strategy --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fix: typo in cs_lbf_thr parameter name * fix: removing two parameters --------- Co-authored-by: Yakov Tsepilov <[email protected]>

…ets#668)

As it is, when the susie_finemapper step is triggered, nothing happens because the configuration class is not linked via _target_ to the step class. This commit addresses the problem.
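To illustrate the symptom described in this commit message, here is a minimal stand-alone sketch, assuming `_target_` behaves as in Hydra-style configuration, where it holds the dotted path of the class to construct. The `instantiate` function below is a simplified stand-in written for this example (not Hydra's implementation and not Gentropy code), and `fractions.Fraction` is used only as a convenient importable target.

```python
from importlib import import_module
from typing import Any


def instantiate(config: dict[str, Any]) -> Any:
    """Build the object named by `_target_`, or return the config unchanged."""
    if "_target_" not in config:
        # No link to a class: there is nothing to construct, so nothing runs.
        return config
    kwargs = dict(config)
    module_path, _, class_name = kwargs.pop("_target_").rpartition(".")
    target = getattr(import_module(module_path), class_name)
    return target(**kwargs)


# Same parameters, with and without the link to a class.
unlinked = instantiate({"numerator": 1, "denominator": 2})
linked = instantiate(
    {"_target_": "fractions.Fraction", "numerator": 1, "denominator": 2}
)
print(type(unlinked).__name__, type(linked).__name__)  # dict Fraction
```

In this sketch the unlinked configuration comes back as plain data, which is the "nothing happens" case; adding the `_target_` entry is what turns the same parameters into a constructed object.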

…ts#608) * feat: custom dockerfile to run ensembl vep * ci: automate vep image build and artifact registry * chore: update airflow google operators (not required) * feat: working version of google batch airflow vep job * feat: working version of google batch airflow vep job * feat(VEP): adding CADD plugin * feat: local loftee file * feat: working with input bucket full of input files * feat: prevent writing html * fix: minor adjustments to retry strategy * feat(airflow): separating mounting points for input/output and cache * fix: typo in airflow dag * fix: pre-commit pain * chore: rename airflow dag file --------- Co-authored-by: DSuveges <[email protected]> Co-authored-by: Szymon Szyszkowski <[email protected]>

* feat: locus_breaker_clumping * fix: docstring * feat: _process_locus_breaker function * feat: locus breaker clumping step * fix: tidying parameters * feat: option to remove MHC region * fix: description for LocusBreakerClumpingStep * fix: removing division of distance * fix: adding new parameters for wbc distance separate from large_loci_size * fix: resolving comments * refactor: refactored code in process_locus_breaker_output * fix: removing superfluous variable * fix: persisting sumstats parquet to improve analysis plan --------- Co-authored-by: Yakov <[email protected]>

* feat: add qc step * fix: remove .df * fix: fix in name * fix: fix v3 * Update src/gentropy/sumstat_qc_step.py Co-authored-by: Daniel Suveges <[email protected]> * Update src/gentropy/sumstat_qc_step.py Co-authored-by: Daniel Suveges <[email protected]> * fix: optimisation of code --------- Co-authored-by: Daniel Suveges <[email protected]>

…targets#677) * feat: adding sanity filter to GWASCatalogSumstatsPreprocessStep * fix: adding description

* fix: improving locus_breaker_step logic * fix: updating susie_finemapper.py to deal with new I/O logic * chore: removing unused log output path

…ant index to avoid null genes (opentargets#890)

…ets#895) * feat(feature_matrix): impute values for gene attribute cols + semantic test * fix: change window * chore: fill na in the feature matrix generation step

* feat: adding l2g features to prediction table * fix: renaming method for better name * fix: remove show statement * fix: dropping locusToGeneFeatures if already exist * feat: dropping features with null values from the map

Co-authored-by: Szymon Szyszkowski <[email protected]> Co-authored-by: Daniel Suveges <[email protected]>

…entargets#901)

* fix: fix col names for imputation * fix: fix v1 * fix: test

* fix: tweak sc vs bulk eqtl catalogue logic * fix: update tests * fix: correct coloc calling of method * fix: address PR comments * fix: change string to col * fix: change eqtl catalogue path to specific commit * chore: fix method description

* feat: changing to 99 credible sets * fix: change summary schema * fix: adding purity metrics * fix: updating test data samples * fix: updating test data samples * Update finemapping.py

…s#905)

Co-authored-by: Yakov <[email protected]>

…ntargets#907)

Co-authored-by: Szymon Szyszkowski <[email protected]>

…ital PICS) (opentargets#910) * feat: add OUT_OF_SAMPLE_LD QC flag to PICS credible sets * feat: change pics finemapping method to PICS * test: change pics to PICS in test data * fix: flag studies without sumstats without relying on hasSumstats column * fix: flag studies without sumstats without using update_quality_flag function

* feat(gold_standard): filter by protein coding genes * feat: arbitrary gold standards * feat: read model from gcs * feat: read model from gcs * feat: get untrusted types from blob * revert: changes to gene_index * fix: correct list of missing and unexpected fields * chore: addressing comments * fix: selective check on the schema issues --------- Co-authored-by: Szymon Szyszkowski <[email protected]>

* feat: gzip evidence output to match existing format * docs: added info about compression to docstring --------- Co-authored-by: Szymon Szyszkowski <[email protected]>

…st metric + other fixes (opentargets#913) * feat: mean to max * fix: remove protein coding * fix: adding protein coding * feat(l2g): neighbourhood features are a division between local and regional * feat(l2g): regional max for distance features only consider protein coding genes * fix(coloc_features): regional max for coloc features only consider protein coding genes * fix(vep_features): regional max for vep features only consider protein coding genes * feat(l2g): train and predict based on protein coding genes only * feat: set nbh feature to 1 if features are 0 in the region * feat: set nbh feature to 1 if features are 0 in the region * Revert "feat: set nbh feature to 1 if features are 0 in the region" This reverts commit da145ab. * fix: return nbh features only for protein coding genes + optimisation * test: change expected results based on changes * test: change expected results based on changes * fix: test --------- Co-authored-by: Yakov Tsepilov <[email protected]>

Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 4 to 5. - [Release notes](https://github.com/codecov/codecov-action/releases) - [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md) - [Commits](codecov/codecov-action@v4...v5) --- updated-dependencies: - dependency-name: codecov/codecov-action dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [pytest-cov](https://github.com/pytest-dev/pytest-cov) from 5.0.0 to 6.0.0. - [Changelog](https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst) - [Commits](pytest-dev/pytest-cov@v5.0.0...v6.0.0) --- updated-dependencies: - dependency-name: pytest-cov dependency-type: direct:development update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

updates: - [github.com/astral-sh/ruff-pre-commit: v0.7.1 → v0.7.3](astral-sh/ruff-pre-commit@v0.7.1...v0.7.3) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…ences in transcripts (opentargets#914) * feat: extending the VEP schema * feat(vep parser): adding logic to build variant description based on VEP annotation * fix: remove commented lines * fix: improving consequence to so term mapping * fix: nullified variant descriptions * fix: assessment_flag_column_name type fix * chore: pre-commit auto fixes [...] * feat: adding formatting to distances in description * fix: formatting * fix: variant index schema * fix: conftest for variant index * feat(variant index): normalising assessments of in-silico predictors * feat: adding VEP predictor * fix: variant test config * fix: variant test config * fix: schema type * fix: dropping failing test * fix: variant annotation * fix: gnomad variant index repartition --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* fix: r2 for lead variant is always 1 * fix: removing not needed quality flag * test: removing unused condition * fix: type: ignore --------- Co-authored-by: DSuveges <[email protected]>

* feat: reverting to 95% finngen credible sets * fix: updating tests and column names

…ntargets#921) * feat: changing studylocus validation to 95 percent credible sets * fix: updating comment in code to reflect 95% credset * fix: removing credset number of partitions * fix: flag name --------- Co-authored-by: Yakov Tsepilov <[email protected]>

updates: - [github.com/astral-sh/ruff-pre-commit: v0.7.3 → v0.7.4](astral-sh/ruff-pre-commit@v0.7.3...v0.7.4) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…frequencies (opentargets#929) * fix: gnomad 4.1 frequencies * fix: removing in-silico extraction in gnomad * fix: removing in silico predictor ingestion from gnomad pre-process

…ts#924) * feat(gold_standard): add traitFromSourceMappedId to schema * chore: adapt tests * feat(feature_matrix): consider `traitFromSourceMappedId` a static column * feat(feature_matrix): consider `traitFromSourceMappedId` an optional column

Co-authored-by: Szymon Szyszkowski <[email protected]>

ireneisdoomed and others added 30 commits June 11, 2024 12:33

feat: enable interface with gcp secrets manager (opentargets#635)

ca43fff

* feat: enable interface with gcp secrets manager * chore: add google-cloud-secret-manager * chore: merge

feat: credible set quality filtering (opentargets#640)

45d991c

* feat: credible set quality filtering * fix: purity threshold

feat(config): 24.06 data release fixes (opentargets#639)

3c8ce58

* feat(config): typos fix * fix(config): moved ld_index dataset to static assets --------- Co-authored-by: Szymon Szyszkowski <[email protected]>

fix(L2GPrediction): schema validation (opentargets#642)

7625a79

* feat(dataset): schema mismatch issue * feat(L2GPrediction): schema unification * fix: swapped data types --------- Co-authored-by: Szymon Szyszkowski <[email protected]>

feat(SusieFineMapperStep): add new function with boundaries (opentarg…

79a6cb5

…ets#645) * feat(SusieFineMapperStep): add new function that takes boundaries as input * fix: typo in function

feat: exclude region for StudyLocus object (opentargets#646)

d796b68

refactor: delete unnecessary config files (opentargets#647)

dc70bd8

* refactor: remove ot_pics * refactor: gwas_catalog_sumstat_preprocess config removed * refactor: ot_finngen_studies removed * refactor: ot_finngen_studies removed * refactor: window_based_clumping cleanup

build: remove python-semantic-release as project dependency (opentarg…

ca377ce

…ets#650)

feat(ld_annotator): optional r2 threshold (opentargets#648)

6d93192

* feat(ld_annotator): apply r2 threshold * feat(ld_annotator): apply r2 threshold * chore(ldannotator): change threshold to 0.5

chore: remove the locus_radius parameter (opentargets#659)

b839164

Code inspection shows that it is not used anymore.

ci: pre-commit update with pydoclint adjustments (opentargets#660)

08eaaff

fix: remove check merge conflict from pre-commit (opentargets#661)

41cb35b

* fix: remove check merge conflict * fix: remove line for now

fix(SusieFineMapperStep): adding filtering of NANs in LD (opentargets…

b3e89bb

…#654)

fix: typo in cs_lbf_thr parameter name (opentargets#667)

f0a2902

* fix: typo in cs_lbf_thr parameter name * fix: removing two parameters --------- Co-authored-by: Yakov Tsepilov <[email protected]>

fix: make extended_spark_conf an empty dict instead of None (opentarg…

3f1cbe4

…ets#668)

fix(finemapping): link configuration and step classes (opentargets#669)

768b5d2

As it is, when the susie_finemapper step is triggered, nothing happens because the configuration class is not linked via _target_ to the step class. This commit addresses the problem.

feat: adding sanity filter to GWASCatalogSumstatsPreprocessStep (open…

1f1088a

…targets#677) * feat: adding sanity filter to GWASCatalogSumstatsPreprocessStep * fix: adding description

fix: leaving only five ancestries in LD (opentargets#680)

a0537c2

fix: improving locus_breaker_step logic (opentargets#679)

a9d27d7

* fix: improving locus_breaker_step logic * fix: updating susie_finemapper.py to deal with new I/O logic * chore: removing unused log output path

ireneisdoomed and others added 29 commits November 4, 2024 16:12

fix(credibleSetConfidence): inner join between study locus and vari…

3639b23

…ant index to avoid null genes (opentargets#890)

feat(feature_matrix): impute values for gene attribute cols (opentarg…

04b1e22

…ets#895) * feat(feature_matrix): impute values for gene attribute cols + semantic test * fix: change window * chore: fill na in the feature matrix generation step

fix: ensure the #CHROM is not quoted (opentargets#896)

4d8e7c4

Co-authored-by: Szymon Szyszkowski <[email protected]> Co-authored-by: Daniel Suveges <[email protected]>

feat(feature_matrix): extract features for gwas associations only (op…

2af1074

…entargets#901)

fix: do not impute isProteinCoding (opentargets#902)

6ec0d45

* fix: fix col names for imputation * fix: fix v1 * fix: test

feat: improve partitioning of credible sets (opentargets#900)

ebde0da

fix: using the 99% PIP cs column, (opentargets#904)

0d3c01b

* feat: changing to 99 credible sets * fix: change summary schema * fix: adding purity metrics * fix: updating test data samples * fix: updating test data samples * Update finemapping.py

chore: add hf_model_commit_message to LocusToGeneStep (opentarget…

93de448

…s#905)

refactor: finemapping method enum (opentargets#897)

b5b71f0

Co-authored-by: Yakov <[email protected]>

chore(l2g): parametrise score threshold when writing predictions (ope…

0e7e815

…ntargets#907)

chore: validate chromosome (opentargets#906)

bb609cb

feat: extract pos and chromosome from variantid (opentargets#909)

10b4be0

Co-authored-by: Szymon Szyszkowski <[email protected]>

feat: gzip evidence output to match existing format (opentargets#915)

c46480b

* feat: gzip evidence output to match existing format * docs: added info about compression to docstring --------- Co-authored-by: Szymon Szyszkowski <[email protected]>

chore: pre-commit autoupdate (opentargets#898)

4104ce3

updates: - [github.com/astral-sh/ruff-pre-commit: v0.7.1 → v0.7.3](astral-sh/ruff-pre-commit@v0.7.1...v0.7.3) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

fix: r2 for lead variant is always 1 (opentargets#919)

a858662

* fix: r2 for lead variant is always 1 * fix: removing not needed quality flag * test: removing unused condition * fix: type: ignore --------- Co-authored-by: DSuveges <[email protected]>

feat: reverting to using finngen 95% credible sets (opentargets#922)

b6303d5

* feat: reverting to 95% finngen credible sets * fix: updating tests and column names

chore: pre-commit autoupdate (opentargets#918)

8a83ec6

updates: - [github.com/astral-sh/ruff-pre-commit: v0.7.3 → v0.7.4](astral-sh/ruff-pre-commit@v0.7.3...v0.7.4) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

chore(gnomad): updating GnomAD version to 4.1 from 4.0 + using joint …

008aa38

…frequencies (opentargets#929) * fix: gnomad 4.1 frequencies * fix: removing in-silico extraction in gnomad * fix: removing in silico predictor ingestion from gnomad pre-process

feat: coalescing the datasets (opentargets#932)

7b3bfad

Co-authored-by: Szymon Szyszkowski <[email protected]>

github-actions bot added the auto-pr label Nov 28, 2024
