common.smk: Conditionally include --no-sign-request option
Conditionally include `--no-sign-request` option for download rules
if the `s3_src` config param is pointing to the public Nextstrain S3
bucket (s3://nextstrain-data/).

This means AWS credentials are no longer necessary for the public
h5n1-cattle-outbreak genome build.
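
As a minimal sketch of the check described above, the prefix test can be written as a standalone Python function. The name `no_sign_request_option` and the `s3://example-private-bucket/...` and `s3://nextstrain-data-other/...` URLs are hypothetical, chosen only for illustration; the actual helper in `rules/common.smk` reads the module-level `S3_SRC` rather than taking an argument.

```python
# Prefix of the public Nextstrain bucket, as named in this commit.
NEXTSTRAIN_PUBLIC_BUCKET = "s3://nextstrain-data/"


def no_sign_request_option(s3_src: str) -> str:
    """Return the AWS CLI flag for unsigned requests if `s3_src` is public.

    Hypothetical standalone variant of the helper added in this commit.
    """
    if s3_src.startswith(NEXTSTRAIN_PUBLIC_BUCKET):
        return "--no-sign-request"
    return ""


if __name__ == "__main__":
    for src in (
        "s3://nextstrain-data/files/workflows/avian-flu/h5n1",
        "s3://example-private-bucket/avian-flu/h5n1",  # hypothetical bucket
        "s3://nextstrain-data-other/avian-flu/h5n1",   # hypothetical bucket
    ):
        print(f"{src!r} -> {no_sign_request_option(src)!r}")
```

Because the prefix includes the trailing slash, a bucket whose name merely begins with `nextstrain-data` does not match and would still require credentials.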
joverlee521 committed Jun 13, 2024
1 parent a4f85d2 commit c06caae
Showing 2 changed files with 23 additions and 8 deletions.
9 changes: 3 additions & 6 deletions README.md
@@ -69,12 +69,9 @@ Specifically, the files needed are `ingest/results/metadata.tsv` and `ingest/res
Run full genome builds with the following command.

``` bash
-nextstrain build \
-    --env AWS_ACCESS_KEY_ID \
-    --env AWS_SECRET_ACCESS_KEY \
-    . \
-    --snakefile Snakefile.genome \
-    --config s3_src=s3://nextstrain-data/files/workflows/avian-flu/h5n1
+nextstrain build . \
+    --snakefile Snakefile.genome \
+    --config s3_src=s3://nextstrain-data/files/workflows/avian-flu/h5n1
```

Currently this is only set up for the "h5n1-cattle-outbreak" build using NCBI data,
22 changes: 20 additions & 2 deletions rules/common.smk
@@ -12,6 +12,18 @@ def subtypes_by_subtype_wildcard(wildcards):
     }
     return(db[wildcards.subtype])
 
+
+def no_sign_request(_wildcards):
+    """
+    Returns the `--no-sign-request` option for AWS CLI if
+    the requested `S3_SRC` points to the NEXTSTRAIN_PUBLIC_BUCKET.
+    """
+    NEXTSTRAIN_PUBLIC_BUCKET = "s3://nextstrain-data/"
+    if S3_SRC.startswith(NEXTSTRAIN_PUBLIC_BUCKET):
+        return "--no-sign-request"
+    return ""
+
+
 if LOCAL_INGEST:
     rule copy_sequences_from_ingest:
         output:
@@ -39,19 +51,25 @@ else:
             sequences = "data/{segment}/sequences.fasta",
         params:
             s3_src=S3_SRC,
+            no_sign_request=no_sign_request,
         shell:
             """
-            aws s3 cp {params.s3_src:q}/{wildcards.segment}/sequences.fasta.zst - | zstd -d > {output.sequences}
+            aws s3 cp {params.no_sign_request} \
+                {params.s3_src:q}/{wildcards.segment}/sequences.fasta.zst - \
+                | zstd -d > {output.sequences}
             """
 
     rule download_metadata:
         output:
             metadata = "data/metadata.tsv",
         params:
             s3_src=S3_SRC,
+            no_sign_request=no_sign_request,
         shell:
             """
-            aws s3 cp {params.s3_src:q}/metadata.tsv.zst - | zstd -d > {output.metadata}
+            aws s3 cp {params.no_sign_request} \
+                {params.s3_src:q}/metadata.tsv.zst - \
+                | zstd -d > {output.metadata}
             """

rule filter_sequences_by_subtype:
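
To illustrate how the optional flag affects the rendered download command, the following sketch assembles the pipeline as a plain string. It is an approximation, not Snakemake's own rendering: `render_download_command` is a hypothetical helper, `s3://example-private-bucket/h5n1` is a made-up URL, and `shlex.quote` stands in for the `:q` format specifier on the assumption that the two behave alike for these inputs.

```python
import shlex


def render_download_command(
    s3_src: str, filename: str, output: str, no_sign_request: str = ""
) -> str:
    """Build the `aws s3 cp ... | zstd -d` pipeline as a shell string.

    Hypothetical helper; it only mirrors the shape of the rule's shell block.
    """
    parts = ["aws", "s3", "cp"]
    if no_sign_request:
        # Only public-bucket sources get the unsigned-request flag.
        parts.append(no_sign_request)
    # Quote the source URL and write to stdout ("-") so zstd can decompress.
    parts += [shlex.quote(f"{s3_src}/{filename}"), "-"]
    return " ".join(parts) + " | zstd -d > " + shlex.quote(output)


if __name__ == "__main__":
    # Public bucket: the flag is included, so no credentials are needed.
    print(render_download_command(
        "s3://nextstrain-data/files/workflows/avian-flu/h5n1",
        "metadata.tsv.zst",
        "data/metadata.tsv",
        no_sign_request="--no-sign-request",
    ))
    # Any other bucket: the flag is omitted and the request is signed as usual.
    print(render_download_command(
        "s3://example-private-bucket/h5n1",  # hypothetical bucket
        "metadata.tsv.zst",
        "data/metadata.tsv",
    ))
```

When the option is empty the command is simply rendered without the flag, so sources outside the public bucket keep using signed requests as before.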