enable protein sketches for pangenome hash correlations #3201

Open
AnneliektH opened this issue Jun 10, 2024 · 4 comments

Comments


AnneliektH commented Jun 10, 2024

calc-hash-presence.py from 2024-pangenome-hash-corr does not yet allow me to compare protein sketches instead of nucleotide sketches.

I'd like to compare protein pangenomes instead of nucleotide ones.

Trying in /group/ctbrowngrp2/scratch/annie/2023-swine-sra/sourmash/pangenomics/test_virpan

python ../../2024-pangenome-hash-corr/calc-hash-presence.py \
cluster1371.ranktable.csv \
cluster1371.zip \
-o cluster1371.dump \
--scaled=1

loaded 204371 hashvals... downsampling soon.
found 0 metagenomes
Traceback (most recent call last):
  File "/group/ctbrowngrp2/scratch/annie/2023-swine-sra/sourmash/pangenomics/test_virpan/sigs/../../2024-pangenome-hash-corr/calc-hash-presence.py", line 80, in <module>
    sys.exit(main())
  File "/group/ctbrowngrp2/scratch/annie/2023-swine-sra/sourmash/pangenomics/test_virpan/sigs/../../2024-pangenome-hash-corr/calc-hash-presence.py", line 37, in main
    query_minhash = next(iter(idx.signatures())).minhash.copy_and_clear()
StopIteration
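
The `StopIteration` above is what Python raises when `next()` is called on an iterator that yields nothing; here `idx.signatures()` was empty, presumably because no sketch in the zip file matched the molecule type the script selected (an assumption, not confirmed in the thread). A minimal sketch of a guard that turns this into a readable error follows. The helper name `first_or_error` and the plain list standing in for the index are hypothetical and not part of calc-hash-presence.py:

```python
def first_or_error(items, description):
    """Return the first element of an iterable, or raise a readable error if it is empty."""
    sentinel = object()
    # Passing a default to next() avoids a bare StopIteration on empty input.
    first = next(iter(items), sentinel)
    if first is sentinel:
        raise ValueError(
            f"no {description} found; check molecule type, k-mer size, and scaled value"
        )
    return first


# An empty list stands in for an index whose selection matched no sketches.
try:
    first_or_error([], "matching sketches")
except ValueError as exc:
    message = str(exc)
    print(message)
```

With a guard like this, the failure names the likely cause instead of surfacing as a traceback from `next()`.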

ctb commented Jun 12, 2024

just added it now - ctb/2024-pangenome-hash-corr@7440f8f

you'll need to update your sourmash_plugin_pangenomics first, either to the latest released version or the latest dev version.


ctb commented Jun 12, 2024

(let me know if it breaks ;)


ctb commented Jun 12, 2024

hmm, some more things need to be fixed before it will work. sorry!


ctb commented Jun 12, 2024

ok, fixed in 2024-pangenome-hash-corr (ctb/2024-pangenome-hash-corr@f2433f2) and in v0.2.2 of sourmash_plugin_pangenomics, I think.

you may need to use --protein --no-dna with the pangenome plugin scripts. You'll also need to specify the k-mer size and scaled value far too many times. Sorry, work in progress ;).
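
To illustrate why both `--protein` and `--no-dna` may be needed together, here is a minimal sketch of paired on/off flags where DNA is assumed to be enabled by default. This is not the plugin's actual argument parser: `build_parser`, `resolve_moltype`, the defaults, and the k-mer size of 10 are all hypothetical, and only the flag names and `--scaled=1` come from the thread.

```python
import argparse


def build_parser():
    """Hypothetical parser with paired molecule-type flags; DNA assumed on by default."""
    p = argparse.ArgumentParser()
    p.add_argument("--dna", dest="dna", action="store_true", default=True)
    p.add_argument("--no-dna", dest="dna", action="store_false", default=True)
    p.add_argument("--protein", dest="protein", action="store_true", default=False)
    p.add_argument("--no-protein", dest="protein", action="store_false", default=False)
    p.add_argument("-k", "--ksize", type=int, required=True)
    p.add_argument("--scaled", type=int, required=True)
    return p


def resolve_moltype(args):
    """Collapse the two boolean flags into exactly one molecule type."""
    if args.dna and args.protein:
        raise ValueError("both DNA and protein selected; add --no-dna")
    if args.protein:
        return "protein"
    if args.dna:
        return "DNA"
    raise ValueError("no molecule type selected")


# k=10 is an illustrative value only; --scaled=1 matches the command in the report.
args = build_parser().parse_args(["--protein", "--no-dna", "-k", "10", "--scaled", "1"])
moltype = resolve_moltype(args)
print(moltype, args.ksize, args.scaled)
```

Under these assumed defaults, passing `--protein` alone leaves DNA switched on as well, which is why the second flag is needed to get an unambiguous selection.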
