updating a cluster database #315

Open
dadush1 opened this issue Jul 29, 2024 · 0 comments

dadush1 commented Jul 29, 2024

Hi!

Thanks again for this wonderful tool!

I used easy-cluster to cluster many structures into a cluster DB. I now want to update that cluster DB with additional .pdb files.
How can I add the new structures incrementally, without having to re-cluster everything?

I guess one very simple option is to create a DB of the cluster representatives and search the new structures against it, assigning each one to its closest existing representative.
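To make that fallback concrete, here is a minimal sketch of the assignment step in Python. It assumes the cluster TSV has two tab-separated columns (representative, member) and that the search results are in a BLAST-style tabular format with the query in column 1, the target in column 2 and the bit score in column 12. The function names and the `min_bits` cutoff are hypothetical, not part of foldseek, and a best-hit assignment like this does not necessarily apply the same criteria as the original clustering.

```python
def load_clusters(tsv_text):
    """Parse a two-column cluster TSV (representative, member) into a dict."""
    clusters = {}
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        clusters.setdefault(fields[0], []).append(fields[1])
    return clusters


def best_hits(m8_text, min_bits=0.0):
    """Return {query: (target, bits)}, keeping the highest bit score per query.

    Assumes 12 tab-separated columns with the bit score in the last one.
    min_bits is a hypothetical cutoff for rejecting weak hits.
    """
    best = {}
    for line in m8_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 12:
            continue
        query, target, bits = fields[0], fields[1], float(fields[11])
        if bits < min_bits:
            continue
        if query not in best or bits > best[query][1]:
            best[query] = (target, bits)
    return best


def assign_new_members(clusters, hits, new_ids):
    """Attach each new structure to its best-hit representative.

    Returns the updated clusters and the list of structures with no
    acceptable hit; those would still need to be clustered separately.
    """
    updated = {rep: list(members) for rep, members in clusters.items()}
    unassigned = []
    for new_id in new_ids:
        hit = hits.get(new_id)
        if hit is not None and hit[0] in updated:
            updated[hit[0]].append(new_id)
        else:
            unassigned.append(new_id)
    return updated, unassigned
```

Structures that end up in `unassigned` have no hit to an existing representative, so they would have to be clustered among themselves and added as new clusters.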

But I also saw that foldseek has clusterupdate. I tried using it, but it didn't work. I got an error saying:
Database db1/ needs header information
which sounds like it was expecting a FASTA file rather than a folder of .pdb structures.

Did I get something wrong? How can I update the cluster DB elegantly?

This is the command I ran:
foldseek clusterupdate db1/ db2/ clustered_1_cluster.tsv ./mapped ./new_clustered_2 tmp

db1 and db2 are folders of PDB files. clustered_1_cluster.tsv is the output of the first clustering, to which I want to add new members.

foldseek Version: ca58f9b

Thank you very much!
David
