The effect of --min-seq-id parameter #959

Open
jianye00 opened this issue Feb 20, 2025 · 3 comments

Comments

@jianye00

If I use the cluster workflow, does the --min-seq-id parameter guarantee that the alignment identity between each representative and its cluster members is at least the value given to --min-seq-id?

@milot-mirdita
Member

With the default cascaded clustering, this property can be broken due to transitivity issues with the repeated clustering. We have implemented an extended workflow that is enabled with --cluster-reassign that fixes these issues after the clustering is complete.
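One way to check this property on your own data is to realign each member against its representative and list the pairs that fall below the threshold. The following is only a minimal sketch: it assumes a tab-separated alignment file whose first three columns are representative, member, and fractional sequence identity. That column layout, the function name `pairs_below_threshold`, and the sample IDs are assumptions for illustration and are not taken from this thread.

```python
import io


def pairs_below_threshold(lines, min_seq_id):
    """Yield (representative, member, identity) for rows below min_seq_id.

    Assumes each row is tab-separated with at least three columns:
    representative ID, member ID, fractional identity in [0, 1].
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        rep, member, ident = fields[0], fields[1], float(fields[2])
        if ident < min_seq_id:
            yield rep, member, ident


# Hypothetical sample rows, not real MMseqs2 output.
sample = io.StringIO(
    "repA\trepA\t1.000\n"
    "repA\tseq1\t0.912\n"
    "repA\tseq2\t0.431\n"
)
violations = list(pairs_below_threshold(sample, 0.5))
print(violations)
```

If the list is empty for the clustering in question, no representative-member pair in the file falls below the chosen threshold.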

@jianye00
Author

Hi, I think your reply was meant for my other issue, #961, about a sequence that is missing after clusterupdate.

I tried the --cluster-reassign flag too, but it gives the same result as without it. Here is my command line:

bin/mmseqs clusterupdate --cluster-reassign 1 sequenceDB updateSequenceDB clusterDB newSequenceDB.cluster-reassign newClusterDB.cluster-reassign tmp

bin/mmseqs createtsv newSequenceDB.cluster-reassign newSequenceDB.cluster-reassign newClusterDB.cluster-reassign newClusterDB.cluster-reassign.tsv

wc newClusterDB.cluster-reassign.tsv
7313 14626 123018 newClusterDB.cluster-reassign.tsv
grep X7SER2 newClusterDB.cluster-reassign.tsv

As you can see, running with "--cluster-reassign 1" produces 7313 clusters, the same number as without the flag (shown in #961), and the entry X7SER2 is still missing.
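The `wc` and `grep` checks above can be combined in one small script. This is a sketch under assumptions: it treats the TSV as two tab-separated columns (representative, then member), which matches the 14626 words over 7313 lines reported by `wc`. Under that layout the line count is the number of representative-member rows, while the number of distinct first-column values is the number of clusters. The function name `summarize_cluster_tsv` and the sample rows are hypothetical.

```python
import io


def summarize_cluster_tsv(lines, needle):
    """Return (row_count, cluster_count, matching_rows) for a cluster TSV.

    Assumes each row is "representative<TAB>member". A row matches when
    `needle` occurs as a substring of either ID, similar to plain grep.
    """
    rows = 0
    representatives = set()
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        rep, member = line.split("\t")[:2]
        rows += 1
        representatives.add(rep)
        if needle in rep or needle in member:
            hits.append((rep, member))
    return rows, len(representatives), hits


# Hypothetical sample rows, not the data from this issue.
sample = io.StringIO("repA\trepA\nrepA\tseq1\nrepB\trepB\n")
rows, clusters, hits = summarize_cluster_tsv(sample, "X7SER2")
print(rows, clusters, hits)
```

To run it on the real output, pass an open file handle for newClusterDB.cluster-reassign.tsv instead of the sample. An empty list of matching rows corresponds to the empty `grep` result above.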

@milot-mirdita
Member

We are currently looking into the other issue.
