I noticed that the -f option has a large impact on runtime and memory. Does -f have the same meaning as the -f option in minimap? Also, what is the default value of -f? The help currently says: -f FLOAT,... occurrence thresholds [0.05,0.01,0.001]
Thank you for trying minialign 😄.
Right. I'm sorry that the documentation on the options and algorithms is not user-friendly.
Does -f have the same meaning as option -f in minimap?
Each value in the -f option list (0.05, 0.01, and 0.001) has the same definition as in minimap: the fraction of the most frequently occurring minimizers that is filtered out.
In minialign, seed (minimizer) collection is performed multiple times. With the default setting (0.05, 0.01, 0.001), the first seed collection trial gathers, chains, and extends only the seeds that occur less often than the top 5%. Only if no meaningful alignment is found in the first trial is the next trial run with the 1% threshold, and finally one with the 0.1% threshold. This strategy effectively avoids chain confusion around repetitive regions and reduces calculation time, since far fewer seeds are collected in the first stage. However, it also means that changing the -f thresholds greatly affects calculation time and memory usage, as you pointed out.
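To make the multi-pass idea concrete, here is a minimal Python sketch of the strategy described above. It is not minialign's actual code (minialign is written in C), and every name in it (`occurrence_cutoff`, `collect_seeds`, `align_multipass`, and the `try_align` callback) is hypothetical. It also assumes that "top 5%" means the top fraction of distinct minimizers ranked by occurrence count, which may differ from how the real implementation derives its cutoff.

```python
def occurrence_cutoff(index_counts, frac):
    """Largest occurrence count still allowed after dropping the top `frac`
    of distinct minimizers (assumed interpretation of the -f values)."""
    ordered = sorted(index_counts.values(), reverse=True)
    n_drop = int(len(ordered) * frac)
    return ordered[n_drop] if n_drop < len(ordered) else 0


def collect_seeds(query_minimizers, index_counts, frac):
    """Keep query minimizers that exist in the index and are not too repetitive."""
    cutoff = occurrence_cutoff(index_counts, frac)
    return [m for m in query_minimizers
            if 0 < index_counts.get(m, 0) <= cutoff]


def align_multipass(query_minimizers, index_counts, try_align,
                    thresholds=(0.05, 0.01, 0.001)):
    """Try the strictest filter first; relax it only when no alignment is found.

    `try_align` is a hypothetical callback standing in for chaining and
    extension: it takes a seed list and returns an alignment or None.
    """
    for frac in thresholds:
        seeds = collect_seeds(query_minimizers, index_counts, frac)
        result = try_align(seeds)
        if result is not None:
            return frac, result  # later, more expensive passes are skipped
    return None, None


# Toy index: one highly repetitive minimizer and 99 unique ones.
index_counts = {f"u{i}": 1 for i in range(99)}
index_counts["rep"] = 500
query = ["rep", "u1"]

# A stand-in aligner that accepts any non-empty seed set.
used_frac, alignment = align_multipass(query, index_counts,
                                       lambda seeds: seeds or None)
print(used_frac, alignment)
```

In this toy run the first pass already succeeds using only the non-repetitive seed, so the repetitive one is never collected. Loosening the first threshold (a smaller first -f value) would admit repetitive seeds from the start, which is why runtime and memory grow.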
At the very least, I need to write a detailed explanation of the multiple seed collection and its thresholds. I'll write it as soon as possible.
Firstly, thanks for creating minialign!