Currently we require three inputs: a gDNA FASTA file, a GFF3 file, and a protein file. However, if we are only interested in genome organization, or just want a quick overview, the only input that is strictly necessary is the GFF3 annotation file; after all, most of our calculations are done on the ranges given in the GFF3 file, independent of sequence content.
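To illustrate the point about range-only calculations, here is a minimal sketch that derives feature lengths from GFF3 coordinates alone. The function name `feature_lengths` and the example records are hypothetical and are not taken from the existing code; the only thing assumed about the input is the standard nine-column, tab-separated GFF3 layout with 1-based inclusive coordinates.

```python
import io


def feature_lengths(gff3_stream, feature_type="gene"):
    """Return the lengths of all features of the given type.

    Hypothetical helper for illustration: it reads only the type,
    start, and end columns, so no FASTA input is needed.
    """
    lengths = []
    for line in gff3_stream:
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != feature_type:
            continue
        start, end = int(fields[3]), int(fields[4])
        # GFF3 coordinates are 1-based and inclusive.
        lengths.append(end - start + 1)
    return lengths


# Made-up records, for illustration only.
EXAMPLE_GFF3 = (
    "##gff-version 3\n"
    "chr1\t.\tgene\t1000\t1999\t.\t+\t.\tID=gene1\n"
    "chr1\t.\tgene\t5000\t5499\t.\t-\t.\tID=gene2\n"
)

lengths = feature_lengths(io.StringIO(EXAMPLE_GFF3))
```

Lengths, counts, and spacing between features can all be computed this way without touching the sequence files.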
I suggest we add something like a "--terse" flag (possibly implied by which inputs are provided) that gives all the functionality that does not rely on sequence input. The necessary code changes do not seem to be extensive: mainly, we would need a version of fidibus-stats.py that does everything as before except for the sequence-dependent calculations (GCcontent, GCskew, Ncontent; checks on sequence lengths); a few if statements should do. Before that, the iloci and breakdown steps would need to omit the sequences(db) calls from prepare(db).
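The "few if statements" could look roughly like the sketch below. I have not checked this against the actual structure of fidibus-stats.py: `build_parser`, `locus_stats`, and the argument layout are hypothetical, and the definitions used here for GCcontent, GCskew, and Ncontent are the common textbook ones, which may differ from what the script computes today.

```python
import argparse


def build_parser():
    """Hypothetical parser showing where a --terse option might live."""
    parser = argparse.ArgumentParser(description="terse-mode sketch")
    parser.add_argument(
        "--terse",
        action="store_true",
        help="skip all calculations that require sequence input",
    )
    return parser


def locus_stats(start, end, sequence=None, terse=False):
    """Compute per-locus statistics.

    Range-based values are always reported; sequence-based values are
    added only when not in terse mode and a sequence is available.
    """
    stats = {"Length": end - start + 1}
    if terse or not sequence:
        return stats

    seq = sequence.upper()
    g, c, n = seq.count("G"), seq.count("C"), seq.count("N")
    # Assumed definitions: fractions of total length; skew = (G - C) / (G + C).
    stats["GCcontent"] = (g + c) / len(seq)
    stats["GCskew"] = (g - c) / (g + c) if (g + c) else 0.0
    stats["Ncontent"] = n / len(seq)
    return stats


args = build_parser().parse_args(["--terse"])
terse_stats = locus_stats(1, 8, sequence="GGCCAATN", terse=args.terse)
full_stats = locus_stats(1, 8, sequence="GGCCAATN")
```

Treating a missing sequence the same as `--terse` is one way to make the flag implicit in the provided input, as suggested above.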
A quick way to try this might be to test a code version in which we delete all the calls involving the sequence data. I think the code is so well organized that this change should not be difficult to implement (unless hidden dependencies show up ...).
Benefit: This change would allow us to greatly speed up sigmaphi determination, since we would avoid downloading the sequence files to disk in the first place, as well as the subsequent disk writes.