Information from "kegg-column", "ko-column" and "ec-column" is now all combined
Multiple new columns are now output, depending on the source of information, e.g., KO (kegg-column) contains the KOs obtained from the IDs in the column specified with -keggc.
All KOs obtained are grouped into the KO (KEGGCharter) column, now the only one used for charting functions.
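The grouping described above can be pictured with a minimal sketch. It is not KEGGCharter's actual code: the helper `combine_kos` is hypothetical, a plain dict stands in for one row of the table, and the column names `KO (ko-column)` and `KO (ec-column)` are assumed by analogy with `KO (kegg-column)`.

```python
def combine_kos(row, source_columns, target="KO (KEGGCharter)"):
    """Merge comma-separated KOs from several source columns into one column.

    Hypothetical helper: order of first appearance is kept, duplicates dropped.
    """
    merged = []
    for column in source_columns:
        cell = row.get(column) or ""  # missing or empty cells contribute nothing
        for ko in cell.split(","):
            ko = ko.strip()
            if ko and ko not in merged:
                merged.append(ko)
    combined = dict(row)  # leave the input row untouched
    combined[target] = ",".join(merged)
    return combined


# Assumed column names, modelled on "KO (kegg-column)" from the notes.
sources = ["KO (kegg-column)", "KO (ko-column)", "KO (ec-column)"]
row = {
    "KO (kegg-column)": "K00001,K00002",
    "KO (ko-column)": "K00002",
    "KO (ec-column)": "",
}
result = combine_kos(row, sources)
print(result["KO (KEGGCharter)"])
```

Keeping the per-source columns next to the combined one makes it possible to trace which input produced each KO.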
Multiple IDs in the same cell now accepted and considered properly
Comma (,) is the only delimiter accepted for parsing multiple IDs inside the same cell.
Multiple KEGG IDs were accepted before, if separated by semicolon (;). This is now deprecated, and they must come comma-separated.
"Data" dataframe extends and compresses with each cycle of ID conversion.
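As a rough illustration of comma-only parsing and of a table that grows and shrinks around an ID conversion step, here is a small sketch. It is an assumption-laden stand-in, not the tool's implementation: plain lists of dicts replace the dataframe, and `split_ids`, `explode` and `compress` are hypothetical names.

```python
def split_ids(cell):
    """Split a cell on commas only; a semicolon is NOT treated as a delimiter."""
    return [part.strip() for part in cell.split(",") if part.strip()]


def explode(rows, column):
    """Emit one row per ID found in `column` (the table extends)."""
    out = []
    for row in rows:
        for identifier in split_ids(row.get(column, "")) or [""]:
            new_row = dict(row)
            new_row[column] = identifier
            out.append(new_row)
    return out


def compress(rows, key, column):
    """Group rows back by `key`, joining the IDs with commas (the table compresses)."""
    grouped = {}
    for row in rows:
        merged = grouped.setdefault(row[key], {**row, column: []})
        if row[column] and row[column] not in merged[column]:
            merged[column].append(row[column])
    return [{**row, column: ",".join(row[column])} for row in grouped.values()]


rows = [
    {"gene": "g1", "ids": "K00001, K00002"},
    {"gene": "g2", "ids": "K00003;K00004"},  # semicolon: stays one unsplit value
]
exploded = explode(rows, "ids")
compressed = compress(exploded, "gene", "ids")
print(len(exploded), compressed)
```

In this sketch a semicolon-separated cell is passed through as a single value, which is why such input needs to be converted to commas beforehand.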
Simplified input of quantification columns
No more --genomic-columns nor --transcriptomic-columns, only --quantification-columns (-tcols) now.
All maps ("potential" and "differential") are produced for those columns.
"gene" features now also mapped
KEGGCharter was only considering the orthologs attribute of the Pathway instances, but some boxes are present in the KGML as gene features. Now, KEGGCharter considers those as well.
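To show what the difference looks like in a KGML file, the sketch below reads a tiny hand-written KGML fragment with the standard library and collects both ortholog and gene entries. It does not use the Pathway class mentioned above; the fragment, the IDs in it and the helper `box_entries` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written KGML fragment (illustrative only, not a real KEGG map).
KGML = """<pathway name="ko00010" org="ko" number="00010">
  <entry id="1" name="ko:K00001 ko:K00002" type="ortholog"/>
  <entry id="2" name="eco:b0001" type="gene"/>
  <entry id="3" name="cpd:C00022" type="compound"/>
</pathway>"""


def box_entries(kgml_text, types=("ortholog", "gene")):
    """Return (type, [names]) for every entry whose type is in `types`."""
    root = ET.fromstring(kgml_text)
    return [
        (entry.get("type"), entry.get("name").split())
        for entry in root.iter("entry")
        if entry.get("type") in types
    ]


boxes = box_entries(KGML)
orthologs_only = box_entries(KGML, types=("ortholog",))
print(boxes)
```

Restricting `types` to orthologs alone would skip the second entry, which is the kind of box that previously went unmapped.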
Restructured the repo, simplified CI/CD, improved output to the command line, performance improvements
Maps inside resources folder, all yamls and CI files in cicd folder.
Much smaller keggcharter_input.tsv is still enough to build nice maps.
Had to specify version of libarchive (3.6.2=h039dbb9_1) in the Dockerfile.
More comprehensive messages.
Lighter progress bars.
--map-all workflow was running the write_kgmls function for all taxa. It now runs only for ko, and associates the information to all taxa. Much faster, less dumb.