problems of sample names for combine_mpa.py #23

Open
chunchenghuang opened this issue Sep 25, 2020 · 1 comment


Hi,
Thank you for these scripts; they have been extremely helpful. However, I ran kreport2mpa.py with --display-header on 63 samples and then tried to merge the results with combine_mpa.py, which according to the manual should pick up the header automatically. Instead, it failed with an error. The output, the commands I ran, and the traceback are:
Number of files to parse: 63

dir="."
for report in $(find $dir -maxdepth 1 -name "*_report.txt"|sort -V);do
    report=${report//_report.txt/};
    #ktImportTaxonomy -m 3 -t 5 ${report}_report.txt -o ${report}_krona.html
    kreport2mpa.py --display-header --no-intermediate-ranks -r ${report}_report.txt -o ${report}_mpa.txt
done
combine_mpa.py -i *_report_mpa.txt -o combined_mpa.tmp

Traceback (most recent call last):
  File "/opt/KrakenTools/combine_mpa.py", line 142, in <module>
    main()
  File "/opt/KrakenTools/combine_mpa.py", line 83, in main
    [classification, val] = line.strip().split('\t')
ValueError: too many values to unpack
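
The traceback points at the two-way unpacking on line 83 of combine_mpa.py. As a minimal illustration in Python (the sample lines below are made up, not taken from the 63 files, and `parse_mpa_line` is a hypothetical helper), splitting a line on tabs and unpacking into exactly two names fails as soon as a line has more than two tab-separated fields:

```python
def parse_mpa_line(line):
    """Split one line into (classification, value), like the failing statement."""
    classification, val = line.strip().split("\t")
    return classification, val


# Hypothetical lines, for illustration only.
two_fields = "d__Bacteria|p__Firmicutes\t1234\n"
three_fields = "d__Bacteria|p__Firmicutes\t1234\t5678\n"

# A two-column line unpacks cleanly.
print(parse_mpa_line(two_fields))

# A line with an extra tab-separated field raises ValueError.
try:
    parse_mpa_line(three_fields)
except ValueError as err:
    print("ValueError:", err)
```

Checking the input files for lines with more than one tab may help narrow down which file triggers the error. This is only a guess at where to look, not a confirmed diagnosis.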

This is my bash script for a workaround. Without --display-header, I sorted the files with sort -V so that I could replace the header manually with the sample names in input order, but the column order in the output seems to differ from that order anyway.

dir="."
for report in $(find $dir -maxdepth 1 -name "*_report.txt"|sort -V);do
    report=${report//_report.txt/};
    #ktImportTaxonomy -m 3 -t 5 ${report}_report.txt -o ${report}_krona.html
    kreport2mpa.py --no-intermediate-ranks -r ${report}_report.txt -o ${report}_mpa.txt
done

file_list=$(echo *_report.txt|tr " " "\n" | sort -V |tr "\n" " " )
echo *_report.txt | tr " " "\n" | cut -d "_" -f1 | sort -V > sample.header
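# Build a one-line, tab-separated header: "#Classification", then the sample names.
echo "#Classification" > tmp.tmp
cat tmp.tmp sample.header | paste -s -d '\t' - > table.header
combine_mpa.py -i $file_list -o combined_mpa.tmp
# Drop the header written by combine_mpa.py and prepend the custom one.
sed 1d combined_mpa.tmp > table.tmp
cat table.header table.tmp > combined_mpa.txt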
rm *.header
rm *.tmp
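
For comparison, the merge-and-relabel steps above could be sketched in Python so that the column order and the header come from the same sorted list. This is not how combine_mpa.py is implemented. The function names, the `_mpa.txt` suffix (taken from the loop above), and the "0" fill value for taxa missing from a sample are all assumptions made for illustration:

```python
import os
import re


def natural_key(path):
    """Sort key that orders embedded numbers numerically (sample2 before sample10)."""
    name = os.path.basename(path)
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]


def sample_name(path, suffix="_mpa.txt"):
    """Derive a column label from the file name (assumed naming scheme)."""
    base = os.path.basename(path)
    return base[: -len(suffix)] if base.endswith(suffix) else base


def combine_mpa_files(paths):
    """Merge two-column MPA files; columns follow the order of `paths`."""
    table = {}  # classification -> one value per sample, "0" where absent (assumption)
    for col, path in enumerate(paths):
        with open(path) as handle:
            for lineno, line in enumerate(handle, start=1):
                line = line.rstrip("\n")
                if not line or line.startswith("#"):
                    continue  # skip blank lines and header lines
                fields = line.split("\t")
                if len(fields) != 2:
                    raise ValueError(
                        f"{path}:{lineno}: expected 2 tab-separated fields, "
                        f"got {len(fields)}"
                    )
                classification, val = fields
                table.setdefault(classification, ["0"] * len(paths))[col] = val
    header = ["#Classification"] + [sample_name(p) for p in paths]
    rows = [[name] + vals for name, vals in table.items()]
    return header, rows


def write_table(header, rows, out_path):
    """Write the merged table as tab-separated text."""
    with open(out_path, "w") as out:
        out.write("\t".join(header) + "\n")
        for row in rows:
            out.write("\t".join(row) + "\n")


# Example use (paths are placeholders):
#   import glob
#   paths = sorted(glob.glob("*_mpa.txt"), key=natural_key)
#   write_table(*combine_mpa_files(paths), "combined_table.tsv")
```

Because the header is built from the same `paths` list that drives the columns, the labels cannot drift out of step with the data. The explicit field-count check also names the file and line if an input is not a two-column table.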

mars188 commented May 27, 2021

Were you able to find a solution?
