script to plot CER from training logfile #203

Merged
merged 6 commits into from
Nov 17, 2020


Shreeshrii
Collaborator

as suggested in #200 (comment)

@Shreeshrii
Collaborator Author

CER Data was extracted as follows:

grep 'Eval Char' /home/ubuntu/tess5training-iast/LAYER.log | sed -e 's/^.*[0-9]At iteration //' | sed -e 's/,.* Eval Char error rate=/\t/' | sed -e 's/, Word.*$//' | sed -e 's/^/\t\t/' > plot-eval.txt
grep 'best model' /home/ubuntu/tess5training-iast/LAYER.log |  sed  -e 's/^.*\///' |  sed  -e 's/\.checkpoint.*$//' | sed  -e 's/_/\t/g' | sed -e 's/\(.*\)\t\(.*\)/\1/' > plot-best.txt
grep 'At iteration' /home/ubuntu/tess5training-iast/LAYER.log |  sed -e '/^Sub/d' |  sed -e '/^Update/d' | sed  -e 's/At iteration \([0-9]*\).*char train=/\t\t\1\t\t/' |  sed  -e 's/%, word.*$//'   > plot-iteration.txt
sed 'N;s/\nAt iteration 0, stage 0, /At iteration 0, stage 0, /;P;D' /home/ubuntu/tess5training-iast/CHECKeval.test.log | grep 'Eval Char' | sed -e 's/.checkpoint.*Eval Char error rate=/\t\t\t/' | sed -e 's/, Word.*$//' | sed  -e 's/\(^.*\)_\(.*\)_\(.*\)\t/\1\t\t\2\t\t\t/g' > plot-validation.txt
cat plot-header.txt plot-validation.txt  plot-best.txt plot-eval.txt plot-iteration.txt > plot_cer.csv
python plot_cer.py
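
For readers who would rather not maintain the sed pipelines, the extraction of the evaluation and training CER values could also be sketched in Python. This is only an illustration and not the code in this PR: the function name `parse_log` and both regular expressions are hypothetical, and the assumed log line shapes are inferred from the sed expressions above, so real lstmtraining output may differ in detail.

```python
import re

# Assumed line shapes (inferred from the sed expressions, not verified):
#   "At iteration 100, stage 0, Eval Char error rate=10.25, Word error rate=..."
#   "At iteration 100/100/100, Mean rms=..., char train=12.5%, word train=..."
EVAL_RE = re.compile(
    r"At iteration (\d+),.*?Eval Char error rate=(\d+(?:\.\d+)?)"
)
TRAIN_RE = re.compile(
    r"At iteration (\d+)/.*?char train=(\d+(?:\.\d+)?)%"
)


def parse_log(lines):
    """Return (eval_points, train_points) as lists of (iteration, CER)."""
    eval_points, train_points = [], []
    for line in lines:
        # Both patterns are tried on every line, because a log written
        # without clean line breaks can carry both records on one line.
        m = EVAL_RE.search(line)
        if m:
            eval_points.append((int(m.group(1)), float(m.group(2))))
        m = TRAIN_RE.search(line)
        if m:
            train_points.append((int(m.group(1)), float(m.group(2))))
    return eval_points, train_points
```

Calling `parse_log(open("LAYER.log"))` would then give two lists of points that can be written out or plotted directly. The best-model and validation files are not covered by this sketch.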

[Attachment: plot_cer.csv.txt]
[Image: plot_cer]

Collaborator

@kba kba left a comment

Great, but could you add documentation on how to use this to the README, including how you ran the training process with | tee -a LAYER.log?

@Shreeshrii
Collaborator Author

@kba I have added the info about how to run this to the README.

|tee -a LAYER.log does not capture the output from lstmtraining. I use nohup for capturing all output.
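
One possible explanation, which is an assumption and not something confirmed in this thread, is that lstmtraining writes its progress messages to stderr, which a plain pipe does not forward. Under that assumption, a minimal Python sketch for capturing everything in one logfile could look like this; `run_and_log` is a hypothetical helper, not part of this repo.

```python
import subprocess


def run_and_log(cmd, logfile):
    """Run cmd, appending stdout and stderr to logfile; return the exit code."""
    with open(logfile, "ab") as log:
        # stderr=subprocess.STDOUT merges the error stream into stdout,
        # so messages written to stderr also end up in the log.
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode
```

Here `cmd` is whatever training command you run, passed as a list of arguments. Unlike nohup, this does not detach the process from the terminal.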

LAYER training is NOT currently supported by the Makefile. I used the logfile from an independently run 'replace top layer' training.

I used the sample set provided in this repo and ran training from scratch as well as with START-MODEL. Those plots look a bit different. See below.

[Image: ocrd-plot_cer]

@kba kba requested review from stweil and wrznr November 17, 2020 13:03
Member

@stweil stweil left a comment

Thank you.

@stweil stweil merged commit c72c7d1 into tesseract-ocr:master Nov 17, 2020