Support online cls model prediction #769

Open
wants to merge 3 commits into base: main

Conversation


@zhangjunlongtech commented Nov 15, 2024

Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:

Motivation

Currently, the mobilenet_v3 classification (CLS) model only supports offline inference with MindSpore Lite. This PR adds online mobilenet_v3 inference for the text direction classification (CLS) task and adds predict_cls.py for everyone to use, as sketched below.
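
As a rough sketch of how the new online path could be driven from Python (a hedged illustration only: the `TextClassifier` name, import path, and constructor arguments below are assumptions, not the confirmed interface added by this PR):

```python
# Hypothetical usage sketch: TextClassifier and its constructor/call signature
# are assumptions for illustration, not the confirmed interface of predict_cls.py.
import cv2

from tools.infer.text.predict_cls import TextClassifier  # assumed import path


def classify_direction(image_paths):
    """Classify text direction (e.g. 0 or 180 degrees) for cropped text images."""
    classifier = TextClassifier(algo="MV3")        # assumed constructor argument
    images = [cv2.imread(p) for p in image_paths]
    # Assumed to return one (angle, confidence) pair per input image.
    return classifier(images)


if __name__ == "__main__":
    print(classify_direction(["path/to/word_crop.png"]))  # e.g. [('180', 1.0)]
```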

Test Plan

Run the following command and check the output files under ./inference_results:

python tools/infer/text/predict_cls.py  --image_dir {path_to_img or dir_to_imgs} --rec_algorithm MV3

We can also run in single-image mode by setting --cls_batch_mode False:

python tools/infer/text/predict_cls.py  --image_dir {path_to_img or dir_to_imgs} --rec_algorithm MV3 --cls_batch_mode False

The target classification image is CRNN_t2 (see the image attached to the PR).

The cls task output should look like this:

mindocr INFO - Init classification model: MV3 --> cls_mobilenet_v3_small_100_model. Model weights loaded from pretrained url
mindocr INFO - num images for cls: 1
mindocr INFO - CLS img idx range: [0, 1)
mindocr INFO - All cls res: [('180', 1.0)]
mindocr INFO - Done! Text angle classification results saved in ./inference_results
mindocr INFO - Time cost: 6.98498272895813, FPS: 0.14316427667805537
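
Each result is an (angle, confidence) pair such as ('180', 1.0), so a typical downstream step is to rotate the crop before recognition when the classifier is confident the text is upside down. A minimal sketch of that consumption (the helper name and the 0.9 threshold are illustrative assumptions, not part of this PR):

```python
# Minimal sketch of consuming CLS results; the (angle, score) format is taken
# from the log above, while the threshold value is an illustrative assumption.
import cv2
import numpy as np


def maybe_rotate(img: np.ndarray, angle: str, score: float, thresh: float = 0.9) -> np.ndarray:
    """Rotate a text crop by 180 degrees when the classifier is confident it is upside down."""
    if angle == "180" and score >= thresh:
        return cv2.rotate(img, cv2.ROTATE_180)
    return img


img = cv2.imread("path/to/word_crop.png")
img = maybe_rotate(img, "180", 1.0)  # returns the crop rotated by 180 degrees
```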

@zhangjunlongtech zhangjunlongtech changed the title Add online prediction of text direction classification (CLS) model Support online cls model prediction Nov 15, 2024
@@ -238,6 +238,56 @@ Evaluation of the text spotting inference results on Ascend 910 with MindSpore 2
2. Unless indicated otherwise, all experiments are run with `--det_limit_type`="min" and `--det_limit_side`=720.
3. SVTR is run in mixed precision mode (amp_level=O2) since it is optimized for O2.

## Text Direction Classification
Collaborator

This doesn't need to be presented separately; just add it in the e2e section.
