URGENT HELP: Fine-Tuning: SAST Algorithm - Failed Inference #14488
Replies: 3 comments
-
Config File:
-
It seems like you are encountering issues with the inference results after fine-tuning the SAST algorithm for text detection. Below is a detailed explanation and some potential solutions based on the information you provided and related documents.
1. Issues with SAST Inference Results
The problems you described include:
Potential Reasons for Poor Inference Results:
2. Suggested Steps to Debug and Improve Results
Here are some actionable steps to address the issues:
Step 1: Verify the Pre-trained Model
Make sure you downloaded and used the correct pre-trained model:
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/pretrained/ResNet50_vd_ssld_pretrained.pdparams
Ensure that the
Step 2: Fine-tuning Configuration
Update the configuration file (
Step 3: Debugging Cross Bounding Boxes
Cross bounding boxes often indicate issues with the post-processing step, particularly with the NMS (Non-Maximum Suppression) or thresholding values. Try:
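Before changing thresholds, it can help to measure how many detected boxes are actually crossed, so that each post-processing change can be compared against a number rather than judged by eye. The following is a minimal sketch, not part of PaddleOCR: `is_crossed_quad` and `count_crossed` are hypothetical helper names, and the code assumes each detected box is a list of four `[x, y]` points in drawing order. A quadrilateral is treated as crossed when one pair of opposite edges intersects.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Cross product of (b - a) and (c - a); its sign tells which side of line ab the point c lies on."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1: Point, p2: Point, p3: Point, p4: Point) -> bool:
    """True if segment p1p2 and segment p3p4 intersect at an interior point of both."""
    d1 = _orient(p3, p4, p1)
    d2 = _orient(p3, p4, p2)
    d3 = _orient(p1, p2, p3)
    d4 = _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def is_crossed_quad(box: Sequence[Sequence[float]]) -> bool:
    """True if the four-point box is self-intersecting (a 'bow-tie')."""
    p0, p1, p2, p3 = [(float(x), float(y)) for x, y in box]
    # In a simple quadrilateral, opposite edges never intersect.
    return _segments_cross(p0, p1, p2, p3) or _segments_cross(p1, p2, p3, p0)


def count_crossed(boxes: Sequence[Sequence[Sequence[float]]]) -> int:
    """Number of self-intersecting boxes in a list of four-point boxes."""
    return sum(1 for box in boxes if is_crossed_quad(box))
```

Assuming the result of `ocr.ocr(img_path, cls=True)` holds `[box, (text, score)]` entries for the image, the boxes can be collected from those entries and passed to `count_crossed`. The check only covers four-point boxes; polygon outputs with more points would need a general edge-pair test.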
Step 4: Evaluate Training Data Quality
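One data-quality check that can be automated is the point order in the detection label file. The sketch below assumes the usual PaddleOCR detection label layout (an image path, a tab, then a JSON list of objects with `transcription` and `points`) and the documented convention that points run clockwise starting from the top-left corner. `signed_area` and `check_label_line` are hypothetical helper names, not PaddleOCR functions. The check uses the shoelace formula: in image coordinates (y pointing down), a clockwise polygon has a positive signed area.

```python
import json
from typing import List, Sequence


def signed_area(points: Sequence[Sequence[float]]) -> float:
    """Shoelace formula; positive for clockwise order in image coordinates (y down)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def check_label_line(line: str) -> List[str]:
    """Return a list of human-readable problems found in one label line."""
    parts = line.rstrip("\n").split("\t", 1)
    if len(parts) != 2:
        return ["missing tab between image path and annotation"]
    image_path, raw = parts
    try:
        annotations = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"{image_path}: invalid JSON ({exc.msg})"]
    if not isinstance(annotations, list):
        return [f"{image_path}: annotation is not a JSON list"]

    problems = []
    for idx, ann in enumerate(annotations):
        points = ann.get("points", []) if isinstance(ann, dict) else []
        if len(points) < 4:
            problems.append(f"{image_path}: box {idx} has fewer than 4 points")
            continue
        area = signed_area(points)
        if area == 0:
            problems.append(f"{image_path}: box {idx} has zero area")
        elif area < 0:
            problems.append(f"{image_path}: box {idx} is counter-clockwise")
    return problems
```

Running `check_label_line` over every line of the training and validation label files gives a quick list of annotations to review by hand. A zero or negative area does not prove an annotation is wrong, so flagged boxes are candidates for inspection rather than confirmed errors.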
Step 5: Monitor Training Metrics
Inspect your
Step 6: Test with Pre-trained SAST Model Without Fine-tuning
Run inference using the default pre-trained SAST model on your test images:
ocr = PaddleOCR(use_gpu=True, det_algorithm='SAST', use_angle_cls=True, lang='en')
This can help you determine whether the issue lies with your fine-tuning process or the algorithm itself.
3. Fine-tuning Text Recognition with VisionLAN
Regarding your confusion about VisionLAN:
4. Additional Resources
Conclusion
The poor inference results with SAST are likely due to a mismatch in the pre-trained model, incorrect configuration, or issues with your training data. Start by verifying the pre-trained model and configuration file, then debug your dataset and training process. For VisionLAN, ensure you use the correct pre-trained model and configuration for fine-tuning. Feel free to share additional details (e.g., snippets from your
Response generated by feifei-bot | chatgpt-4o-latest
-
@GreatV I tried using the ResNet50_vd_ssld_pretrained.pdparams file but the validation dataset is loading forever.
-
Dear Paddle Community,
I recently had minor success training with the DB algorithm, and the results were decent. I tried to do the same for SAST. However, the inference results are extremely poor: the model fails to detect some bounding boxes and also draws crossed bounding boxes.
Code:
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_gpu=True, det_algorithm='SAST', det_model_dir=r'C:\Users\I011786\Pictures\01_Paddle\04_Paddle_Models\03_detector_pretrained\02_SAST\v4_train\v4_inference', use_angle_cls=True, lang='en')
......
img_path = r"............\Pictures\01_Paddle\03_img_test\custom_img_11.jpg"
result = ocr.ocr(img_path, cls=True)
I have run inference on very simple images that I took at random. It performs even worse on complex images, mostly producing crossed bounding boxes. I have varied the value of nms_thresh from 0.2 to 0.6, but it does not help. I am unsure what went wrong.
train.log
Config file is attached in the comments.
Please find attached my train log and picture.
I humbly request guidance from @GreatV @WenmuZhou @LDOUBLEV @MissPenguin @tink2123 @UserWangZz and others.