
This issue was moved to a discussion.

det detection model fine-tuning intermittently raises an exception: list index out of range #14390

Closed
3 tasks done
kerry-weic opened this issue Dec 16, 2024 · 3 comments
Comments

@kerry-weic

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem description)

The error message is shown in the image below:
image

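The `list index out of range` error occurred 14 times. Judging from the error message, it looks like a data problem. One of the records from the image above is as follows: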

[2024/12/14 11:36:49] ppocr ERROR: When parsing line train/fe9ed4f7921749e09d76202cf9a00285_1.jpg	[{"transcription": "正常支付租金。", "points": [[306.0, 510.0], [917.0, 548.0], [910.0, 663.0], [299.0, 625.0]]}, {"transcription": "期开业阶段的租金、收银系统使用费、推广服务费及其它各项费用仍须自装修期届", "points": [[287.0, 1039.0], [3484.0, 1108.0], [3480.0, 1248.0], [283.0, 1179.0]]}, {"transcription": "满后的次日起照常缴纳。", "points": [[310.0, 1232.0], [1257.0, 1268.0], [1253.0, 1374.0], [305.0, 1337.0]]}, {"transcription": "4.租金", "points": [[299.0, 1410.0], [649.0, 1424.0], [645.0, 1535.0], [294.0, 1521.0]]}, {"transcription": "4.1.双方同意按“基础租金+提成租金”的模式计算租金。", "points": [[302.0, 1616.0], [2501.0, 1656.0], [2499.0, 1762.0], [300.0, 1721.0]]}, {"transcription": "4.2.基础租金部分", "points": [[303.0, 1788.0], [1035.0, 1806.0], [1032.0, 1912.0], [300.0, 1893.0]]}, {"transcription": "4.2.1.每月基础租金为:人民币(大写)柒拾陆万伍仟肆佰拾贰元", "points": [[302.0, 1966.0], [2983.0, 1995.0], [2982.0, 2100.0], [301.0, 2072.0]]}, {"transcription": "(¥765432元)。其他费用¥100元", "points": [[307.0, 2138.0], [1762.0, 2150.0], [1761.0, 2256.0], [306.0, 2244.0]]}, {"transcription": "4.2.2.付款方式:", "points": [[302.0, 2311.0], [1000.0, 2317.0], [999.0, 2422.0], [301.0, 2416.0]]}, {"transcription": "按月支付:每月提前至少10天支付下月基础租金。", "points": [[308.0, 2477.0], [2264.0, 2477.0], [2264.0, 2599.0], [308.0, 2599.0]]}, {"transcription": "4.3.提成租金部分", "points": [[318.0, 2657.0], [1044.0, 2657.0], [1044.0, 2761.0], [318.0, 2761.0]]}, {"transcription": "4.3.1.计算方式:租金=承租方收入*提成比例。", "points": [[307.0, 2833.0], [2113.0, 2822.0], [2114.0, 2927.0], [307.0, 2939.0]]}, {"transcription": "4. 3. 
2.提成比例:16‱()。", "points": [[307.0, 2999.0], [1536.0, 2993.0], [1537.0, 3105.0], [307.0, 3111.0]]}, {"transcription": "4.3.3.承租方收入", "points": [[312.0, 3178.0], [1050.0, 3166.0], [1052.0, 3271.0], [314.0, 3284.0]]}, {"transcription": "承租方收入是指:结算周期内的全部销售收入,不扣除其他成本;", "points": [[317.0, 3335.0], [2805.0, 3306.0], [2807.0, 3424.0], [318.0, 3452.0]]}, {"transcription": "承租方开设于其他地址的店铺收入不予计算。", "points": [[323.0, 3504.0], [2027.0, 3475.0], [2029.0, 3594.0], [325.0, 3623.0]]}, {"transcription": "4.3.4.结算周期:1个月。", "points": [[317.0, 3677.0], [1322.0, 3654.0], [1325.0, 3770.0], [320.0, 3794.0]]}, {"transcription": "5.租金支付", "points": [[316.0, 4025.0], [864.0, 3999.0], [870.0, 4120.0], [322.0, 4146.0]]}, {"transcription": "5.1.首期费用", "points": [[322.0, 4217.0], [877.0, 4192.0], [882.0, 4309.0], [327.0, 4335.0]]}, {"transcription": "首期费用对应租赁期限:年月日至年月日", "points": [[328.0, 4379.0], [2179.0, 4315.0], [2183.0, 4432.0], [332.0, 4496.0]]}, {"transcription": "5.2.之后费用支付", "points": [[322.0, 4735.0], [1041.0, 4692.0], [1048.0, 4798.0], [328.0, 4841.0]]}, {"transcription": "3.3.1.装修期仅作为乙方完成装修的期限要求;装修期作为租赁期限的一部分仍需", "points": [[291.0, 328.0], [3525.0, 404.0], [3525.0, 569.0], [299.0, 486.0]]}, {"transcription": "3.3.2. 
乙方必须于装修期届满之日前完成装修工程并开业。乙方如有合理特殊情况", "points": [[302.0, 700.0], [3494.0, 758.0], [3490.0, 905.0], [302.0, 843.0]]}, {"transcription": "不能如期开业,须提前至少7日书面告知甲方且须获甲方书面同意延期开业,但延", "points": [[293.0, 866.0], [3507.0, 937.0], [3503.0, 1071.0], [293.0, 1025.0]]}, {"transcription": "4.3.5.结算付款方式:每个结算周期结束后10个工作日内,双方结算支付租金。", "points": [[332.0, 3815.0], [3428.0, 3815.0], [3431.0, 3953.0], [332.0, 3962.0]]}, {"transcription": "首期费用:共计人民币¥152元,", "points": [[327.0, 4559.0], [1616.0, 4501.0], [1607.0, 4618.0], [339.0, 4672.0]]}, {"transcription": "包含上述期限内的租金与租赁配套服务费用", "points": [[1628.0, 4484.0], [3348.0, 4542.0], [3365.0, 4718.0], [1632.0, 4626.0]]}, {"transcription": "租金与配套服务费用应按月结算支付,", "points": [[331.0, 4889.0], [1766.0, 4810.0], [1779.0, 4931.0], [339.0, 5011.0]]}, {"transcription": "应于每月10日前结算并支付。", "points": [[1804.0, 4781.0], [2980.0, 4898.0], [2975.0, 5036.0], [1804.0, 4919.0]]}]

🏃‍♂️ Environment (Runtime environment)

Python 3.8.15
paddlepaddle-gpu 2.6.1.post117

paddleocr version: main branch
image

🌰 Minimal Reproducible Example (Minimal demo that reproduces the problem)

python3 -m paddle.distributed.launch --gpus '0'  tools/train.py -c configs/det/ch_PP-OCRv4/ch_PP-OCRv4_det_teacher.yml -o Global.pretrained_model=./pretrained_model/ch_PP-OCRv4_det_server_train/best_accuracy
@GreatV
Collaborator

GreatV commented Dec 16, 2024

Could you provide a minimal dataset that reproduces the problem so we can debug it?

@kerry-weic
Author

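Training currently runs inside Docker. I'll pull it to my local machine and try debugging it there first. Looking only at the record from the error above, I don't see anything wrong with it.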

@jingsongliujing
Collaborator

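Please check whether this is a dataset format problem: compare the failing records with records that are read and trained on successfully, and look for differences in format. Without a dataset that reproduces the problem, we cannot investigate.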
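
A minimal sketch of the format comparison suggested above. It assumes the label file uses the layout visible in the logged record: an image path, a tab, then a JSON list of objects with `transcription` and `points`. The names `check_label_line` and `check_label_file` are hypothetical helpers, not part of PaddleOCR, and a clean result only means the structure looks consistent; it would not rule out other causes.

```python
import json


def check_label_line(line, delimiter="\t"):
    """Return a list of problems found in one label-file line (empty if none)."""
    parts = line.rstrip("\n").split(delimiter)
    if len(parts) != 2:
        return [f"expected 2 {delimiter!r}-separated fields, got {len(parts)}"]
    try:
        boxes = json.loads(parts[1])
    except json.JSONDecodeError as exc:
        return [f"label is not valid JSON: {exc.msg}"]
    if not isinstance(boxes, list):
        return ["label is not a JSON list"]
    problems = []
    for i, box in enumerate(boxes):
        if not isinstance(box, dict):
            problems.append(f"box {i}: not an object")
            continue
        if "transcription" not in box:
            problems.append(f"box {i}: missing 'transcription'")
        points = box.get("points")
        # Assumption: each box is a polygon with at least 4 [x, y] points,
        # as in the record quoted in this issue.
        if not isinstance(points, list) or len(points) < 4:
            problems.append(f"box {i}: 'points' should hold at least 4 points")
        elif any(not isinstance(p, list) or len(p) != 2 for p in points):
            problems.append(f"box {i}: every point should be an [x, y] pair")
    return problems


def check_label_file(path, delimiter="\t"):
    """Yield (line_number, problem) for every suspicious line in a label file."""
    # Binary mode so that only b"\n" ends a line; a stray line break inside
    # a transcription then shows up as two malformed lines.
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                line = raw.decode("utf-8")
            except UnicodeDecodeError:
                yield lineno, "line is not valid UTF-8"
                continue
            for problem in check_label_line(line, delimiter):
                yield lineno, problem
```

Iterating over `check_label_file("train.txt")` (the path is a placeholder) and printing each `(line_number, problem)` pair lists the lines whose structure differs from the records that load successfully.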

@jingsongliujing jingsongliujing converted this issue into discussion #14504 Jan 7, 2025

3 participants