Training Fails #72
Comments
@agombert Hello! It seems that the error is caused by the image source file. Which image format are you using?
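One way to answer the format question without trusting file extensions is to look at the first bytes of each file. The following is only a minimal sketch, not something used in this thread: the helper names `sniff_format` and `mismatched_files` are hypothetical, and the signature table covers just a few common formats.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of a few common image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
    b"II*\x00": "tiff",
    b"MM\x00*": "tiff",
}

# Extensions we treat as images, mapped to the format name we expect inside.
EXT_TO_FORMAT = {
    ".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png", ".gif": "gif",
    ".bmp": "bmp", ".tif": "tiff", ".tiff": "tiff", ".webp": "webp",
}


def sniff_format(path):
    """Return the format suggested by the file's first bytes, or None."""
    with open(path, "rb") as f:
        head = f.read(12)
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return name
    # WebP is a RIFF container whose form type is "WEBP".
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    return None


def mismatched_files(folder):
    """Yield (path, sniffed_format) for image files whose content
    does not match what their extension claims."""
    for path in sorted(Path(folder).rglob("*")):
        expected = EXT_TO_FORMAT.get(path.suffix.lower())
        if expected is None or not path.is_file():
            continue  # not an image extension (e.g. label .txt files)
        found = sniff_format(path)
        if found != expected:
            yield path, found
```

Running `list(mismatched_files("path/to/images"))` on a dataset folder (the path is a placeholder) lists files that are, for example, PNG data saved under a `.jpg` name, or files whose header matches none of the known signatures.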
Yes, that's what I saw, and I tried to debug it with OpenCV. It's quite weird, though: the same config worked on a folder I had a month ago, and now it doesn't anymore. I'm using
@agombert Maybe this is due to an albumentations update? Did you update your albumentations version?
I'm using the installation from this repo in a conda environment with
@JulioZhao97 I tested with different versions of albumentations and got the same problem. That's quite weird. I looked into other potentially involved libraries, but only albumentations has had recent updates. I saw this issue that may be related, but I'm unsure as I'm not an expert in CV. I can give you a sample of the data if you'd like to test it yourself.
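When comparing environments like this, it can help to print the exact installed versions rather than rely on memory. Below is a minimal sketch using only the standard library. The package list is an assumption chosen for illustration, and distribution names can differ between setups (for instance, OpenCV may be installed as `opencv-python-headless`).

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of packages worth comparing; adjust to your environment.
    packages = ["albumentations", "opencv-python", "numpy", "torch"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output from both the working and the failing environment makes version differences easy to spot.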
@agombert Can you try the following settings and see whether that solves the problem?
@agombert Hello! Could you please provide your sample data? You can upload it here or send it to my email: [email protected]
Hey @JulioZhao97 👋 I've just sent you an email with a link to the data sample. @hengrui0516, thanks for your help 🙏 I tried the versions you mentioned (even the ones in the extras of the pyproject.toml), but unfortunately that did not work either.
@agombert I will look into it today.
@agombert It turns out that I can train with your sample data successfully. Config:
task: detect
mode: train
model: yolov10m-doclayout.yaml
data: data.yaml
epochs: 500
time: null
patience: 100
batch: 1
imgsz: 1120
save: true
save_period: 10
val_period: 1
cache: false
device: '3'
workers: 4
project: public_dataset/data
name: yolov10m-doclayout_data_epoch500_imgsz1120_bs1_pretrain_None
exist_ok: false
pretrained: true
optimizer: SGD
verbose: true
seed: 0
deterministic: true
single_cls: false
rect: false
cos_lr: false
close_mosaic: 10
resume: null
amp: true
fraction: 1.0
profile: false
freeze: null
multi_scale: false
overlap_mask: true
mask_ratio: 4
dropout: 0.0
val: true
split: val
save_json: false
save_hybrid: false
conf: null
iou: 0.7
max_det: 300
half: false
dnn: false
plots: true
source: null
vid_stride: 1
stream_buffer: false
visualize: false
augment: false
agnostic_nms: false
classes: null
retina_masks: false
embed: null
show: false
save_frames: false
save_txt: false
save_conf: false
save_crop: false
show_labels: true
show_conf: true
show_boxes: true
line_width: null
format: torchscript
keras: false
optimize: false
int8: false
dynamic: false
simplify: false
opset: null
workspace: 4
nms: false
lr0: 0.02
lrf: 0.01
momentum: 0.9
weight_decay: 0.0005
warmup_epochs: 3.0
warmup_momentum: 0.8
warmup_bias_lr: 0.1
box: 7.5
cls: 0.5
dfl: 1.5
pose: 12.0
kobj: 1.0
label_smoothing: 0.0
nbs: 64
hsv_h: 0.015
hsv_s: 0.7
hsv_v: 0.4
degrees: 0.0
translate: 0.1
scale: 0.5
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.5
bgr: 0.0
mosaic: 1.0
mixup: 0.0
copy_paste: 0.0
auto_augment: randaugment
erasing: 0.4
crop_fraction: 1.0
cfg: null
tracker: botsort.yaml
save_dir: public_dataset/data/yolov10m-doclayout_data_epoch500_imgsz1120_bs1_pretrain_None
I provide my environment for your reference:
Hey @JulioZhao97, thanks for your help. I'll try a couple of things to see if I can fix the problem and let you know ASAP!
OK @JulioZhao97, I found the problem! 😌 It was coming from a 🐍
Thank you very much for your help! 🙏
Hey, I got a quite weird error in training when doing:
It looks like there is a problem reading the image with OpenCV, but I tried with an old dataset I previously used to train the model and got the same error. Do you have any idea what's happening?
Best,
Arnault
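For readers hitting a similar read failure, one possible first step is to scan the dataset for files the decoder cannot load. This is only a sketch under stated assumptions: `find_unreadable` and `_cv2_reader` are hypothetical helper names, not part of this repo, and the check relies on `cv2.imread` returning `None` when it cannot read a file. The reader is passed as a parameter so another decoder can be swapped in.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".webp"}


def _cv2_reader(path):
    # Imported lazily so the helper can be used or tested without OpenCV.
    import cv2
    return cv2.imread(path)


def find_unreadable(folder, reader=_cv2_reader):
    """Return image paths under `folder` for which `reader` returns None
    or raises an exception."""
    bad = []
    for path in sorted(Path(folder).rglob("*")):
        if path.suffix.lower() not in IMAGE_EXTS or not path.is_file():
            continue
        try:
            image = reader(str(path))
        except Exception:
            image = None  # treat any decoding error as unreadable
        if image is None:
            bad.append(path)
    return bad
```

If this returns an empty list for the dataset, the files themselves are probably decodable, which points toward the environment or the code path doing the reading rather than the images.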