
Label detection #13502

Open
1 task done
Uddeshya1052 opened this issue Jan 30, 2025 · 6 comments
Labels
detect (Object Detection issues, PRs), question (Further information is requested)

Comments

@Uddeshya1052

Search before asking

Question

I am using YOLO to detect labels and then extract the text within the detected regions. However, I’m facing an issue with background color variations. If the background color of the label changes, the model struggles to detect it. I don’t have enough images with different background colors to train the model.

Would it be a good approach to train the model using grayscale images to generalize for any background color? Or are there alternative techniques or preprocessing steps that could help improve detection robustness in this scenario? Any suggestions or ideas would be greatly appreciated.
Thank you!

Additional

No response

Uddeshya1052 added the question (Further information is requested) label on Jan 30, 2025
UltralyticsAssistant added the detect (Object Detection issues, PRs) label on Jan 30, 2025
@UltralyticsAssistant
Member

👋 Hello @Uddeshya1052, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a custom training ❓ Question, your approach to generalizing detection (e.g., experimenting with grayscale images or other preprocessing steps) is valid and worth investigating. However, to provide more targeted assistance, please share more details about your dataset, training setup, and any preprocessing techniques you've already tried. Additionally, please review our Tips for Best Training Results.

For now, here are a few suggestions to improve robustness:

  1. Augmentation Techniques: YOLOv5 already offers powerful augmentation options out of the box. Ensure you are leveraging augmentations like hsv_h, hsv_s, and hsv_v for color variance. You can modify these in the training configuration.
  2. Dataset Expansion Ideas: You might generate synthetic images with varied backgrounds using tools like Albumentations or Photoshop. Adding diverse data can greatly improve generalization.
  3. Grayscale Approach: Converting your dataset to grayscale before training could reduce dependency on color features. Experimenting here could be insightful.
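As a rough sketch of the grayscale approach (a hypothetical helper, not part of YOLOv5), one way is to convert each image to single-channel grayscale and replicate it back to three channels, so the model's expected input shape is unchanged:

```python
import numpy as np

def gray3(img_bgr):
    """Convert an HxWx3 BGR uint8 image to grayscale, then
    replicate the single channel back to 3 channels so the
    image still matches the model's 3-channel input."""
    b, g, r = img_bgr[..., 0], img_bgr[..., 1], img_bgr[..., 2]
    # Standard luminosity weights (same as OpenCV's BGR2GRAY)
    gray = (0.114 * b + 0.587 * g + 0.299 * r).astype(np.uint8)
    return np.stack([gray, gray, gray], axis=-1)
```

Applied to every training image (and at inference), this removes color cues while keeping the rest of the pipeline untouched.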

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

This is an automated response, but no worries 😊—an Ultralytics engineer will assist you further as soon as possible!

@pderrenger
Member

@Uddeshya1052 for improved robustness against background variations in YOLOv5, we recommend:

  1. Leveraging YOLOv5's built-in augmentations (hsv_h, hsv_s, hsv_v, set in the hyperparameter YAML passed to train.py via --hyp) to simulate color variations
  2. Adding background images (0-10% of dataset) per our training tips guide
  3. Generating synthetic training data with varied backgrounds using tools like Photoshop/Python

Grayscale conversion alone typically isn't sufficient. Focus on data diversity through augmentation. If you need more specific guidance, please share your dataset statistics and example training mosaics from runs/train/exp/train_batch*.jpg.

@Uddeshya1052
Author

Uddeshya1052 commented Feb 4, 2025

@pderrenger Thank you for your suggestion. I tried these settings, but unfortunately, the performance decreased. I don't have a dataset with different colors. How can I still solve this issue? My application is equipment labeling detection, where labels are pasted on devices with different background colors.

@pderrenger
Member

@Uddeshya1052 for equipment label detection with variable backgrounds, consider these steps:

  1. Use controlled color augmentation (reduce HSV gains to 0.1-0.2 range in hyp.yaml)
  2. Generate synthetic label images with Python's PIL/CV2 by pasting labels onto random-colored backgrounds
  3. Implement edge detection preprocessing to focus on label contours
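A minimal sketch of suggestion 2, pasting a label crop onto a random solid-color background and emitting a YOLO-format box (a hypothetical numpy-only helper for clarity; a real pipeline would also blend edges and vary lighting):

```python
import numpy as np

def paste_on_random_bg(label, bg_h=640, bg_w=640, rng=None):
    """Paste a label crop at a random position on a random
    solid-color background. Returns the composite image and a
    YOLO-format box (cx, cy, w, h), all normalized to [0, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    color = rng.integers(0, 256, size=3, dtype=np.uint8)
    bg = np.broadcast_to(color, (bg_h, bg_w, 3)).copy()
    h, w = label.shape[:2]
    y = int(rng.integers(0, bg_h - h + 1))
    x = int(rng.integers(0, bg_w - w + 1))
    bg[y:y + h, x:x + w] = label
    box = ((x + w / 2) / bg_w, (y + h / 2) / bg_h, w / bg_w, h / bg_h)
    return bg, box
```

Running this over your existing label crops with many random background colors gives a cheap synthetic dataset without collecting new photos.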

You can test grayscale conversion as a temporary inference preprocessing step without retraining:

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.merge([gray, gray, gray])  # stack back to 3 channels

For specific implementation help, please share sample images from your runs/train/exp/train_batch*.jpg mosaics.

@Uddeshya1052
Author

[Three sample images attached]

@pderrenger,

Here are a few sample images I’m working with for detection. As you can see, the labels come in different colors, such as yellow and white, and we also have similar variations in other colors.

For training, I have used the following settings in the hyp.yaml file:

fl_gamma: 0.0 # focal loss gamma (EfficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)

These are the same settings used in hyp.scratch-low.yaml.
Let me know if you have any suggestions or if you need more details.

@pderrenger
Member

Thanks for sharing the samples and settings. Try slightly lowering your hsv_s and hsv_v values to moderate the color augmentation, and consider synthetic background generation to further increase variability. Also make sure you're using the latest YOLOv5 release.
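As a concrete illustration of lowering the saturation and value gains (the numbers below are hypothetical starting points, not tuned for this dataset), the relevant lines in the hyp YAML could be moderated like so:

```yaml
hsv_h: 0.015  # hue augmentation left as-is
hsv_s: 0.3    # reduced from 0.7 to moderate saturation shifts
hsv_v: 0.2    # reduced from 0.4 to moderate brightness shifts
```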
