Skip to content

Commit

Permalink
Merge pull request #41 from surfriderfoundationeurope/research/artifi…
Browse files Browse the repository at this point in the history
…cial-image-dataset-generation

Research/artificial image dataset generation
  • Loading branch information
Packophys authored Aug 25, 2023
2 parents 3a2d38c + 0d43040 commit 9251e86
Show file tree
Hide file tree
Showing 54 changed files with 3,339 additions and 187 deletions.
4 changes: 2 additions & 2 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
.vscode/
*.MP4
.MP4
__pycache__/
*.pickle
*.mp4
*.pyc
*.mp4
*.env
scripts/shell_variables.sh
*.png
Expand Down
111 changes: 111 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -203,6 +203,117 @@ The documentation will be serving on http://127.0.0.1:8000/.
- `kappa`: `7` the moving average window. `1` prevents the average, avoid `2` which is ill-defined.
- `tau`: `4` the number of consecutive observations necessary to keep a track. If you increase `skip_frames`, you should lower `tau`.

## Files and Descriptions

### categories_map.py

This file facilitates the mapping between the TACO and PlasticOrigins classes, enhancing data compatibility.

### utils.py

Containing essential functions, this script supports `create_images.py`, `create_images_num.py`, and `create_n_trash.py` by providing necessary functionalities.

### create_images.py

Artificially generates images for object detection by combining TACO dataset objects and background images. It creates a specified number of images with each object pasted on a unique background.

**Input:**
- `--dataset_path`: Path to the TACO dataset
- `--background_dataset_path`: Path to the background images dataset
- `--result_dataset_path`: Path to save the resulting dataset
- `--num_uses_background`: Number of background images used per object

### create_images_num.py

Similar to `create_images.py`, this script artificially generates images for object detection. However, it aims to achieve a more balanced distribution of objects among different classes in the resulting dataset.

**Input:**
- `--dataset_path`: Path to the TACO dataset
- `--background_dataset_path`: Path to the background images dataset
- `--result_dataset_path`: Path to save the resulting dataset
- `--csv_path`: Path to the CSV file containing existing object counts per class

### create_n_trash.py

Generates images for object detection by placing all objects from a TACO image onto a single background image.

**Input:**
- `--dataset_path`: Path to the TACO dataset
- `--background_dataset_path`: Path to the background images dataset
- `--result_dataset_path`: Path to save the resulting dataset

### dataset_analysis.py

Creates CSV and bar chart files indicating the number of objects in each class.

**Input:**
- `--folder_path`: Path to the folder containing image labels
- `--csv_labels_file_name`: Name of the CSV labels file to be generated (default: labels.csv)
- `--png_labels_file_name`: Name of the PNG labels file to be generated (default: labels.png)
- `--results_path`: Path to save the graph and CSV

### dataset_analysis_train.py

Performs the same function as `dataset_analysis.py`.

**Input:**
- `--train_path`: Path to the train.txt file for extracting label files
- `--val_path`: Path to the val.txt file for extracting label files (optional)
- `--csv_labels_file_name`: Name of the CSV labels file to be generated (default: labels.csv)
- `--png_labels_file_name`: Name of the PNG labels file to be generated (default: labels.png)
- `--results_path`: Path to save the graph and CSV

### draw_bbox.py

Draws bounding boxes for objects on an image.

**Input:**
- `--image_path`: Path to the image
- `--annotation_path`: Path to the annotation file (format: class_id, center_x, center_y, w, h)

### draw_bbox_taco.py

Similar to `draw_bbox.py`, but uses JSON annotations.

**Input:**
- `--image_path`: Path to the image
- `--annotation_path`: Path to the JSON annotation (format: bbox: min_x, min_y, w, h in pixels)

### draw_polygons.py

Draws object segments on an image using JSON annotations.

**Input:**
- `--image_path`: Path to the image
- `--annotation_path`: Path to the JSON annotation (format: bbox: min_x, min_y, w, h in pixels)

### extract_imgs_from_videos.py

Extracts images from videos in a specified folder.

**Input:**
- `--num_images`: Number of images to extract from each video (default: 1)
- `--video_dir_path`: Path to the folder containing videos (default: ../background_videos)
- `--save_dir_path`: Path to save extracted images (default: ../extracted_background_images/)

## Usage Instructions

Please follow the input parameters for each script to use them effectively. Make sure to provide the correct paths and files as required. This README serves as a comprehensive guide to understand the functionality of each script and their intended usage within the project.


**Steps for training:**
- Run dataset_analysis_train.py : Get the labels.csv file
- Run extract_images_from_video.py : Get the background images
- Run create_images_num.py: Get the artificial images
- Create the train.txt file:
- rm train.txt
- rm val.txt
- cp ../ref_images/train.txt train.txt
- cp ../ref_images/val.txt val.txt
- ls /datadrive/data/artificial_data/images/| sed "s|^|/datadrive/data/artificial_data/images/|" >> train.txt
- Run the yolov8 training
- yolo task=detect mode=train model=yolov8s.pt data=/datadrive/data/data_gen_images/data.yaml epochs=400 imgsz=640 batch=40 workers=8 project=../../../datadrive/data/hiba name=yolov8_data_gen_40b_400e_eqclasses_600background

## Datasets and Training

Consider other branches for that!
Loading

0 comments on commit 9251e86

Please sign in to comment.