shakes76 · pri-gression · Oct 29, 2024 · Oct 29, 2024 · Oct 30, 2024 · Oct 30, 2024
diff --git a/.DS_Store b/.DS_Store
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,5 @@
+# Ignore dataset folder
+ISIC2018/
+
+# Ignore macOS system files
+.DS_Store
diff --git a/README.md b/README.md
@@ -1,19 +1,98 @@
-# Pattern Analysis
-Pattern Analysis of various datasets by COMP3710 students in 2024 at the University of Queensland.
+# Lesion Detection on ISIC Dataset with YOLOv7
 
-We create pattern recognition and image processing library for Tensorflow (TF), PyTorch or JAX.
+## Description:
 
-This library is created and maintained by The University of Queensland [COMP3710](https://my.uq.edu.au/programs-courses/course.html?course_code=comp3710) students.
+This project aims to detect lesions in dermoscopic images from the ISIC 2017/2018 dataset using the YOLOv7 object detection model. The primary goal is to implement a solution that achieves a minimum Intersection Over Union (IoU) of 0.8 on the test set, ensuring reliable detection and localization of lesions within each image. Additionally, the model is expected to achieve a suitable accuracy for lesion classification, enhancing the utility of this approach for real-world applications in skin cancer detection.
 
-The library includes the following implemented in Tensorflow:
-* fractals 
-* recognition problems
+## Problem Statement:
+
+The primary objective of this project is to detect lesions in dermoscopic images from the ISIC 2017/2018 dataset using the YOLOv7 model, aiming for an Intersection Over Union (IoU) of at least 0.8. Accurate lesion detection is essential for early skin cancer diagnosis, especially for aggressive forms like melanoma, where early treatment significantly improves outcomes. Automated detection systems can streamline the diagnostic process, reduce errors, and support dermatologists in more efficient, accurate diagnoses, potentially enhancing patient survival rates.
+
+## Algorithm Explanation:
+
+YOLOv7 is a state-of-the-art, single-stage object detection model that processes entire images in a single forward pass, enabling real-time detection with high accuracy. It achieves this by optimizing network architecture and training strategies, resulting in faster inference speeds and improved precision compared to previous models.
+
+## YOLOv7 Architecture:
+
+![YOLOv7 Architecture](yolov7_architecture.png)
+
+## Dependencies:
+
+> The following libraries and versions are required to run the lesion detection project:
+
+- `torch` (PyTorch): for deep learning model implementation and training
+- `torchvision`: for transformations applied to the images
+- `numpy`: for numerical operations and array manipulation
+- `opencv-python` (cv2): for image processing tasks
+- `Pillow`: for handling image file loading
+- `matplotlib`: for plotting and visualizing results
+
+## How It Works
+
+### Data Preprocessing
+For consistent input dimensions, each image is resized to 640x640 pixels. Data augmentation includes random horizontal and vertical flips, color jitter, and normalization based on ImageNet statistics to improve model generalization. A custom transformation pipeline is applied using PyTorch's `torchvision.transforms`.
+
+### Model Implementation
+The `LesionDetectionModel` class implements the YOLOv7 model for lesion detection. This class loads pre-trained YOLOv7 weights via PyTorch Hub, allowing for efficient and accurate lesion detection on dermoscopic images. The model is loaded onto the specified device (either CPU or GPU) and optimized to use the available hardware resources.
+
+1. **Model Initialization**:
+   The model is initialized with pre-trained YOLOv7 weights, loading it onto the designated device. If the model’s backbone layers are detected, they are frozen to retain learned features and accelerate training by focusing only on the last layers for lesion-specific learning.
+
+2. **Forward Pass**:
+   The `forward` method performs a direct pass through the model, processing each image batch and returning bounding box predictions for lesions. This is done with `torch.no_grad()` to prevent gradient computation, making the inference process faster.
+
+   ```python
+   # Example of a forward pass
+   pred = model.forward(images)
+
+### Training Process
+The training pipeline is built using PyTorch and includes data loading, model optimization, and performance tracking over multiple epochs. Key configurations include 10 epochs, a batch size of 16, and a learning rate of 0.001.
+
+1. **Model and Data Loading**:
+   - The YOLOv7-based `LesionDetectionModel` is loaded with pre-trained weights and set to the available device (GPU or CPU).
+   - Data loaders for the ISIC training and validation datasets are set up with batch sizes for efficient processing.
+
+2. **Training and Validation**:
+   - During each epoch, the model performs a forward pass on the training data, calculates loss using binary cross-entropy (BCEWithLogitsLoss), and optimizes using Adam. A learning rate scheduler adjusts the rate to improve convergence.
+   - Validation loss is computed without gradients to assess performance on unseen data, helping prevent overfitting.
+
+   ![Model Trained but is overfitting](train.png)
+g
+   ```python
+   # Example training and validation process
+   for epoch in range(NUM_EPOCHS):
+       train_loss = train_one_epoch()
+       val_loss = validate()
+
+### Prediction Process
+The prediction pipeline in `predict.py` loads a trained YOLOv7 model to detect lesions in new images. It includes preprocessing, model inference, and result visualization.
+
+1. **Model Loading**:
+   - The `LesionDetectionModel` class loads the model with the trained weights on the specified device (GPU or CPU). The model is set to evaluation mode to prevent gradient computations, optimizing it for inference.
+
+2. **Image Preprocessing**:
+   - Each test image is resized to 640x640, converted to RGB, and normalized. These transformations ensure consistency with the model’s expected input.
+
+3. **Inference**:
+   - The `predict_image` function performs a forward pass, generating bounding box predictions for lesions. Non-maximum suppression is applied to filter out overlapping boxes, retaining only the most confident predictions based on IoU and confidence thresholds.
+
+   ```python
+   detections = predict_image(image_path)
+
+## Code Comments and Usage Documentation
+
+### Usage
+To run the training and prediction scripts, follow these instructions:
+
+1. **Training**: Use `training.py` to train the lesion detection model. Ensure you have specified the correct paths for data directories and adjust hyperparameters as needed.
+   ```bash
+   python training.py --data_dir path/to/data --epochs 10 --batch_size 16
+
+2. **Predictiom**: Use predict.py to run inference with the trained model. Make sure to specify the path to the saved model weights.
+   ```python
+   python predict.py --model_path path/to/model_checkpoint.pth
+
+## References
+
+- "Skin Cancer Detection Using Convolutional Neural Networks: A Systematic Review," *National Center for Biotechnology Information (NCBI)*, https://pmc.ncbi.nlm.nih.gov/articles/PMC9324455/
 
-In the recognition folder, you will find many recognition problems solved including:
-* segmentation
-* classification
-* graph neural networks
-* StyleGAN
-* Stable diffusion
-* transformers
-etc.
diff --git a/dataset.py b/dataset.py
@@ -0,0 +1,138 @@
+import os
+from torch.utils.data import Dataset
+import torchvision.transforms as transforms
+from PIL import Image
+import torch
+import numpy as np
+import cv2
+
+class ISICDataset(Dataset):
+    def __init__(self, img_dir, annot_dir, mode='train', transform=None, img_size=640, model_output_grid_size=80):
+        """
+        Initializes the ISICDataset.
+        """
+        print("Initializing ISICDataset...")
+
+        self.img_dir = img_dir
+        self.annot_dir = annot_dir if mode == 'train' else None
+        self.mode = mode
+        self.img_size = img_size
+        self.transform = transform if transform else self.default_transforms()
+        self.num_anchors = 3
+        self.grid_size = model_output_grid_size  # This should be the same as the model's output grid size (80 in your case)
+
+        # Get list of image files
+        print("Loading image files...")
+        self.img_files = sorted([f for f in os.listdir(img_dir) if f.endswith('.jpg')])
+
+        if self.mode == 'train':
+            # Filtering only those images that have corresponding annotation files
+            print("Filtering images with corresponding annotations...")
+            annot_files = set([f.replace('_segmentation.png', '') for f in os.listdir(annot_dir) if f.endswith('.png')])
+            self.img_files = [f for f in self.img_files if f.replace('.jpg', '') in annot_files]
+
+        # Safeguard against empty dataset
+        if not self.img_files:
+            raise ValueError(f"No valid images found in {img_dir} with corresponding annotations in {annot_dir}")
+
+        print(f"Dataset initialized with {len(self.img_files)} images.")
+
+    def default_transforms(self):
+        print("Setting default image transformations...")
+        return transforms.Compose([
+            transforms.Resize((self.img_size, self.img_size)),
+            transforms.RandomHorizontalFlip(),
+            transforms.RandomVerticalFlip(),
+            transforms.ToTensor(),
+            transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
+            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
+        ])
+
+    def __len__(self):
+        return len(self.img_files)
+
+    def __getitem__(self, idx):
+        print(f"Getting item {idx}...")
+        img_path = os.path.join(self.img_dir, self.img_files[idx])
+        print(f"Loading image from: {img_path}")
+
+        try:
+            image = Image.open(img_path).convert("RGB")
+        except Exception as e:
+            print(f"Error loading image {img_path}: {e}")
+            # In case of error, create a dummy image to keep format consistent
+            image = torch.zeros((3, self.img_size, self.img_size))
+
+        if self.transform:
+            image = self.transform(image)
+
+        if self.mode == 'train':
+            annot_filename = self.img_files[idx].replace('.jpg', '_segmentation.png')
+            annot_path = os.path.join(self.annot_dir, annot_filename)
+            print(f"Loading annotation from: {annot_path}")
+
+            if not os.path.exists(annot_path):
+                print("Annotation file not found, creating dummy target.")
+                return image, torch.zeros((self.num_anchors, self.grid_size, self.grid_size, 85))
+
+            try:
+                mask = Image.open(annot_path).convert("L")
+            except Exception as e:
+                print(f"Error loading annotation {annot_path}: {e}")
+                return image, torch.zeros((self.num_anchors, self.grid_size, self.grid_size, 85))
+
+            mask = mask.resize((self.img_size, self.img_size))
+            print("Annotation loaded and resized.")
+
+            # Convert mask to numpy array and extract bounding boxes
+            boxes = self.mask_to_bounding_boxes(mask)
+
+            # Create a target tensor of size (num_anchors, grid_size, grid_size, 85) and populate it
+            target_tensor = torch.zeros((self.num_anchors, self.grid_size, self.grid_size, 85))
+
+            # Iterate over the bounding boxes and assign them to the appropriate grid cells and anchors
+            img_width, img_height = mask.size
+            for box in boxes:
+                x_min, y_min, x_max, y_max = box
+                # Calculate grid cell positions
+                grid_x = int((x_min + x_max) / 2 / img_width * self.grid_size)
+                grid_y = int((y_min + y_max) / 2 / img_height * self.grid_size)
+
+                # Ensure the grid coordinates are within bounds
+                grid_x = min(max(grid_x, 0), self.grid_size - 1)
+                grid_y = min(max(grid_y, 0), self.grid_size - 1)
+
+                # Convert box to YOLO format
+                x_center, y_center, width, height = self.convert_to_yolo_format(box, img_width, img_height)
+
+                # Assign to target tensor - in this case, using the first anchor (anchor 0)
+                target_tensor[0, grid_y, grid_x, 0:4] = torch.tensor([x_center, y_center, width, height])
+                target_tensor[0, grid_y, grid_x, 4] = 1.0  # Objectness score
+                # Set the class label - assuming one class for skin lesions
+                target_tensor[0, grid_y, grid_x, 5:] = torch.zeros(80)
+
+            return image, target_tensor
+        else:
+            # Return a dummy target for validation/test to ensure consistent return format
+            dummy_target = torch.zeros((self.num_anchors, self.grid_size, self.grid_size, 85))
+            return image, dummy_target
+
+    def convert_to_yolo_format(self, bbox, img_width, img_height):
+        x_min, y_min, x_max, y_max = bbox
+        x_center = (x_min + x_max) / 2.0 / img_width
+        y_center = (y_min + y_max) / 2.0 / img_height
+        width = (x_max - x_min) / img_width
+        height = (y_max - y_min) / img_height
+        return x_center, y_center, width, height
+
+    def mask_to_bounding_boxes(self, mask):
+        mask_np = np.array(mask)
+        boxes = []
+
+        contours, _ = cv2.findContours(mask_np, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+        for contour in contours:
+            x, y, w, h = cv2.boundingRect(contour)
+            if w > 0 and h > 0:  # Ensure valid bounding box
+                boxes.append([x, y, x + w, y + h])
+
+        return boxes
diff --git a/modules.py b/modules.py
@@ -0,0 +1,63 @@
+import torch
+import torch.nn as nn
+
+class LesionDetectionModel(nn.Module):
+    def __init__(self, model_weights='yolov7.pt', device='cpu'):
+        """
+        Initializes the YOLOv7 model for lesion detection using PyTorch Hub with additional dropout layers.
+
+        Parameters:
+            model_weights (str): Path to the pre-trained YOLOv7 weights.
+            device (str): Device to load the model on ('cuda' or 'cpu').
+        """
+        super(LesionDetectionModel, self).__init__()
+
+        self.device = torch.device('cuda' if device == 'cuda' and torch.cuda.is_available() else 'cpu')
+
+        # Load the YOLO model without the autoShape wrapper to get direct access to its layers
+        self.model = torch.hub.load('WongKinYiu/yolov7', 'custom', model_weights, source='github', autoshape=False)
+        self.model.to(self.device)
+
+        # Attempt to freeze backbone layers if they exist in the model
+        if hasattr(self.model, 'backbone'):
+            for param in self.model.backbone.parameters():
+                param.requires_grad = False
+
+        # Add dropout after certain layers
+        self.dropout = nn.Dropout(p=0.2)  # Example of a dropout layer with 50% probability
+
+    def forward(self, images):
+        """
+        Performs a forward pass through the model.
+
+        Parameters:
+            images (torch.Tensor): Batch of images to process.
+
+        Returns:
+            torch.Tensor: Model output with predictions for each bounding box.
+        """
+        images = images.to(self.device)
+
+        # Perform a forward pass through the original model
+        x = self.model(images)[0]
+
+        # Apply dropout before returning output
+        x = self.dropout(x)
+
+        return x
+
+    def detect(self, images, conf_thres=0.25, iou_thres=0.8):
+        """
+        Runs detection on input images with specified thresholds.
+
+        Parameters:
+            images (torch.Tensor): Batch of images to process.
+            conf_thres (float): Confidence threshold for predictions.
+            iou_thres (float): IoU threshold for non-max suppression.
+
+        Returns:
+            list of torch.Tensor: Bounding boxes and labels for detected lesions.
+        """
+        pred = self.forward(images)
+        detections = non_max_suppression(pred, conf_thres, iou_thres)
+        return detections