hiseulgi/medical-leaf-image-classification

Medical Leaf Image Classification

Web App Demo

Unofficial implementation of Mengenali Jenis Tanaman Obat Berbasis Pola Citra Daun Dengan Algoritma K-Nearest Neighbors (Recognizing Types of Medicinal Plants Based on Leaf Image Patterns with the K-Nearest Neighbors Algorithm). This project is part of my final project for an Image Processing course.

Web App Architecture

Dataset

The dataset used in this project is the Medical Leaf Image Dataset.

Training Flow

  1. Import image dataset
  2. Preprocess image dataset
    1. Convert image to grayscale
    2. Median filter
    3. Thresholding (binary)
    4. Morphological operation (erosion)
    5. Invert image
  3. Feature extraction (area, eccentricity, axis length, perimeter)
  4. Feature scaling (min-max normalization)
  5. Model training
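
The preprocessing steps above can be sketched with plain NumPy. This is only an illustration, not the code in src/train.py: the 3x3 window size, the threshold level of 128, and the assumption that the leaf is darker than its background are all choices made here for the example, and every function name is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def to_grayscale(rgb):
    """Weighted sum of the R, G, B channels (BT.601 luma weights)."""
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])


def windows(img, size):
    """All size x size neighbourhoods of img, using edge padding."""
    padded = np.pad(img, size // 2, mode="edge")
    return sliding_window_view(padded, (size, size))


def median_filter(gray, size=3):
    """Replace each pixel by the median of its neighbourhood."""
    return np.median(windows(gray, size), axis=(-2, -1))


def threshold(gray, level=128):
    """Binary threshold: True where the pixel is brighter than level (assumed value)."""
    return gray > level


def erode(mask, size=3):
    """Binary erosion: a pixel stays True only if its whole neighbourhood is True."""
    return windows(mask, size).all(axis=(-2, -1))


def preprocess(rgb):
    gray = to_grayscale(rgb.astype(float))
    smooth = median_filter(gray)   # removes isolated noise pixels
    binary = threshold(smooth)     # bright background becomes True
    eroded = erode(binary)         # shrinks the background region
    return ~eroded                 # invert so the leaf becomes True


# Synthetic example: a dark 20x20 "leaf" on a white 40x40 background.
image = np.full((40, 40, 3), 255, dtype=np.uint8)
image[10:30, 10:30] = (30, 90, 30)
image[5, 5] = (0, 0, 0)            # a single noise pixel
mask = preprocess(image)
print("leaf pixels:", int(mask.sum()))
```

The resulting boolean mask is the kind of input that step 3 (feature extraction) works on. In practice an image library such as OpenCV or scikit-image would usually replace these hand-written filters.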

Result

Image Preprocessing

Image Preprocessing Result

Feature Extraction

Here is the result of feature extraction from one of the images in the dataset.

{
    "area": array([677952.]),
    "eccentricity": array([0.47129833]),
    "major_axis_length": array([1001.20280051]),
    "minor_axis_length": array([883.03469571]),
    "perimeter": array([3516.70302783])
}
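
To make these numbers easier to interpret, here is a rough sketch of how such shape features can be derived from a binary leaf mask. It assumes the common definition based on the ellipse that has the same second moments as the region; the README does not say which library or formulas the project uses, and the perimeter here is a simple edge count rather than a calibrated estimate. The function name shape_features is hypothetical.

```python
import numpy as np


def shape_features(mask):
    """Area, eccentricity, axis lengths and a rough perimeter of a binary mask."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)
    area = int(rows.size)

    # Second-order central moments of the foreground pixel coordinates.
    cov = np.cov(np.vstack([rows, cols]), bias=True)
    small, large = np.clip(np.linalg.eigvalsh(cov), 0.0, None)  # ascending order

    # Axes of the ellipse with the same second moments as the region.
    major = 4.0 * np.sqrt(large)
    minor = 4.0 * np.sqrt(small)
    eccentricity = np.sqrt(1.0 - small / large) if large > 0 else 0.0

    # Rough perimeter: count pixel edges between foreground and background.
    padded = np.pad(mask, 1)
    perimeter = sum(
        int(np.count_nonzero(padded & ~np.roll(padded, shift, axis)))
        for axis in (0, 1)
        for shift in (1, -1)
    )

    return {
        "area": area,
        "eccentricity": float(eccentricity),
        "major_axis_length": float(major),
        "minor_axis_length": float(minor),
        "perimeter": perimeter,
    }


# Synthetic example: a 10 x 20 rectangle.
demo = np.zeros((30, 40), dtype=bool)
demo[10:20, 10:30] = True
features = shape_features(demo)
print(features)
```

An eccentricity near 0 means a nearly circular region and a value near 1 means an elongated one, so the 0.47 shown above corresponds to a fairly round leaf.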

Model Training Result

Here are the training and hyperparameter search results. The kNN model performed poorly, unlike the results reported in the paper. I think the problem is the dataset: the paper used only 15 classes, whereas this project uses 30 classes, even though both draw on the same dataset. In addition, each class has too few images.

Best score: 0.4550463188688445
Best parameter: {'n_neighbors': 7}
Test score: 0.4822888283378747
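
For readers unfamiliar with how a "best score" and "best parameter" are obtained, the following is a self-contained sketch of min-max scaling plus a cross-validated search over the number of neighbors. It is written in plain NumPy on synthetic data and is not the project's training code; the fold count, candidate values, and all function names are assumptions made for the example. A library such as scikit-learn provides equivalent tools (MinMaxScaler, KNeighborsClassifier, GridSearchCV).

```python
import numpy as np


def min_max_fit(X):
    """Per-feature minimum and range; constant features get a range of 1."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return lo, np.where(hi > lo, hi - lo, 1.0)


def knn_predict(X_train, y_train, X_query, k):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in y_train[nearest]])


def cv_accuracy(X, y, k, folds=5, seed=0):
    """Mean accuracy over shuffled k-fold cross-validation."""
    order = np.random.default_rng(seed).permutation(len(X))
    scores = []
    for fold in np.array_split(order, folds):
        train = np.setdiff1d(order, fold)
        lo, span = min_max_fit(X[train])  # fit the scaler on the training fold only
        pred = knn_predict((X[train] - lo) / span, y[train],
                           (X[fold] - lo) / span, k)
        scores.append(np.mean(pred == y[fold]))
    return float(np.mean(scores))


# Synthetic two-class data whose features live on very different scales.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([1000, 0.3], [50, 0.05], (60, 2)),
               rng.normal([1400, 0.7], [50, 0.05], (60, 2))])
y = np.repeat([0, 1], 60)

scores = {k: cv_accuracy(X, y, k) for k in (1, 3, 5, 7, 9)}
best_k = max(scores, key=scores.get)
print("Best parameter:", {"n_neighbors": best_k})
print("Best score:", scores[best_k])
```

Scaling matters for kNN because distances would otherwise be dominated by large-valued features such as area, while eccentricity (always between 0 and 1) would barely contribute.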

I also tried training other models, but the results were still poor, which again points to the dataset as the problem. I will try training with deep learning in the future.

How to Use

Training Module

  1. Clone this repository
     git clone https://github.com/hiseulgi/medical-leaf-image-classification.git
  2. Install dependencies
     pip install -r requirements.txt
  3. Download dataset
     bash scripts/download_dataset.sh
  4. Train model
     python src/train.py

API & Web App Deployment Module

The easiest way to deploy this project is with Docker. Make sure Docker is installed on your machine.

  1. Clone this repository
     git clone https://github.com/hiseulgi/medical-leaf-image-classification.git
  2. Copy .env.example to .env and change the values
     cp .env.example .env
  3. Build the Docker image the first time, then run the service
     # build and run compose for the first time
     bash scripts/build_docker.sh

     # run compose after the first time
     bash scripts/run_docker.sh
  4. Open and test the API docs (Swagger UI) at http://localhost:6969/
  5. Open and test the Streamlit web app at http://localhost:8501/

Extra (Deep Learning Model)

Because KNN and the other machine learning models performed poorly, I suspected the dataset was the problem, so I also trained a deep learning model. I used MobileNetV3 as the base model and trained it with transfer learning. The result was better than KNN and the other machine learning models.

Deep Learning Model Result

The training notebook is available here: Medical Leaf Image Classification (Deep Learning)

Future Works

  • Deployment API
  • Web App Deployment (Streamlit / Gradio)
  • Train with Deep Learning