Unofficial implementation of Mengenali Jenis Tanaman Obat Berbasis Pola Citra Daun Dengan Algoritma K-Nearest Neighbors (Recognizing Types of Medicinal Plants Based on Leaf Image Patterns with K-Nearest Neighbors Algorithm). This project is part of my final project for an Image Processing course.
The dataset used in this project is the Medical Leaf Image Dataset.
- Import image dataset
- Preprocess image dataset
  - Convert image to grayscale
  - Median filter
  - Thresholding (binary)
  - Morphological operation (erosion)
  - Invert image
- Feature extraction (area, eccentricity, axis length, perimeter)
- Feature scaling (min-max normalization)
- Model training
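The preprocessing and feature extraction steps above could be sketched roughly as follows. This is a minimal illustration, not the code in `src/train.py`: it assumes scikit-image is available, that the leaf is darker than a bright background, and that the largest connected region is the leaf. The function name `extract_leaf_features`, the fixed threshold of 0.5, and the footprint sizes are hypothetical choices for the example.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.draw import ellipse
from skimage.filters import median
from skimage.measure import label, regionprops
from skimage.morphology import disk, erosion


def extract_leaf_features(rgb, threshold=0.5):
    """Hypothetical helper: preprocess an RGB image and measure the leaf."""
    gray = rgb2gray(rgb)                             # grayscale, floats in [0, 1]
    smooth = median(gray, disk(2))                   # median filter removes speckle noise
    binary = (smooth > threshold).astype(np.uint8)   # bright background -> 1, dark leaf -> 0
    eroded = erosion(binary, disk(1))                # erode the white background
    leaf_mask = 1 - eroded                           # invert so the leaf becomes foreground

    regions = regionprops(label(leaf_mask))
    if not regions:
        raise ValueError("no foreground region found")
    leaf = max(regions, key=lambda r: r.area)        # assumption: largest region is the leaf
    return {
        "area": float(leaf.area),
        "eccentricity": float(leaf.eccentricity),
        "major_axis_length": float(leaf.major_axis_length),
        "minor_axis_length": float(leaf.minor_axis_length),
        "perimeter": float(leaf.perimeter),
    }


# Synthetic stand-in for a leaf photo: a dark green ellipse on a white background.
image = np.full((200, 300, 3), 255, dtype=np.uint8)
rows, cols = ellipse(100, 150, 40, 80, shape=image.shape[:2])
image[rows, cols] = (30, 90, 30)

features = extract_leaf_features(image)
print(features)
```

On real photos the threshold and footprint sizes would likely need tuning, and the resulting feature vectors still have to be min-max scaled before training.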
Here is the result of feature extraction from one of the images in the dataset.
{
"area": array([677952.]),
"eccentricity": array([0.47129833]),
"major_axis_length": array([1001.20280051]),
"minor_axis_length": array([883.03469571]),
"perimeter": array([3516.70302783])
}
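As a quick sanity check, the eccentricity above is consistent with the two axis lengths if the features follow the usual ellipse definition, e = sqrt(1 - (b / a)^2), where a and b are the major and minor axis lengths. That this project uses exactly this definition is an assumption (it is the one used by scikit-image's `regionprops`); the numbers are copied from the result above.

```python
import math

# Values copied from the feature extraction result above.
major_axis_length = 1001.20280051
minor_axis_length = 883.03469571

# Eccentricity of the ellipse with these axis lengths.
eccentricity = math.sqrt(1.0 - (minor_axis_length / major_axis_length) ** 2)
print(round(eccentricity, 4))  # prints 0.4713
```

A value near 0 means the region is close to a circle, and a value near 1 means it is strongly elongated, so this leaf is only mildly elongated.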
Here are the training and hyperparameter search results. The kNN model performed much worse than reported in the paper. I think the problem is in the dataset: the paper uses only 15 classes, while this project uses 30 classes, even though both use the same dataset. In addition, each class contains too few images.
Best score: 0.4550463188688445
Best parameter: {'n_neighbors': 7}
Test score: 0.4822888283378747
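For reference, scores like the ones above could come from a search along these lines. This is a sketch under assumptions, not the project's training script: it uses scikit-learn's `GridSearchCV` with a `MinMaxScaler` and `KNeighborsClassifier` in a pipeline, and it runs on synthetic data from `make_classification` as a stand-in for the five leaf features. The candidate values for `n_neighbors` and the 80/20 split are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the five extracted features; NOT the real leaf dataset.
X, y = make_classification(
    n_samples=600, n_features=5, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling inside the pipeline is fitted on the training folds only,
# so no information from validation folds leaks into the scaler.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("knn", KNeighborsClassifier()),
])

search = GridSearchCV(
    pipeline,
    param_grid={"knn__n_neighbors": [3, 5, 7, 9, 11]},  # hypothetical candidates
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("Best score:", search.best_score_)          # mean cross-validated accuracy
print("Best parameter:", search.best_params_)
print("Test score:", search.score(X_test, y_test))  # accuracy on the held-out split
```

Because the classifier sits inside a pipeline here, the reported parameter key is `knn__n_neighbors` rather than `n_neighbors`.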
I also tried training other models, but the results were still poor, which again suggests that the problem is in the dataset. I plan to try deep learning in the future.
- Clone this repository
git clone https://github.com/hiseulgi/medical-leaf-image-classification.git
- Install dependencies
pip install -r requirements.txt
- Download dataset
bash scripts/download_dataset.sh
- Train model
python src/train.py
The easiest way to deploy this project is with Docker. Make sure Docker is installed on your machine.
- Clone this repository
git clone https://github.com/hiseulgi/medical-leaf-image-classification.git
- Copy .env.example to .env and change the values
cp .env.example .env
- Build docker image for first time and run service
# build and run compose for first time
bash scripts/build_docker.sh
# run compose after first time
bash scripts/run_docker.sh
Given the results of kNN and the other machine learning models, I think the problem is in the dataset, so I tried training a deep learning model instead. I used MobileNetV3 as the base model and trained it with transfer learning. The result was better than kNN and the other machine learning models. Here is the training notebook: Medical Leaf Image Classification (Deep Learning)
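A transfer-learning setup of this kind might look like the sketch below. It is not taken from the notebook: the framework (TensorFlow/Keras), the MobileNetV3Small variant, the 224x224 input size, the dropout rate, and the function name `build_classifier` are all assumptions made for illustration. Only the use of MobileNetV3 as a base model and the 30 classes come from this README.

```python
import tensorflow as tf


def build_classifier(num_classes=30, weights="imagenet"):
    """Hypothetical helper: MobileNetV3 backbone with a new classification head."""
    base = tf.keras.applications.MobileNetV3Small(
        input_shape=(224, 224, 3),
        include_top=False,   # drop the original ImageNet classifier
        weights=weights,     # "imagenet" downloads pretrained weights
        pooling="avg",       # global average pooling -> one feature vector per image
    )
    base.trainable = False   # freeze the backbone; only the new head is trained

    # The Keras MobileNetV3 models include their own input preprocessing by
    # default, so raw pixel values in [0, 255] are expected here.
    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = base(inputs, training=False)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    return model


# weights=None skips the download; use the default "imagenet" for real transfer learning.
model = build_classifier(weights=None)
print(model.output_shape)
```

Freezing the backbone keeps the number of trainable parameters small, which is a common choice when each class has only a few images.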
- Deployment API
- Web App Deployment (Streamlit / Gradio)
- Train with Deep Learning