Source code for my undergraduate thesis — Region Grouping in East Java based on Person with Social Welfare Problems using Self-Organizing Maps Algorithm and K-Nearest Neighbors Missing-Value Imputation.
- Git
- A *.csv file to be clustered
- Conda (this project uses a Conda environment)
- A cup of coffee ☕
- Clone this repository to your machine and enter the project directory:
  git clone https://github.com/desenfirman/som-clustering-knn-imputation.git
  cd som-clustering-knn-imputation
- Set up the Conda environment for this project:
  conda env create -f environment.yml
- Wait for the packages to download and install. Drink your coffee... ☕
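- After the installation completes, activate the new environment with conda activate (the environment name is the one defined in environment.yml), then run this command to start the Flask web server:
  python runwebserver.py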
- Open localhost:8000 in your browser, and you're ready to use the app.
- Open localhost:8000 in your browser.
- Select the *.csv file that you want to cluster.
- Enter the algorithm parameters. The app asks for the following:
  K = don't use KNN, or use KNN with K from 1 to 7 (recommended values)
  Alpha = 0.1 to 1 (recommended values)
  Eta = 0.1 to 1 (recommended values)
  Epoch = a minimum of 30 is recommended
  Neuron Size = 3x3, 4x4, 5x5, etc.
- Once all parameters are filled in, click 'Mulai Clustering' (Start Clustering) to begin the clustering process.
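To make the K parameter more concrete, here is a minimal sketch of K-nearest-neighbors missing-value imputation in plain NumPy. It is not the code this app runs: the function name `knn_impute`, the use of Euclidean distance, and filling each gap with the unweighted mean of the K closest rows are all assumptions made for illustration.

```python
import numpy as np


def knn_impute(data, k=3):
    """Fill each NaN cell with the mean of that column over the k nearest rows.

    Hypothetical helper for illustration only (not taken from this project).
    Distance is Euclidean, measured on the columns observed in the incomplete row.
    """
    data = np.asarray(data, dtype=float)
    filled = data.copy()
    missing = np.isnan(data)

    for i in np.where(missing.any(axis=1))[0]:
        known = ~missing[i]  # columns that row i actually has
        for j in np.where(missing[i])[0]:
            # Usable neighbours: other rows that have column j
            # and every column observed in row i.
            candidates = np.array(
                [r for r in range(len(data))
                 if r != i and not missing[r, j] and not missing[r, known].any()],
                dtype=int,
            )
            if candidates.size == 0:
                continue  # nothing to borrow from; leave the gap as NaN
            diffs = data[candidates][:, known] - data[i, known]
            dists = np.sqrt((diffs ** 2).sum(axis=1))
            nearest = candidates[np.argsort(dists, kind="stable")[:k]]
            filled[i, j] = data[nearest, j].mean()
    return filled


# Tiny made-up example: the second row is missing its last value.
sample = np.array([
    [1.0, 2.0, 3.0],
    [1.0, 2.0, np.nan],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 9.0],
])
imputed = knn_impute(sample, k=2)
# imputed[1, 2] is 4.0: the mean of 3.0 and 5.0 from the two closest rows.
```

A larger K averages over more regions and smooths the filled value, while K = 1 simply copies the value from the single most similar row.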
While clustering runs, the app shows the progress and a report alongside a cluster visualization, epoch by epoch.
When the clustering process is complete, you can see the overall Silhouette Coefficient along with the Silhouette Coefficient of every cluster member.
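For readers unfamiliar with the Silhouette Coefficient, the sketch below shows how the per-member and overall values relate, using the standard definition s = (b - a) / max(a, b). It is an illustration in plain NumPy, not this app's implementation: the function name `silhouette_scores`, the Euclidean distance, and the toy data are assumptions, and it expects at least two clusters.

```python
import numpy as np


def silhouette_scores(points, labels):
    """Return the Silhouette Coefficient of every point (hypothetical helper).

    a = mean distance to the other members of the point's own cluster
    b = smallest mean distance to the members of any other cluster
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distance matrix.
    dist = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2))

    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # a singleton cluster scores 0 by convention
        # The distance to itself is 0, so divide by the other members only.
        a = dist[i, same].sum() / (same.sum() - 1)
        b = min(
            dist[i, labels == other].mean()
            for other in np.unique(labels)
            if other != labels[i]
        )
        denom = max(a, b)
        scores[i] = 0.0 if denom == 0 else (b - a) / denom
    return scores


# Made-up data: two tight, well-separated groups of two points each.
toy_points = [[0, 0], [0, 1], [10, 0], [10, 1]]
toy_labels = [0, 0, 1, 1]

member_scores = silhouette_scores(toy_points, toy_labels)  # one value per member
overall_score = member_scores.mean()                       # overall coefficient
```

Values close to 1 mean a member sits well inside its cluster, values near 0 mean it lies between clusters, and negative values suggest it may belong to a neighboring cluster.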
I did not build the web server, the array transformation routines, or any other code outside the scope of my undergraduate thesis from scratch. You can check environment.yml to see which packages I used for this project.