jamesnan/Customer-Segmentation

Mall Customer Segmentation

Define the problem

Customer segmentation is the process of dividing customers into groups based on common characteristics so that companies can market to each group effectively and appropriately. This is an unsupervised clustering problem, and five popular algorithms are presented and compared: K-Means, Fuzzy C-Means, DBSCAN, Hierarchical (Agglomerative), and Affinity Propagation.

The data set is taken from Kaggle. It consists of customer information: age, gender, annual income, and spending score. The spending score is a numeric variable ranging from 1 to 100 that was assigned to each customer based on behavior parameters and purchasing data.
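
To make the shape of the data concrete, here is a minimal sketch that builds a synthetic stand-in with the four fields described above and computes the summary statistics and correlations used in the exploratory step. The column names (`Gender`, `Age`, `Annual Income (k$)`, `Spending Score (1-100)`) and the value ranges for age and income are assumptions, and the values are random, so the numbers say nothing about the real Kaggle file.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Kaggle data: column names are assumed and
# the values are random, so no conclusions carry over to the real file.
rng = np.random.default_rng(0)
n_customers = 200
customers = pd.DataFrame({
    "Gender": rng.choice(["Male", "Female"], size=n_customers),
    "Age": rng.integers(18, 71, size=n_customers),
    "Annual Income (k$)": rng.integers(15, 138, size=n_customers),
    "Spending Score (1-100)": rng.integers(1, 101, size=n_customers),
})

numeric_cols = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]

# Distribution summary (count, mean, std, quartiles) per numeric variable.
summary = customers[numeric_cols].describe()

# Pairwise Pearson correlations between the numeric variables.
correlations = customers[numeric_cols].corr()

print(summary)
print(correlations)
```

With the real data, the same two calls would follow a `pandas.read_csv` of the downloaded file in place of the synthetic frame.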

Steps to solve the problem:

  1. Importing Libraries
  2. Loading the Dataset
  3. Exploratory Data Analysis
    3.1 Distribution of values in Age, Annual Income and Spending Score
    3.2 Correlations between numerical variables
  4. Clustering
    4.1 K-Means Clustering
         4.1.1 K-Means 5 Clusters
         4.1.2 K-Means 6 Clusters
    4.2 Fuzzy C-Means Clustering
    4.3 DBSCAN (Density Based Spatial Clustering of Applications with Noise)
    4.4 Hierarchical (Agglomerative)
    4.5 Affinity Propagation
  5. Comparison and Conclusion
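
As a rough illustration of the clustering and comparison steps, the sketch below fits four of the listed algorithms with scikit-learn and reports the number of clusters and the silhouette score for each. It is not the repository's notebook code: the data are synthetic blobs from `make_blobs`, the hyperparameters (`eps`, `min_samples`, `n_clusters`) are illustrative assumptions, and the silhouette score is one possible comparison metric among several. Fuzzy C-Means is left out because scikit-learn does not provide an implementation.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation, AgglomerativeClustering, DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic 2-D data standing in for, e.g., income vs. spending score.
X, _ = make_blobs(n_samples=200, centers=5, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

# Hyperparameters are illustrative assumptions, not tuned values.
models = {
    "KMeans (k=5)": KMeans(n_clusters=5, n_init=10, random_state=42),
    "KMeans (k=6)": KMeans(n_clusters=6, n_init=10, random_state=42),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5),
    "Agglomerative": AgglomerativeClustering(n_clusters=5),
    "AffinityPropagation": AffinityPropagation(random_state=42),
}

results = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    # Label -1 marks DBSCAN noise points (or a non-converged
    # AffinityPropagation run); exclude it from counts and scoring.
    mask = labels != -1
    n_clusters = len(set(labels[mask]))
    if 1 < n_clusters < mask.sum():
        score = silhouette_score(X[mask], labels[mask])
    else:
        score = float("nan")  # silhouette is undefined in this case
    results[name] = (n_clusters, score)
    print(f"{name}: clusters={n_clusters}, silhouette={score:.3f}")
```

K-Means and Agglomerative clustering take the number of clusters as input, while DBSCAN and Affinity Propagation infer it from the data, which is why the sketch records the cluster count alongside the score.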

Built with

  • numpy
  • pandas
  • seaborn
  • matplotlib
  • sklearn
  • scipy
  • yellowbrick
  • mpl_toolkits.mplot3d
