Customer Segmentation

Image

Problem Statement:

Context: Customer Personality Analysis is a detailed analysis of a company’s ideal customers. It helps a business better understand its customers and makes it easier to adapt products to the specific needs, behaviors, and concerns of different types of customers.

Customer personality analysis helps a business adapt its product to target customers from different customer segments. For example, instead of spending money to market a new product to every customer in the company’s database, a company can analyze which customer segment is most likely to buy the product and then market the product only to that segment.

Task: Perform clustering to summarize customer segments.

Data preprocessing & Feature Engineering

  1. Missing values were imputed with the median.
  2. Created Age from the customer's birth date.
  3. Created Days_Since_Customer: the number of days since the customer enrolled.
  4. Created Num_Kids & Fam_size for family size analysis.
  5. Created Num_accepted: the number of offers the customer accepted across the previous 5 campaigns.
  6. Created MntTotal: the total amount spent by a customer in the last two years.
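The steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual code: the input column names (`Year_Birth`, `Income`, `Dt_Customer`, `Kidhome`, `Teenhome`, `Marital_Status`, `AcceptedCmp1` to `AcceptedCmp5`, and the `Mnt*` spend columns), the reference date, and the rule for counting adults in `Fam_size` are all assumptions, since this README does not spell them out.

```python
import pandas as pd


def engineer_features(df: pd.DataFrame, reference_date: str = "2015-01-01") -> pd.DataFrame:
    """Add the derived columns listed above (input column names are assumed)."""
    out = df.copy()
    ref = pd.Timestamp(reference_date)  # assumed "today" for age and tenure

    # 1. Impute missing numeric values with the column median.
    numeric_cols = out.select_dtypes("number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())

    # 2. Age from the birth year.
    out["Age"] = ref.year - out["Year_Birth"]

    # 3. Days since the customer enrolled.
    out["Days_Since_Customer"] = (ref - pd.to_datetime(out["Dt_Customer"])).dt.days

    # 4. Children at home and family size.
    #    Assumption: 2 adults if married/together, otherwise 1.
    out["Num_Kids"] = out["Kidhome"] + out["Teenhome"]
    adults = out["Marital_Status"].isin(["Married", "Together"]).astype(int) + 1
    out["Fam_size"] = out["Num_Kids"] + adults

    # 5. Offers accepted across the previous five campaigns.
    campaign_cols = [f"AcceptedCmp{i}" for i in range(1, 6)]
    out["Num_accepted"] = out[campaign_cols].sum(axis=1)

    # 6. Total spend across all product categories.
    spend_cols = [c for c in df.columns if c.startswith("Mnt")]
    out["MntTotal"] = out[spend_cols].sum(axis=1)
    return out


# Tiny made-up sample, only to show the function running.
sample = pd.DataFrame({
    "Year_Birth": [1970, 1985, 1990],
    "Income": [50000.0, None, 30000.0],
    "Dt_Customer": ["2013-01-01", "2014-06-15", "2012-12-31"],
    "Kidhome": [1, 0, 2],
    "Teenhome": [1, 0, 0],
    "Marital_Status": ["Married", "Single", "Together"],
    "AcceptedCmp1": [1, 0, 0],
    "AcceptedCmp2": [0, 0, 0],
    "AcceptedCmp3": [1, 0, 1],
    "AcceptedCmp4": [0, 0, 0],
    "AcceptedCmp5": [0, 0, 0],
    "MntWines": [100, 20, 5],
    "MntMeatProducts": [50, 10, 5],
})
features = engineer_features(sample)
print(features[["Age", "Days_Since_Customer", "Num_Kids", "Fam_size", "Num_accepted", "MntTotal"]])
```

If the enrollment dates are stored in a day-first format, `pd.to_datetime` would need a matching `format` or `dayfirst=True`.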

Exploratory Data Analysis

We keep Response (whether the customer accepted the offer in the last campaign) as the target column. The goal is to figure out what makes customers accept the last campaign offer.

  1. Box plot of all the features

    image

    We did not remove the outliers because the Response column is imbalanced: removing them could lose information about the cases where a customer accepted the offer in the last campaign.

  2. Relationship between Education, Income & Response

image

image

  3. Does Recency affect Response? (Recency is the number of days since the customer's last purchase.)

    image

  4. How much have customers spent in the last two years?

    image

    image

  5. Does Family Size affect purchases?

    image

    image

  6. Purchases made in stores, on the website, through the catalog, and with discounts

image

  7. Checking for patterns in purchases made through store, website and catalog

    image

  8. Do older people have more complaints? How have people with complaints reacted to the last campaign?

    image

  9. Regular customers seem to be happy

image

  10. How do features correlate?

    image

Correlation Coefficients for response:
image
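A ranking like the one in the figure above can be produced with a few lines of pandas. This is a minimal sketch on made-up data; the `Response` column name follows the text, while the helper name `response_correlations` and the toy values are hypothetical.

```python
import pandas as pd


def response_correlations(df: pd.DataFrame, target: str = "Response") -> pd.Series:
    """Pearson correlation of every numeric feature with the target, highest first."""
    corr = df.corr(numeric_only=True)[target].drop(target)
    return corr.sort_values(ascending=False)


# Made-up values: spend rises and recency falls as Response goes from 0 to 1.
toy = pd.DataFrame({
    "MntTotal": [10, 20, 30, 40, 50, 60],
    "Recency": [90, 80, 70, 40, 20, 10],
    "Response": [0, 0, 0, 1, 1, 1],
})
ranking = response_correlations(toy)
print(ranking)
```

Because Response is binary, these coefficients only indicate direction and rough strength of a linear association; they do not show that a feature causes acceptance.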

Conclusions after EDA:

1) Education and Income:

We observed that customers with only basic education tend to have lower income levels. Higher-income customers are also more likely to accept offers. This suggests that marketing may need to be tailored to each income group, possibly by offering more appealing incentives to lower-income customers to increase their acceptance rates.

2) Recency and Offer Acceptance:

Our data indicates that customers who have not made a purchase recently are less likely to accept new campaign offers. This highlights the need for continuous engagement. To address this, we could run re-engagement campaigns that remind these customers of our products and services.

3) Spending and Campaign Acceptance:

Our analysis shows that customers who have spent more in the past two years are more likely to accept new campaign offers. This suggests that high spenders value our products, and we should consider offering them personalized incentives or loyalty programs to maintain their engagement and encourage further spending.

4) Family Size and Campaign Response:

Interestingly, smaller families are more responsive to campaign offers and tend to spend more. This insight can help us design family-size-specific campaigns. For larger families, we might need to offer bundled products or family-oriented discounts to increase their spending and campaign acceptance.

5) Purchase Channel and Campaign Acceptance:

Our data reveals that customers who make fewer purchases online or through catalogs are less likely to accept campaign offers. This indicates a need to strengthen our online and catalog engagement, perhaps by improving the user experience on these platforms or by offering exclusive online deals to boost interest and participation.

6) Age, Complaints, and Campaign Participation:

We found that older customers tend to lodge more complaints. However, this does not affect their participation in campaigns. This highlights an opportunity to improve our customer service for older demographics, ensuring their issues are addressed promptly while maintaining their engagement with our campaigns.

7) Historical Acceptance and Current Campaign Response:

Customers who have historically accepted more offers are more likely to accept the current campaign. This suggests that our loyal customers are consistently engaged. We should build on this by targeting these frequent acceptors with exclusive offers and early access to new products to maintain their loyalty and encourage continued participation.
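Most of the conclusions above compare acceptance rates across groups of customers. A minimal sketch of that comparison is shown here for Recency; the helper name `acceptance_rate_by_bin`, the bin edges, and the toy values are assumptions chosen only for illustration.

```python
import pandas as pd


def acceptance_rate_by_bin(df: pd.DataFrame, feature: str, bins, target: str = "Response") -> pd.Series:
    """Share of customers with target == 1 inside each bin of `feature`."""
    binned = pd.cut(df[feature], bins=bins)
    return df.groupby(binned, observed=True)[target].mean()


# Made-up data: recent buyers (low Recency) accept more often.
toy = pd.DataFrame({
    "Recency": [5, 10, 20, 40, 50, 70, 80, 90],
    "Response": [1, 1, 0, 1, 0, 0, 0, 0],
})
rates = acceptance_rate_by_bin(toy, "Recency", bins=[0, 30, 60, 100])
print(rates)
```

The same helper can be pointed at other numeric columns, such as MntTotal or Num_accepted, to check the corresponding conclusion.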

Dimensionality Reduction

image

We initially reduced the 22 features to 2 principal components using PCA to visualize the data.

image
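A projection like this can be sketched with scikit-learn as follows. The data here is random noise standing in for the 22 features, and standardizing before PCA is an assumption; the README does not say whether the features were scaled.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 22))  # synthetic stand-in for the 22 features

# Assumption: features are standardized so no single scale dominates.
X_scaled = StandardScaler().fit_transform(X)

pca_2d = PCA(n_components=2)
coords = pca_2d.fit_transform(X_scaled)  # shape (100, 2), ready for a scatter plot

# Fraction of total variance kept by the two components.
explained_2d = pca_2d.explained_variance_ratio_.sum()
print(coords.shape, round(float(explained_2d), 3))
```

Coloring the scatter plot of `coords` by Response is what shows whether the two classes separate in 2D.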

Upon visualizing the data in 2D, we found it difficult to differentiate between the entries with Response 0 and 1, so we moved to 3 components (n_components=3).

image

We performed K-means clustering on the 3 principal components. Here is the elbow curve:

image
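An elbow curve plots the K-means inertia (within-cluster sum of squares) against the number of clusters; the "elbow" is where adding clusters stops reducing inertia by much. The sketch below computes those inertias on synthetic 3D data with four well-separated blobs. The data, the range of k, and the helper name `elbow_inertias` are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans


def elbow_inertias(X, k_values, random_state=0):
    """Fit K-means once per k and collect the inertia of each fit."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        inertias.append(model.inertia_)
    return inertias


# Synthetic stand-in for the 3 principal components: four tight blobs.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0], [8, 8, 0], [0, 8, 8], [8, 0, 8]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 3)) for c in centers])

k_values = list(range(1, 9))
inertias = elbow_inertias(X, k_values)
for k, inertia in zip(k_values, inertias):
    print(k, round(inertia, 1))
```

On this toy data the inertia drops sharply up to k = 4 and flattens afterwards; on real data the elbow is usually less clear-cut and needs judgment.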

We visualized the clusters in a 3D plot using the first three principal components.

image

The 3D visualization helped us understand the separation and distribution of the clusters better.

Because 2 components explained only a small portion of the variance, we increased the number of components to 3, which improved the explained variance but was still not enough. Eventually, we increased it to 7 components, which capture 72% of the total variance.

image

image
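One way to pick the number of components is to look at the cumulative explained variance and take the smallest count that reaches a target. The sketch below does this on random data, so the resulting count will not match the 7 components reported above; the helper name `components_for_variance` and the scaling step are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def components_for_variance(X, threshold):
    """Smallest number of principal components whose cumulative
    explained variance ratio reaches `threshold` (capped at n_features)."""
    X_scaled = StandardScaler().fit_transform(X)  # assumption: standardized inputs
    cumulative = np.cumsum(PCA().fit(X_scaled).explained_variance_ratio_)
    # searchsorted gives the first index where cumulative >= threshold.
    n_components = int(np.searchsorted(cumulative, threshold)) + 1
    return min(n_components, len(cumulative)), cumulative


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 22))  # synthetic stand-in for the 22 features

n_components, cumulative = components_for_variance(X, threshold=0.72)
print(n_components, np.round(cumulative[:n_components], 3))
```

scikit-learn can also do this selection directly: passing a float between 0 and 1 as `n_components` to `PCA` keeps just enough components to explain that fraction of the variance.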

Visualising the clusters:

  1. Income vs. Clusters

image

  2. Age vs. Clusters

image

  3. Money Spent vs. Clusters

image

  4. Total Number of Purchases vs. Clusters

image

  5. Family Size vs. Clusters

image

  6. Number of Discounted Purchases vs. Clusters

image

  7. Previously Accepted Offers vs. Clusters

image

  8. Number of Days Since Customers Enrolled vs. Clusters

image
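The per-cluster comparisons plotted above can also be condensed into one table of cluster shares and feature means. This is a minimal sketch on made-up data; the `Cluster` column name and the helper `cluster_profile` are hypothetical, and the numbers are not the project's results.

```python
import pandas as pd


def cluster_profile(df: pd.DataFrame, cluster_col: str = "Cluster") -> pd.DataFrame:
    """Mean of each numeric feature per cluster, plus each cluster's share of customers."""
    profile = df.groupby(cluster_col).mean(numeric_only=True)
    shares = df[cluster_col].value_counts(normalize=True).mul(100).round(1)
    profile["share_pct"] = shares  # aligned on the cluster label
    return profile


# Made-up data: three clusters with increasing income and spend.
toy = pd.DataFrame({
    "Cluster": [0, 0, 0, 1, 1, 2],
    "Income": [20, 30, 40, 50, 70, 100],
    "MntTotal": [5, 10, 15, 40, 60, 200],
})
profile = cluster_profile(toy)
print(profile)
```

Reading such a table row by row gives the kind of per-cluster description used in the summary: which cluster is largest, which earns and spends the most, and so on.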

Summary of clustering:

Cluster 0 :

* Comprises 46% of customers (largest pool)
* Lowest income
* Least money spent
* Fewest purchases
* Large family size

Cluster 0 is our largest customer segment, making up 46% of our customer base. These customers generally have the lowest income and spending and make the fewest purchases. They also tend to have larger families. To better engage this segment, we should consider offering budget-friendly promotions and products that cater to larger families.

Cluster 1:

* Oldest cluster
* Low income and low spending
* Long-time customers
* Largest family size
* Highest number of discounted purchases

Cluster 1 consists of our oldest customers, who have been with us for a long time. They have low income and spending but are loyal and frequently use discounts. This cluster also has the largest family sizes. To retain their loyalty, we should continue providing targeted discounts and special offers that acknowledge their long-term support.

Cluster 2:

* High income
* High number of purchases
* Small family size
* Rarely uses discounts

Cluster 2 includes customers with high income who make frequent purchases. They generally have smaller families and rarely use discounts. This suggests that they may value quality and exclusivity over discounts. To engage this segment, we should focus on offering premium products and services, possibly through a loyalty program that provides exclusive benefits.

Cluster 3:

* Smallest pool (only 9%)
* Highest income and spending
* Most purchases
* Small family size
* Regular customers

Cluster 3 is our smallest segment, comprising only 9% of our customer base, but they are our most valuable customers. They have the highest income and spending levels and make the most purchases. These customers are regular shoppers with small family sizes. To retain and reward these high-value customers, we should implement personalized marketing strategies and VIP programs that offer exclusive benefits and experiences.