By: Ram on Jul 03, 2022

In the previous blog, RFM based Transactional Segmentations for Ecommerce, I explained RFM segmentation and its application in a retail/ecommerce scenario.

Customer segmentation enables marketing teams to reach out to customers in a personalized way. They can develop focused communication and creatives based on the customer profile of each segment, which helps improve response rates.

Customer segments can be created based on transactional behaviour (Transactional Segmentation), geographic location of the customers (Geo Segmentation), demographic attributes (Demographic Segmentation) and customers' lifestyle, opinions and attitudes (Psychographic Segmentation). In this blog, we will focus on Transactional Segmentation.

Transactional segmentation is important for understanding customers' purchase behaviour, and RFM is one of the fundamental approaches to it.

RFM stands for (R)ecency, (F)requency, and (M)onetary, and it captures customer purchase behaviour along these three dimensions.

The RFM methodology is a simple and effective way of developing a customer segmentation, and we will explain the steps below.

**1. Data**

The source data table has information at a transaction level: who bought (customer), what was purchased (product), when it was purchased (order date), and how much was paid (monetary value).

- Customer_id : Customer ID
- Order_ID : Order ID or Order Number
- Product_ID : Product ID
- Order_Date : Order Date
- Paid_Amount : Amount paid for each product

We aggregated this data to the customer level (one customer, one row) and then took a random sample of 50k customers.
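The aggregation to one row per customer can be sketched as follows. This is a minimal illustration on made-up transactions, not the original code; the output column names (`last_order_date`, `order_count`, `total_spend`) are assumptions, and the sample size is capped by the toy data.

```python
import pandas as pd

# Tiny made-up transaction table using the columns listed above
transactions = pd.DataFrame({
    "Customer_id": [1, 1, 2, 3, 3, 3],
    "Order_ID": [101, 101, 102, 103, 104, 105],
    "Product_ID": ["A", "B", "A", "C", "A", "B"],
    "Order_Date": pd.to_datetime(
        ["2022-01-05", "2022-01-05", "2021-11-20",
         "2021-08-01", "2022-02-10", "2022-03-15"]
    ),
    "Paid_Amount": [250.0, 100.0, 80.0, 40.0, 60.0, 500.0],
})

# One row per customer: last purchase date, distinct orders, total spend
customer_df = (
    transactions.groupby("Customer_id")
    .agg(
        last_order_date=("Order_Date", "max"),
        order_count=("Order_ID", "nunique"),
        total_spend=("Paid_Amount", "sum"),
    )
    .reset_index()
)

# Random sample of customers (50k in the blog; limited here by the toy data)
sample_df = customer_df.sample(n=min(50_000, len(customer_df)), random_state=42)
```

Counting distinct `Order_ID` values rather than rows matters here because one order can contain several products.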

The dataset now has the following attributes.

For the key analysis variables, we need to check the distribution and summary statistics. Some visualizations may also help in understanding these variables.

**Recency**

To calculate Recency, we compute the number of days between the cut-off date and the last purchase date for each customer.

Now we can check the distribution of days since last purchase by plotting a histogram.

Observation: A good number of customers have purchased in the recent period. This is a positive sign, as recent purchases indicate customer engagement.
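A minimal sketch of this calculation, assuming a customer-level table with a `last_order_date` column and a cut-off date of 2022-03-31 (both are illustrative assumptions, not values from the original analysis):

```python
import pandas as pd

# Hypothetical customer-level table with each customer's last purchase date
customer_df = pd.DataFrame({
    "Customer_id": [1, 2, 3],
    "last_order_date": pd.to_datetime(["2022-01-05", "2021-11-20", "2022-03-15"]),
})

# Assumed cut-off date for the analysis
cutoff_date = pd.Timestamp("2022-03-31")

# Days between the cut-off date and the last purchase
customer_df["recency_days"] = (cutoff_date - customer_df["last_order_date"]).dt.days
```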

**Frequency**

A few customers have placed a very large number of orders. We can treat these as outliers and group them together, so any customer with more than 10 orders is marked as having 10 orders. Now we can plot the order frequency.
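The capping step could look like the sketch below, using made-up order counts; `order_count` is an assumed name.

```python
import pandas as pd

# Made-up order counts per customer, including a few heavy buyers
order_count = pd.Series([1, 1, 2, 3, 10, 14, 57], name="order_count")

# Cap at 10 so that all heavy buyers fall into a single "10 or more" group
order_count_capped = order_count.clip(upper=10)

# Frequency table that can then be plotted as a bar chart
freq_table = order_count_capped.value_counts().sort_index()
```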

**Monetary Amount**

Customer spend has a skewed distribution with a long positive tail, and a few customers have a spend below zero.

- Outlier treatment: if spend is more than 100,000, treat it as 100,000
- Remove customers with negative spend, which may be data anomalies or return cases
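Both treatments can be sketched in a couple of lines. The data is made up and the column name `total_spend` is an assumption.

```python
import pandas as pd

# Made-up customer spend, including a negative value and an extreme value
customer_df = pd.DataFrame({
    "Customer_id": [1, 2, 3, 4],
    "total_spend": [250.0, -40.0, 180000.0, 0.0],
})

# Remove customers with negative spend (possible anomalies or returns)
customer_df = customer_df[customer_df["total_spend"] >= 0].copy()

# Cap spend at 100,000
customer_df["total_spend"] = customer_df["total_spend"].clip(upper=100_000)
```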

**2. Define Recency, Frequency and Monetary Scores**

We have looked at the variable distributions, and now we need to create groups for each of the recency, frequency and monetary variables. One option is to use the top 30%, middle 40% and bottom 30% as three groups for each variable. This is a challenge for the frequency variable (order count), however, as most customers have only 1 order. So, I have created more logical segments for each of the variables.

The overall approach is to create groups for each of the Recency, Frequency and Monetary features using the steps below.

- Create Bins
- Find Counts for each bin
- Combine bins and create a few logical groups
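The three steps above can be sketched for recency as follows. The 30-day bin width and the made-up values are assumptions; the 3-month and 6-month cut-offs follow the comments in the labelling code later in this post, where a score of 3 means the most recent purchases.

```python
import pandas as pd

# Made-up days since last purchase
recency_days = pd.Series([5, 20, 45, 70, 95, 130, 170, 200, 260, 340],
                         name="recency_days")

# 1. Create fine-grained bins (30-day width is an assumption)
fine_bins = pd.cut(recency_days, bins=list(range(0, 391, 30)), include_lowest=True)

# 2. Find counts for each bin
bin_counts = fine_bins.value_counts().sort_index()

# 3. Combine bins into a few logical groups:
#    3 = within 90 days, 2 = 91 to 180 days, 1 = more than 180 days
recency_score = pd.cut(
    recency_days,
    bins=[0, 90, 180, float("inf")],
    labels=[3, 2, 1],
    include_lowest=True,
).astype(int)
```

Inspecting `bin_counts` before choosing the final cut-offs helps avoid groups that end up with very few customers.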

**Recency**

We have good counts of customers in each of these groups. The distribution can also be viewed as a chart.

**Frequency**

We can check the key statistics for frequency, that is, the order count per customer.

At least 50% of the customers have placed only 1 order. So, we can consider the following grouping of the orders.

**Monetary**

We could again create groups based on quantile values, but I have preferred simple and logical groups.

You can see almost equal counts in each of the 10 groups.
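For comparison, the quantile-based alternative mentioned above can be sketched with `pd.qcut`, which yields groups of nearly equal size. The spend values are randomly generated for illustration and are not the blog's data.

```python
import numpy as np
import pandas as pd

# Made-up, right-skewed spend values
rng = np.random.default_rng(0)
total_spend = pd.Series(rng.lognormal(mean=6.0, sigma=1.0, size=1000),
                        name="total_spend")

# Ten quantile-based groups labelled 1 (lowest spend) to 10 (highest spend)
monetary_decile = pd.qcut(total_spend, q=10, labels=list(range(1, 11))).astype(int)

# Customer count per group
decile_counts = monetary_decile.value_counts().sort_index()
```

Quantile cut-offs adapt to the data but tend to produce awkward boundaries, which is one reason to prefer rounded, hand-picked ranges.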

Having many groups for each of the Recency, Frequency and Monetary variables can be useful if the customer base is huge and we want a sharper focus. Otherwise, we can create 3 groups for each variable, which still gives 3 × 3 × 3 = 27 segments to manage.

The R, F and M scores are combined to get the RFM score. With 27 segments, it can still be difficult to form a clear picture of the customers in each one, so we have combined similar segments and created 7 key segments.
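With three groups per variable, the individual scores can be concatenated into a three-digit code. The sketch below assumes score columns named `r_score`, `f_score` and `m_score` (hypothetical names); the resulting `rfm` column matches the name used in the labelling code later in this post.

```python
import pandas as pd

# Made-up scores, each on a 1 to 3 scale
scores = pd.DataFrame({
    "r_score": [3, 3, 2, 1],
    "f_score": [3, 1, 2, 1],
    "m_score": [3, 2, 3, 1],
})

# Concatenate the three digits into one code such as '333'
scores["rfm"] = (
    scores["r_score"].astype(str)
    + scores["f_score"].astype(str)
    + scores["m_score"].astype(str)
)

# Number of possible combinations with three levels per variable
n_segments = 3 * 3 * 3
```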

```python
import numpy as np

# RFM code as a three-digit string, e.g. '333'
rfm_code = ecommerce_df['rfm'].astype(str)

conditions = [
    # Top score on all three dimensions
    rfm_code == '333',
    # Recently purchased, high frequency or high monetary value
    rfm_code.isin(['332', '331', '323', '313']),
    # Recently purchased, low/medium frequency or monetary value
    rfm_code.isin(['321', '322', '311', '312']),
    # Purchased 3 to 6 months back, but high frequency and value
    rfm_code.isin(['233', '223', '232']),
    # Purchased 3 to 6 months back, not top frequency or value
    rfm_code.isin(['213', '212', '231', '211', '221', '222']),
    # Lost (1): no purchase in the latest 6 months, but high frequency or value
    rfm_code.isin(['132', '123', '113', '133', '131']),
]
labels = ['Super Stars', 'Approaching Stars', 'Aspiring Stars',
          'Low Engaged 1', 'Low Engaged 2', 'Lost 1']

# Lost (2): no recent purchase, and engagement was low when they did purchase
ecommerce_df['rfm_group'] = np.select(conditions, labels, default='Lost 2')

ecommerce_df['rfm_group'].value_counts().sort_index()
```

We have labelled these customers based on their transactional behaviour. You can give the segments more interesting names. Here is the distribution of these segments.

Improved visualizations of the segment distributions.
