Ramgopal Prajapat:

Learnings and Views

RFM based Transactional Segmentations for Ecommerce

By: Ram on Jun 18, 2022


Retail and Financial Services companies are early adopters of Analytics, Data Science and Machine Learning.

One of the key challenges faced by retailers is capturing customer data. Tesco launched a loyalty scheme, Clubcard, to link customer purchases, and rewarded customers with points for showing the card at checkout.

Ecommerce retailers have the advantage of being able to link customer purchases using registration information such as a mobile number or email ID.

Ecommerce retailers have a treasure trove of data about their customers. Some of the key data sources are:

  • Purchase – Purchase data such as order value, count, product category, and product price
  • Browsing – Clickstream data of customer clicks and events
  • Interactions – Customer contact data: calls, emails, social media, etc.
  • CRM – Outbound communications to customers via email, SMS, WhatsApp, etc., and customer responses
  • Operational Data – Order cancellations, returns, and order delivery journey data (packaging, delivery handover, etc.)
  • Inventory Data – Availability, stocking, and replenishment details
  • Products – Product features and tags


This data drives the AI engines that power Personalization, Recommendation, Search, and various other AI use cases.

One of the fundamental use cases for Ecommerce is Transactional Segmentation, which divides customers into smaller groups. Based on the specific attributes of these segments, more targeted offers, communications/messaging, and calls to action are developed.


Approach – RFM based Transactional Segmentation

Using transactional data, customers are divided into smaller groups or segments. This approach is also called Customer Behaviour Segmentation. In this segmentation scheme, the behaviour in question is the transactional or purchase behaviour of the customers.

Purchase behaviour is one of the most important behaviours in the ecommerce industry, as it is a true testimony of relevance and satisfaction for customers.

Some of the key behavioural segments identified by the transactional segmentations are:

  • Repeat and High Spenders
  • Occasional shoppers
  • Categorical Shoppers – only buying specific category products
  • Disengaged and lost

To find customers aligned to each of these behaviours, we need to prepare the features accordingly and then use relevant ML techniques to create the segments. We will start with one of the simplest and most commonly used approaches: RFM (Recency, Frequency and Monetary) based Segmentation. In the next blog, we will use the same data with the K-Means Clustering approach for behavioural segmentation.



We have a list of customers who have been purchasing across periods. First, we need to pick all customers who have purchased up to a given date (the cut-off date). If we take all customers who have ever purchased, many of them may already be lost, or their behaviour may not be relevant in the current context. So, we have considered only the customers who have at least one transaction (or order) in the last 12 months.

The available attributes are:
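The customer selection described above can be sketched in pandas. This is only an illustration: the column names, the sample rows, and the 1-Jun-22 cut-off are assumptions, and keeping the full order history of the selected customers (rather than only their last 12 months) is one possible reading of the step.

```python
import pandas as pd

# Hypothetical order-line data; column names are assumptions based on the attribute list.
orders = pd.DataFrame({
    "customer_id": ["C1", "C1", "C2", "C3"],
    "order_id": ["O1", "O2", "O3", "O4"],
    "product_id": ["P1", "P2", "P1", "P3"],
    "order_date": pd.to_datetime(["2021-03-10", "2022-05-20", "2020-11-02", "2021-09-15"]),
    "paid_amount": [500.0, 1200.0, 300.0, 800.0],
})

cutoff_date = pd.Timestamp("2022-06-01")               # assumed cut-off (reference) date
window_start = cutoff_date - pd.DateOffset(months=12)  # start of the 12-month window

# Keep only orders placed on or before the cut-off date.
history = orders[orders["order_date"] <= cutoff_date]

# Customers with at least one order inside the 12-month window.
in_window = history["order_date"] > window_start
active_ids = history.loc[in_window, "customer_id"].unique()

# Assumption: keep the full history (up to the cut-off) of those customers.
base = history[history["customer_id"].isin(active_ids)]
```

Here customer C2, whose only order is older than 12 months, drops out of the base population.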

  • Customer ID
  • Order ID
  • Product ID
  • Order Date
  • Paid Amount

Now, we will define the RFM Segmentation approach:

  • Recency (R): If a customer has bought recently, it indicates they are active on the ecommerce platform. This behaviour is measured using “Recency”, which tells us how recently customers have made their purchases.
  • Frequency (F): If a customer shops regularly, they have a high chance of shopping again. The repeat behaviour of a customer is measured using Frequency, which tells us how often customers have made their purchases.
  • Monetary (M): Another customer behaviour is high spend. Some customers may not purchase frequently, but when they do buy, they spend a high amount. Monetary tells us how much money customers have spent.


For calculating “Recency”, we find the number of days between the reference date and the most recent order date for a customer.

Below is a visual depiction of the steps to calculate the Recency value. We need to find the most recent order date and then find the number of days between this date and the reference date (e.g., 1-Jun-22).

[Figure: timeline illustrating the Recency calculation]
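The Recency calculation can be sketched as follows. The column names and sample dates are assumptions for illustration; the reference date follows the 1-Jun-22 example.

```python
import pandas as pd

# Hypothetical orders; column names are assumptions.
orders = pd.DataFrame({
    "customer_id": ["C1", "C1", "C2"],
    "order_date": pd.to_datetime(["2022-01-15", "2022-05-20", "2021-12-01"]),
})

reference_date = pd.Timestamp("2022-06-01")

# Step 1: most recent order date per customer.
last_order = orders.groupby("customer_id")["order_date"].max()

# Step 2: days between the reference date and that most recent order.
recency_days = (reference_date - last_order).dt.days.rename("recency_days")
```

A smaller value means a more recent purchase, so C1 (last order on 20-May-22) is more recent than C2.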

The distribution of customers by number of days since their most recent order is:

[Figure: distribution of customers by recency in days]


Now, we need to calculate the frequency of orders for each customer. This is the count of orders (distinct order IDs).
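A minimal sketch of the Frequency calculation, assuming one row per product within an order (so an order ID can repeat) and hypothetical column names:

```python
import pandas as pd

# Hypothetical order lines: order O1 contains two products.
orders = pd.DataFrame({
    "customer_id": ["C1", "C1", "C1", "C2"],
    "order_id": ["O1", "O1", "O2", "O3"],
    "product_id": ["P1", "P2", "P1", "P3"],
})

# Count distinct order IDs per customer, not rows, so multi-product
# orders are counted once.
frequency = orders.groupby("customer_id")["order_id"].nunique().rename("frequency")
```

Counting rows instead of distinct order IDs would overstate Frequency for customers who buy several products in one order.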

The distribution of customers across frequency buckets is as follows. We can see that a significantly higher volume of customers have purchased only once.

[Figure: distribution of customers by order frequency]


Monetary value is the total spend for a customer across orders and across the products on each order. We can calculate the distribution of the amount spent by customers. Typically, on an ecommerce platform we can see the MRP and the list price (selling price) for each product, and then an additional promotional discount is applied.


Selling Price

Price Paid by the customers

We can take the final amount paid for each order and then add it up across orders for each customer.
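As a sketch, the Monetary value can be computed in two steps. It assumes the paid amount is recorded per order line (after discounts) under a hypothetical `paid_amount` column.

```python
import pandas as pd

# Hypothetical order lines with the final amount paid per line.
orders = pd.DataFrame({
    "customer_id": ["C1", "C1", "C1", "C2"],
    "order_id": ["O1", "O1", "O2", "O3"],
    "paid_amount": [500.0, 250.0, 1200.0, 300.0],
})

# Step 1: final amount paid per order (sum over the products in the order).
order_totals = orders.groupby(["customer_id", "order_id"])["paid_amount"].sum()

# Step 2: add the order totals across orders for each customer.
monetary = order_totals.groupby(level="customer_id").sum().rename("monetary")
```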


[Figure: distribution of customers by total amount paid]


RFM Scores and Segmentation

If we have a large number of segments, it may be difficult to manage and take actions for each of them. For creating manageable segments, we create 3 levels for each of these dimensions – Recency, Frequency and Monetary values.

When creating these scores, we should pay attention to the % of the population in each bucket, logical and easy cut-off limits, and business considerations.

Recency Score

Recency is one of the most important factors, and we have decided to bucket it into less than 3 months, 3-6 months, and over 6 months.

[Figure: Recency score buckets]
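The Recency buckets can be turned into a score with `pd.cut`. In this sketch, 90 and 180 days stand in for 3 and 6 months, and a score of 3 for the most recent bucket is an assumed convention, not something stated in the buckets themselves.

```python
import pandas as pd

# Hypothetical recency values in days for four customers.
recency_days = pd.Series([12, 95, 182, 400],
                         index=["C1", "C2", "C3", "C4"],
                         name="recency_days")

# Bucket edges in days: (-1, 90] ~ under 3 months, (90, 180] ~ 3-6 months,
# (180, inf) ~ over 6 months. The day cut-offs are approximations.
# Assumed convention: higher score = more recent purchase.
recency_score = pd.cut(
    recency_days,
    bins=[-1, 90, 180, float("inf")],
    labels=[3, 2, 1],
).astype(int)
```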

Frequency Score

If we look at the distribution, most of the customers have purchased only once.  Just to ensure we have enough % in each of the buckets, we have created 3 buckets as below.


[Figure: Frequency score buckets]
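A sketch of a Frequency score together with a check of the % of customers in each bucket. The cut-offs used here (1 order, 2-3 orders, 4 or more orders) are purely hypothetical and are not the limits used in this analysis.

```python
import pandas as pd

# Hypothetical order counts; most customers have purchased only once.
frequency = pd.Series([1, 1, 1, 2, 3, 5],
                      index=["C1", "C2", "C3", "C4", "C5", "C6"],
                      name="frequency")

# Hypothetical cut-offs: 1 order -> 1, 2-3 orders -> 2, 4+ orders -> 3.
frequency_score = pd.cut(
    frequency,
    bins=[0, 1, 3, float("inf")],
    labels=[1, 2, 3],
).astype(int)

# Share of customers in each bucket, to check that no bucket is too thin.
bucket_share = frequency_score.value_counts(normalize=True).sort_index()
```

If one bucket holds only a tiny share of customers, the cut-offs can be adjusted and the check repeated.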

Monetary Score

Here too, we have tried to create 3 buckets such that the limits are easy to relate to and give a good distribution of customers across the buckets.


[Figure: Monetary score buckets]

Now, we need to summarize these segments and bring these segments to life for the business team.

[Figure: summary of the RFM segments]
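One possible way to summarize the segments is to combine the three scores into a single code and profile each code. With 3 levels on each of the 3 dimensions there can be at most 27 such codes. The column names, sample values, and choice of summary metrics below are assumptions for illustration.

```python
import pandas as pd

# Hypothetical customer-level table with the three scores already computed.
rfm = pd.DataFrame({
    "customer_id": ["C1", "C2", "C3", "C4"],
    "recency_score": [3, 3, 1, 1],
    "frequency_score": [3, 1, 1, 1],
    "monetary_score": [3, 2, 1, 1],
    "monetary": [9000.0, 2500.0, 400.0, 600.0],
})

# Concatenate the scores into a segment code such as "333" or "111".
rfm["rfm_segment"] = (
    rfm["recency_score"].astype(str)
    + rfm["frequency_score"].astype(str)
    + rfm["monetary_score"].astype(str)
)

# Profile each segment: size, share of customers, and average spend.
summary = (
    rfm.groupby("rfm_segment")
    .agg(customers=("customer_id", "count"), avg_spend=("monetary", "mean"))
    .assign(customer_share=lambda d: d["customers"] / d["customers"].sum())
    .sort_index(ascending=False)
)
```

Under the assumed scoring convention, "333" would correspond to recent, repeat, high-spending customers and "111" to disengaged or lost ones.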

We have created these segments, and now we need to define the engagement strategies for each of them. We need to consider the following dimensions:

  • What should the engagement theme be?
  • What type of offer construct will be relevant?
  • Which product categories can be promoted in the communication?
