Ramgopal Prajapat:

Learnings and Views

Step by Step Learning to Rank Model Development for Ecommerce

By: Ram on Sep 18, 2022

Summary

In this blog, we describe the application of Learning to Rank algorithms to ecommerce search. Learning to Rank algorithms are relevant to several ecommerce scenarios, e.g., ranking products for search, ranking product recommendations, or showing the most relevant products on category or brand pages. We cover the requirements, data, and methodology.

High Level Plan

  • Context and Problem Statement
  • Target Variable Definition
  • Data & Features
  • Learning To Rank Algorithm & ML Model
  • Model Validation and Deployment
  • Conclusion

 

Context

Search is one of the most important customer journeys on an ecommerce platform; typically, 30-40% of sales are initiated from the search journey. When a customer searches for a product, thousands of products may match the search query terms, so the list must be prioritized further based on customer interests. The objective of this prioritization and ranking is to improve business metrics such as Click Through Rate (CTR) and Conversion Rate (CR).

Business Objective – improve CTR for the search results

Machine Learning Objective – Develop a machine learning model using Learning to Rank algorithms that delivers the business results and shows improvement on ranking metrics such as MRR and NDCG.

Approach

As with any other machine learning model, we need to define the target variable (i.e., the label), specify the data requirements, and develop the model architecture components.

  1. Target or Label Variable Definitions

Defining the label data is one of the biggest challenges in a Learning to Rank scenario, as no existing label data is available. In this scenario, Sales Value, Order Counts and CTR can be used as input metrics for defining the product ranking. Depending on the business focus, a few additional factors, such as conversion rate, can also be considered. Once we have these input factors, we assign a weight to each of them based on business context or judgement. The weighted combination gives us a ranking of the products for each search term.
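The weighting described above can be sketched as follows. This is a minimal illustration, not the exact method used for this model: the column names, the weights, and the use of within-term percentile ranks to put the three metrics on a common scale are all assumptions.

```python
import pandas as pd

# Hypothetical weights; in practice they come from business judgement.
WEIGHTS = {"sales_value": 0.5, "order_count": 0.3, "ctr": 0.2}

def build_target(perf: pd.DataFrame) -> pd.DataFrame:
    """Add a weighted score and a target rank (1 = best) per search term."""
    out = perf.copy()
    by_term = out.groupby("searched_term")
    # Percentile rank within each search term puts every metric on a 0-1 scale.
    out["score"] = sum(w * by_term[col].rank(pct=True) for col, w in WEIGHTS.items())
    # Rank products inside each search term by descending score.
    out["target_rank"] = (
        out.groupby("searched_term")["score"]
        .rank(ascending=False, method="first")
        .astype(int)
    )
    return out

# Toy performance-period data (made-up numbers, for illustration only).
perf = pd.DataFrame({
    "searched_term": ["shoes", "shoes", "shoes", "tv", "tv"],
    "product_id": ["a", "b", "c", "d", "e"],
    "sales_value": [100.0, 300.0, 200.0, 50.0, 500.0],
    "order_count": [1, 3, 2, 5, 1],
    "ctr": [0.10, 0.30, 0.20, 0.05, 0.08],
})
target = build_target(perf)
```

If the result is fed to a ranker that expects larger labels to mean more relevant (as XGBoost does), the ranks need to be inverted or converted to relevance grades first.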

 

  2. Performance and Observation Periods

The Performance Period is the window over which performance is defined and measured. The factors and weights for the target variable were defined in the step above; depending on the context, the data for these factors can be taken over a different number of days or weeks. For our scenario, we used a 1-week period for calculating CTR, Sales Value, and Order Counts.

 

The Observation Period is the number of days or weeks over which all the features used in the model are calculated.

For the feature calculations, we used the latest 90 days of transactions and the latest 30 days of clicks data.
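The two kinds of window can be made concrete with a small helper. This is only a sketch under one assumption that the text does not state: the observation windows end where the performance week begins, so that the features never see the period used to build the label. The function name and the snapshot date are hypothetical.

```python
from datetime import date, timedelta

def period_windows(snapshot: date) -> dict:
    """Return (start, end) date pairs for each window; the end date is exclusive."""
    # 1-week performance period ending at the snapshot date.
    perf_start = snapshot - timedelta(days=7)
    return {
        "performance": (perf_start, snapshot),
        # Assumption: feature windows stop where the performance week starts.
        "transactions": (perf_start - timedelta(days=90), perf_start),
        "clicks": (perf_start - timedelta(days=30), perf_start),
    }

windows = period_windows(date(2022, 9, 18))
```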

 

  3. Data and Features

The model is developed at the product level for each search query. Data on sales (e.g., value or order counts), clicks (e.g., views, add to cart) and product features (e.g., brand, colour) at the search term and product level is used. Sales data for the last 90 days and click behaviour for the latest 30 days are used to create the features. Some additional features were also created as ratios.

 

Also, because the features are measured on different scales, we normalized them using a min-max scaler. Some search queries are more frequent than others, and the products associated with them will have larger values, so the features can be scaled/normalized within each search term separately.
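Scaling within each search term could look like the sketch below. It uses a pandas group-wise transform rather than a fitted scaler object, which is an assumption about the implementation; the function name and column names are hypothetical.

```python
import pandas as pd

def minmax_within_term(df, feature_cols, key="searched_term"):
    """Min-max scale each feature to [0, 1] separately inside every search term."""
    out = df.copy()
    grouped = out.groupby(key)[feature_cols]
    lo = grouped.transform("min")
    hi = grouped.transform("max")
    # A feature that is constant within a term has zero range; map it to 0.
    span = (hi - lo).replace(0, 1)
    out[feature_cols] = (out[feature_cols] - lo) / span
    return out

# Toy data (made-up numbers, for illustration only).
features = pd.DataFrame({
    "searched_term": ["shoes", "shoes", "shoes", "tv", "tv"],
    "sales_value": [100.0, 300.0, 200.0, 50.0, 500.0],
    "price_ratio": [2.0, 2.0, 2.0, 1.0, 3.0],
})
scaled = minmax_within_term(features, ["sales_value", "price_ratio"])
```

With this approach a product's scaled value expresses its position relative to the other products returned for the same query, not relative to the whole catalogue.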

 

  4. Training and Test Samples

This is an interesting scenario in which you cannot use row-level random sampling to create the training and test samples. Because the ranking of the products is tied to a search query, all the products for a query must stay in the same sample. Hence, we sampled at the query level and used 80% of the search queries for model training, keeping the rest for testing.
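A query-level split can be sketched as below. The function name, the seed, and the column names are assumptions made for illustration; the only idea taken from the text is that whole search queries, not individual rows, are assigned to the training sample.

```python
import numpy as np
import pandas as pd

def split_by_query(df, key="searched_term", train_share=0.8, seed=42):
    """Split rows so that every search query lands wholly in train or in test."""
    queries = df[key].unique()
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(queries)  # shuffled copy of the query list
    n_train = int(round(train_share * len(shuffled)))
    in_train = df[key].isin(shuffled[:n_train])
    return df[in_train], df[~in_train]

# Toy data: 10 queries with 3 products each.
data = pd.DataFrame({
    "searched_term": [f"q{i}" for i in range(10) for _ in range(3)],
    "product_id": range(30),
})
train_df, test_df = split_by_query(data)
```

scikit-learn's GroupShuffleSplit offers the same kind of group-aware split if that library is already in use.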

 

  5. Learning to Rank (LETOR) Algorithms

In regression and classification, the focus is on predicting the outcome for a single data point. In this scenario, we need to predict the ranks of a set of products for a given search query. So, it is an example of ranking a list of items based on their attributes or features.

 

The XGBoost algorithm can also be used for ranking scenarios. Learning to Rank objective functions are commonly grouped into three families: Pointwise, Pairwise, and Listwise. XGBoost provides ranking objectives such as rank:pairwise, rank:ndcg, and rank:map. The technical details of these can be searched, or we will provide them in a future blog.

 

For this scenario, we used the Pairwise objective function. For a given query, it first predicts the relative order of a pair of products and then repeats the process across all the pairs before finalising the ranks of all the products for the search query.
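To make the pairwise idea concrete, the sketch below enumerates the preference pairs inside one query and measures how many of them a set of model scores orders correctly. It illustrates the general pairwise principle only and is not XGBoost's internal implementation; the function names are hypothetical.

```python
from itertools import combinations

def preference_pairs(labels):
    """Index pairs (i, j) where item i has a higher relevance label than item j."""
    pairs = []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] > labels[j]:
            pairs.append((i, j))
        elif labels[j] > labels[i]:
            pairs.append((j, i))
        # Items with equal labels express no preference and form no pair.
    return pairs

def pairwise_accuracy(labels, scores):
    """Share of preference pairs whose preferred item gets the higher score."""
    pairs = preference_pairs(labels)
    if not pairs:
        return 1.0  # nothing to get wrong
    correct = sum(scores[i] > scores[j] for i, j in pairs)
    return correct / len(pairs)

# One query with three products: relevance labels and model scores.
labels = [2, 0, 1]
scores = [0.9, 0.4, 0.1]
pairs = preference_pairs(labels)
accuracy = pairwise_accuracy(labels, scores)
```

Pairs are only ever formed between products of the same search query, which is why the model needs to know where each query's rows begin and end.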

 

The search term groups are created first; the group sizes are passed to the model as an additional parameter alongside the features and labels.

 

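import xgboost as xgb

# One entry per search term: the number of rows (products) in that group.
# df is the training data, sorted by 'searched_term' so that each group is a
# contiguous block of rows in the same order as X_train and y_train.
groups = df.groupby('searched_term').size().to_numpy()

model2 = xgb.XGBRanker(
    booster='gbtree',
    objective='rank:pairwise',
    random_state=42,
    learning_rate=0.1,  # 'eta' is an alias of learning_rate, so eta=0.05 is dropped
    colsample_bytree=0.9,
    max_depth=6,
    n_estimators=110,
    subsample=0.75
)

# group tells the ranker where one search term's rows end and the next begin
model2.fit(X_train, y_train, group=groups, verbose=True)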

 

  6. Model Performance Metrics

Model performance was evaluated using Normalized Discounted Cumulative Gain (NDCG) and Mean Reciprocal Rank (MRR), in addition to Precision and Recall.
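For reference, both ranking metrics can be computed in a few lines. This sketch assumes the common exponential-gain form of DCG, (2^rel - 1) / log2(position + 1); other gain definitions exist and the text does not say which one was used. The function names are hypothetical.

```python
import math

def dcg(relevances, k=None):
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    rels = relevances[:k] if k is not None else relevances
    # enumerate() is 0-based, so position p contributes a discount of log2(p + 2).
    return sum((2 ** rel - 1) / math.log2(pos + 2) for pos, rel in enumerate(rels))

def ndcg(relevances, k=None):
    """DCG of the predicted order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

def mean_reciprocal_rank(queries):
    """Mean of 1 / (position of the first relevant item); 0 if a query has none."""
    total = 0.0
    for flags in queries:
        rank = next((pos for pos, hit in enumerate(flags, start=1) if hit), None)
        total += 1.0 / rank if rank else 0.0
    return total / len(queries)

# Relevance labels in the order the model ranked the products.
perfect = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
# Binary relevance flags per query, again in predicted order.
mrr = mean_reciprocal_rank([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
```

NDCG is computed per search query and then averaged across queries, so each query counts equally regardless of how many products it returns.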

We have also checked the list of important features along with their importance scores.

 

The most relevant features were sales value and order count in the latest 15 days, along with a few ratio features.

 

NDCG was around 96% on both the training and test samples, which is good.

 

  7. Deployment Approach and A/B Testing

We wanted to test the newly developed Learning to Rank model in a real scenario. We built an end-to-end data pipeline in production before deploying the model. This pipeline runs every day and prepares the data in the shape and format the model requires.

 

 

Once a version of the model was deployed, we validated its effectiveness using an A/B testing framework.

 

To test the model on the live system, 5% of the traffic can be diverted to the new model-based product lists for the search journey, with the remaining 95% of users staying on the existing product lists. Based on one week of data and analysis, a decision can be taken to increase the traffic. The traffic can then be diverted to the new model gradually and in a calibrated way over a period.
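One common way to implement such a traffic split is deterministic hashing of a user identifier, sketched below. This is an assumption about how the split might be done, not a description of the A/B framework used here; the function name, the salt, and the variant labels are hypothetical.

```python
import hashlib

def assign_variant(user_id: str, treatment_share: float = 0.05,
                   salt: str = "ltr-search-v1") -> str:
    """Deterministically map a user to the new model or the existing ranking."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # The first 8 hex digits give an integer in [0, 2**32); scale it to [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "ltr_model" if bucket < treatment_share else "existing"

# Simulated users: roughly 5% should land on the new model.
assignments = [assign_variant(f"user-{i}") for i in range(20000)]
observed_share = assignments.count("ltr_model") / len(assignments)
```

Because the bucket depends only on the salt and the user id, a user sees the same variant on every visit, and raising `treatment_share` keeps every already-exposed user in the treatment group as the rollout widens.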

 

Conclusions

When a structured approach is used to develop a machine learning model for search query-based product ranking, it delivers business impact. Typically, we can see an improvement of 10-15% in the business and ML metrics by deploying the ML models.

Some of the key learnings

  • Defining the target variable appropriately is one of the key steps. We observed data anomalies, and hence data treatments played an important role.
  • As always, feature engineering is important. We created a few ratios and other features that played a critical role in improving the last leg of performance.
  • For some of the queries, there was not enough data, and performance was not up to the mark. This requires further calibration and enhancement.
  • And last but not least, using an ML model is a journey: the first version may not be perfect, but it is a great starting point.

 

We hope this scenario was useful in understanding the application of a Learning to Rank model in real life. Please share your comments and questions.
