Ramgopal Prajapat:

Learnings and Views

Search and Relevance Ranking – Simplified

By: Ram on Aug 20, 2022

In the previous blog, we discussed the role of relevance and ranking in ecommerce. Continuing the same journey, we will now discuss developing a model to arrive at a relevance ranking.

High Level Plan

  • Search in ECommerce
  • Relevance Ranking
  • Data Set for Relevance Ranking
  • Features for Ranking Model Development

Product Search

When a customer searches for a product on an ecommerce platform, the query is sent to retrieve the products that match the search terms. For example, if a customer searches with the query “men shoes casual puma price between 1000 and 2000”, the system breaks the query into several pieces of information, such as product category, brand, and price range, and then matches all of these conditions to retrieve the product list. This is a standard information retrieval model, and we will discuss the different information retrieval or search approaches in a future blog.
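To make the query breakdown concrete, here is a minimal sketch in Python. It is only an illustration and not how any particular platform parses queries: the brand, category, and gender vocabularies, the `parse_query` function, and the price pattern are all assumptions made for this example.

```python
import re

# Hypothetical vocabularies; a real system would load these from the catalogue.
KNOWN_BRANDS = {"puma", "nike", "adidas"}
KNOWN_CATEGORIES = {"shoes", "shirts", "jeans"}
KNOWN_GENDERS = {"men", "women", "kids"}

# Assumed phrasing for a price filter: "price between <min> and <max>".
PRICE_RANGE = re.compile(r"price between (\d+) and (\d+)")


def parse_query(query: str) -> dict:
    """Split a free-text query into structured filters (illustrative only)."""
    text = query.lower()
    filters = {"brand": None, "category": None, "gender": None,
               "price_min": None, "price_max": None, "keywords": []}

    # Pull out the price range first so its words are not treated as keywords.
    match = PRICE_RANGE.search(text)
    if match:
        filters["price_min"] = int(match.group(1))
        filters["price_max"] = int(match.group(2))
        text = PRICE_RANGE.sub(" ", text)

    # Assign each remaining token to the first vocabulary it belongs to.
    for token in text.split():
        if token in KNOWN_BRANDS:
            filters["brand"] = token
        elif token in KNOWN_CATEGORIES:
            filters["category"] = token
        elif token in KNOWN_GENDERS:
            filters["gender"] = token
        else:
            filters["keywords"].append(token)
    return filters


parsed = parse_query("men shoes casual puma price between 1000 and 2000")
```

The structured filters in `parsed` are what the retrieval step would then match against product attributes.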

 

Relevance Ranking

Once the retrieval model finds the matching products, the product relevance ranking model orders the retrieved products in the order of relevance.

Relevance ranking is a critical component of every modern information retrieval process. A family of machine learning algorithms called Learning to Rank (LTR) is used to solve the product ranking problem for ecommerce.

For a supervised relevance ranking model, we need labelled data and a set of features to develop the Learning to Rank (LTR) model. For each search query, we need to define the ranking of all the matching products. As discussed earlier, this can be done using either of two approaches: automated or manual.

Now, we will explain the structure of data required for developing the ranking model.

Data Structure for LTR Model

  1. Target Variable/Label

The relevance rank of each product id for a given search query.

  2. Group

The search query. Products are ranked with respect to each query; the same product may appear under several queries, and its rank can be very different in each.

  3. Features

All the information available as input for ranking the products for a search query. Features can be created from user click data (e.g., number of clicks, conversions, and orders), product attributes (e.g., price and brand), and customer feedback data (e.g., reviews and returns).
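The three pieces above (label, group, and features) can be sketched as plain Python lists. The rows, feature columns, and label values below are made up for illustration. The layout assumes that rows are kept contiguous per query and that the group is expressed as per-query row counts, which is the form accepted by rankers such as LightGBM's `LGBMRanker`; other libraries may expect a query id per row instead.

```python
from itertools import groupby

# Illustrative rows: (query, product_id, clicks, price, avg_review, relevance_label)
rows = [
    ("men casual shoes", "P1", 120, 1499.0, 4.2, 3),
    ("men casual shoes", "P2", 45, 1999.0, 3.9, 1),
    ("men casual shoes", "P3", 80, 1299.0, 4.5, 2),
    ("puma running shoes", "P1", 10, 1499.0, 4.2, 0),
    ("puma running shoes", "P4", 200, 2499.0, 4.6, 3),
]

# Rows must be contiguous per query so that group sizes line up with X and y.
rows.sort(key=lambda r: r[0])

X = [list(r[2:5]) for r in rows]   # features: clicks, price, average review
y = [r[5] for r in rows]           # target: graded relevance label
# Group: number of rows belonging to each query, in row order.
group = [len(list(g)) for _, g in groupby(rows, key=lambda r: r[0])]
```

Note that product `P1` appears under both queries with different labels, which is exactly why the ranking has to be defined per query.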

 


 

Features for Relevance Ranking Model

  1. Conversions - Products with higher conversions should rank nearer the top. A conversion is counted when the customer buys the product in the same session in which it was displayed.
  2. Clicks & CTR - When products are displayed on the product listing page after a search, a higher number of clicks on a product can be read as a sign of user interest or relevance. So, products with a higher click-through rate (CTR) can be given a higher relevance ranking.
  3. Returns - A product that is ordered and then returned after delivery is a problem, because returns add cost and create a bad customer experience. Products with higher return rates can be de-prioritized, so return information can be considered in the relevance ranking.
  4. Pricing & Discounts - Product price ranges can be considered as an input to the ranking. Prices may have a different impact across customer segments and platforms. For a platform like Amazon, it may be OK to show the products with higher discounts at the top. But on other platforms such as Tata CLiQ or Myntra, showing niche patterns or products may be more relevant to the customers.
  5. Sales & Discounts - During sale periods, products with higher discounts should be given priority.
  6. Out of Stock & Seller Cancellations - Products with fewer out-of-stock incidents and seller cancellations should be given higher priority. Otherwise, we create a bad customer experience and lose sales opportunities.
  7. Product Reviews - Products with positive reviews indicate customer liking and interest, so they should be given higher priority. For niche or fashion products, expert reviews may be incorporated, as customer reviews may not be available.
  8. Product Listing Completeness - Products with more complete information should be given higher importance. For a platform like Tata CLiQ, this may not be a significant factor, as almost all product listings will be similarly complete, but it can be explored. If it is used, we need to define how to measure this factor, as it may not be directly available.
  9. Brands - Some brands may perform better and can be given more importance than others. We can create an input mechanism to capture relative brand relevance for a period and consider it in the framework.
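As a rough sketch of how a few of these features could be derived, the Python snippet below computes CTR, conversion rate, and return rate from an event log. The log format, the event names, and the choice of denominators (conversions per impression, returns per order) are assumptions for this example; a real platform may define them differently.

```python
from collections import Counter

# Hypothetical event log: (query, product_id, event_type)
events = [
    ("men casual shoes", "P1", "impression"),
    ("men casual shoes", "P1", "click"),
    ("men casual shoes", "P1", "order"),
    ("men casual shoes", "P1", "impression"),
    ("men casual shoes", "P2", "impression"),
    ("men casual shoes", "P2", "impression"),
    ("men casual shoes", "P2", "click"),
    ("men casual shoes", "P2", "order"),
    ("men casual shoes", "P2", "return"),
    ("men casual shoes", "P1", "impression"),
    ("men casual shoes", "P1", "click"),
    ("men casual shoes", "P1", "impression"),
]

# Number of events per (query, product, event type); missing keys count as 0.
counts = Counter(events)


def safe_ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def build_features(query: str, product_id: str) -> dict:
    """Compute rate features for one (query, product) pair."""
    def n(event_type: str) -> int:
        return counts[(query, product_id, event_type)]

    return {
        "ctr": safe_ratio(n("click"), n("impression")),
        "conversion_rate": safe_ratio(n("order"), n("impression")),
        "return_rate": safe_ratio(n("return"), n("order")),
    }


p1 = build_features("men casual shoes", "P1")
p2 = build_features("men casual shoes", "P2")
```

Here `P2` converts better than `P1` but every order was returned, which is the kind of trade-off the ranking model has to weigh.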

Feature Engineering Dimensions

  • Window - Defining these performance metrics requires a fixed time window. This window could be 1 day, 1 week, 2 weeks, 1 month, or 3 months.
  • Trends/Momentum - Growth or decline should be given the required attention, so that products whose sales performance is picking up are given due priority.
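The two dimensions above can be sketched as follows. The daily order counts are made up, and measuring momentum as the relative change between the latest window and the window just before it is only one possible definition, chosen here for simplicity.

```python
from datetime import date, timedelta

# Hypothetical daily order counts for one product, 1 Aug 2022 to 14 Aug 2022.
order_counts = [2, 3, 1, 4, 2, 3, 5, 6, 7, 5, 8, 9, 7, 10]
daily_orders = {
    date(2022, 8, 1) + timedelta(days=i): n for i, n in enumerate(order_counts)
}


def window_total(series: dict, end: date, days: int) -> int:
    """Sum the series over the `days`-day window ending on `end` (inclusive)."""
    start = end - timedelta(days=days - 1)
    return sum(v for d, v in series.items() if start <= d <= end)


def momentum(series: dict, end: date, days: int) -> float:
    """Relative change of the latest window versus the window just before it."""
    current = window_total(series, end, days)
    previous = window_total(series, end - timedelta(days=days), days)
    return (current - previous) / previous if previous else 0.0


as_of = date(2022, 8, 14)
orders_7d = window_total(daily_orders, as_of, 7)    # 1-week window
orders_14d = window_total(daily_orders, as_of, 14)  # 2-week window
trend_7d = momentum(daily_orders, as_of, 7)         # week-over-week growth
```

Each window length and its trend can then be added as a separate feature column for the ranking model.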

Objective of Relevance Ranking

The objective of search is to show all the relevant results, that is, all the relevant products for a search query. Its performance is measured using Precision and Recall.

Precision for ecommerce product search is the proportion of retrieved products that are relevant to the customer.

Recall is the proportion of relevant products that are retrieved by the search query.
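A small worked example may help here. The product ids are made up, and the snippet assumes we already know which products are relevant for the query.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given lists of product ids."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant products actually shown
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


retrieved = ["P1", "P2", "P3", "P4"]  # products returned by the search
relevant = ["P1", "P3", "P5"]         # products judged relevant for the query

precision, recall = precision_recall(retrieved, relevant)
```

Two of the four retrieved products are relevant, so precision is 2/4, and two of the three relevant products were retrieved, so recall is 2/3.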

 

The key objective of relevance ranking is to show the most relevant products at the top. This is measured by:

  • Mean average precision (MAP)
  • Normalised discounted cumulative gain (NDCG)
  • Mean Reciprocal Rank (MRR)

We will describe these metrics in detail in subsequent blogs.
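Until then, here is a minimal sketch of the three metrics, computed on made-up relevance labels listed in the order the products were shown. Several conventions exist for these metrics; this sketch assumes an exponential gain (2^label - 1) with a log2 position discount for NDCG, treats any label above 0 as relevant for MAP and MRR, and assumes every relevant product appears in the ranked list.

```python
import math


def dcg(labels):
    """Discounted cumulative gain with exponential gain and log2 discount."""
    return sum((2 ** rel - 1) / math.log2(pos + 2) for pos, rel in enumerate(labels))


def ndcg_at_k(labels, k):
    """DCG of the top k results divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(labels, reverse=True)[:k])
    return dcg(labels[:k]) / ideal if ideal > 0 else 0.0


def reciprocal_rank(labels):
    """1 / position of the first relevant result, or 0.0 if there is none."""
    for pos, rel in enumerate(labels, start=1):
        if rel > 0:
            return 1.0 / pos
    return 0.0


def average_precision(labels):
    """Mean of precision@position taken at each relevant result."""
    hits, total = 0, 0.0
    for pos, rel in enumerate(labels, start=1):
        if rel > 0:
            hits += 1
            total += hits / pos
    return total / hits if hits else 0.0


# Relevance labels in displayed order, one list per query (illustrative data).
rankings = {
    "men casual shoes": [0, 3, 1],
    "puma running shoes": [2, 0, 0],
}

n_queries = len(rankings)
mean_ndcg = sum(ndcg_at_k(r, 3) for r in rankings.values()) / n_queries
mrr = sum(reciprocal_rank(r) for r in rankings.values()) / n_queries
mean_ap = sum(average_precision(r) for r in rankings.values()) / n_queries
```

All three metrics are computed per query and then averaged, which mirrors the grouping by search query used in the training data.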

Concluding Thoughts

In this blog, we discussed the approach to preparing a dataset for a Learning to Rank (LTR) model. Of course, personalization is an additional dimension that we have not discussed so far in this context. Business priority also plays an important role in ranking; e.g., the business may want to show some brands and higher-priced products higher up to improve business performance. In the next blog, we will discuss developing a relevance ranking model. The same construct can be used across scenarios such as:

  • Friend Suggestions on LinkedIn or Facebook
  • Ranking of Product Recommendations
  • Ranking Horses in a Racing Tournament

I am very keen to hear from you. How can you apply ranking algorithms in your context?

 

 

 
