Ramgopal Prajapat:

Learnings and Views

Apriori algorithm for Market Basket Analysis Explained with Example

By: Ram on Nov 11, 2022

Market Basket Analysis is a common Machine Learning Application in Retail and Ecommerce. In the previous blog, we have explained the applications and role of Market Basket Analysis (MBA.

In this blog, our plan is to explain Apriori Algorithm steps with an example.  The algorithm has two steps 1) Candidate Generation and 2) Pruning or Rejecting candidates. The candidates here are the list of valid combinations.

For a retail scenario, the focus is to find the combinations of products in a basket or the products those are bought together. And the candidates are nothing but product combinations.

Scenario

Below is the list of 5 transactions or orders and we will use this as an example to show the steps of Apriori Algorithm.

As a process, first identify the candidates and the candidate can be 1 product at a time, 2 product combinations, 3 product combinations and so on.

1 Product Candidates

List of products purchased across these transactions

Single Item Candidates

Eggs

Coke

Beer

Bread

Milk

Diapers

 

Scan all the transactions and find the list of products purchased and then scan to find the frequency/count (Support) of these products.

Now, we apply a limit on minimum support (or count) required for any candidate and assume it to be 3.  Only 4 products are meeting the support 3 requited, so the pruned list of candidates are only 4 and this list will be used for identifying next set of combinations.

2 Product Combinations

Based on Apriori Principle, only Frequent itemsets are generated that satisfy support condition. Or all in the infrequent itemsets are ignored. Based on the 1 candidate only 4 products – Beer, Bread, Milk and Diapers satisfy minimum support of 3. So, 2 product combinations are generated for these 4 products only.

2 Item Candidates

Beer, Bread

Beer, Milk

Beer, Diapers

Bread, Milk

Bread, Diapers

Milk, Diapers

 

Now, the transactions are scanned to find the support for these 4C2 (6) combinations.

Again, same rule is applied to prune the candidates based on the minimum support required for the candidates.

Only products in these 4 candidate combinations are used for candidate generation for 3 product combinations. And the process continues.

 

Effectiveness of the Apriori Algorithm

If we do not used the apriori principal, we will consider all the combinations are each step and the total candidates will be for the 6 products are

As a result of Apriori Principal based Pruning, we will have following candidates.

As a result, a net 68% reduction in the number of candidates itemset even in this simple example

References

https://www-users.cse.umn.edu/~kumar001/dmbook/ch5_association_analysis.pdf

Leave a comment