The main objective of this analysis is to identify products that customers want to purchase, enabling sales and marketing teams to develop more effective product placement, pricing, cross-sell, and up-sell strategies.

Wamuza1/Market_Basket_Analysis

Market_Basket_Analysis

Python

Theory of Apriori Algorithm

Library image

Apriori is a popular algorithm [1] for extracting frequent itemsets, with applications in association rule learning. Three measures are central to an Apriori-based analysis:

  • Support
  • Confidence
  • Lift

**Support is generated by the Apriori algorithm; confidence and lift are generated by association_rules.**
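To make the idea of "extracting frequent itemsets" concrete, here is a minimal pure-Python sketch of the level-wise Apriori search. It is an illustration only, not the library code used in this repository: the function name `apriori_frequent_itemsets`, the `min_support` threshold, and the five sample baskets are all assumptions made for the example.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Share of baskets that contain every item of the itemset.
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items.
    current = {}
    for item in {i for b in baskets for i in b}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune step: a candidate survives only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical sample baskets, not taken from the repository's data.
sample_baskets = [
    {"burger", "ketchup", "fries"},
    {"burger", "ketchup"},
    {"burger", "fries"},
    {"ketchup"},
    {"burger", "ketchup", "soda"},
]
frequent = apriori_frequent_itemsets(sample_baskets, min_support=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The prune step is what gives Apriori its name: an itemset can only be frequent if every one of its subsets is frequent, so most candidates are discarded before their support is ever counted.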

Suppose we have a record of 1,000 customer transactions and want to find the support, confidence, and lift for two items, e.g. burgers and ketchup. Of the 1,000 transactions, 100 contain ketchup and 150 contain a burger. Of the 150 transactions in which a burger is purchased, 50 also contain ketchup.

Support

Support refers to the baseline popularity of an item. It is calculated as the number of transactions containing that item divided by the total number of transactions. For an item B:

$$\text{Support}(B) = \frac{\text{Transactions containing } B}{\text{Total transactions}}$$

For instance, if 100 out of 1,000 transactions contain ketchup, the support for ketchup is:

$$\text{Support}(\text{Ketchup}) = \frac{100}{1000} = 10\%$$
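The same calculation can be sketched in Python. The basket list below is synthetic: it is built only to match the counts in the example (50 baskets with both items, 100 with a burger only, 50 with ketchup only), and the `"other"` item is an assumed placeholder for the remaining 800 transactions.

```python
from fractions import Fraction

# Synthetic baskets reproducing the example counts (an assumption, not real data).
transactions = (
    [{"burger", "ketchup"}] * 50
    + [{"burger"}] * 100
    + [{"ketchup"}] * 50
    + [{"other"}] * 800
)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return Fraction(hits, len(transactions))


ketchup_support = support({"ketchup"}, transactions)
burger_support = support({"burger"}, transactions)
print(float(ketchup_support))  # 0.1
print(float(burger_support))   # 0.15
```

`Fraction` is used here only so that the ratios stay exact; a plain division would work just as well for reporting.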

Confidence

Confidence refers to the likelihood that item B is also bought when item A is bought. It is calculated as the number of transactions in which A and B are bought together, divided by the number of transactions in which A is bought:

$$\text{Confidence}(A \rightarrow B) = \frac{\text{Transactions containing both } A \text{ and } B}{\text{Transactions containing } A}$$

Coming back to our problem, 50 transactions contain both a burger and ketchup, and 150 transactions contain a burger. The likelihood of buying ketchup when a burger is bought is the confidence of Burger -> Ketchup:

$$\text{Confidence}(\text{Burger} \rightarrow \text{Ketchup}) = \frac{50}{150} = 33.3\%$$
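As a quick check of the arithmetic, the following sketch computes confidence directly from the counts in the example. The helper name `confidence` is hypothetical. It also shows that confidence depends on the direction of the rule: Ketchup -> Burger uses the 100 ketchup transactions as its denominator.

```python
from fractions import Fraction


def confidence(count_a_and_b, count_a):
    """Confidence(A -> B) = transactions with both A and B / transactions with A."""
    if count_a == 0:
        raise ValueError("A never occurs, so Confidence(A -> B) is undefined")
    return Fraction(count_a_and_b, count_a)


burger_to_ketchup = confidence(50, 150)  # 50 baskets with both, 150 with a burger
ketchup_to_burger = confidence(50, 100)  # 50 baskets with both, 100 with ketchup

print(f"{float(burger_to_ketchup):.1%}")  # 33.3%
print(f"{float(ketchup_to_burger):.1%}")  # 50.0%
```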

Lift

Lift(A -> B) measures how much more likely B is to be sold when A is sold, compared with how often B is sold overall. It is calculated by dividing Confidence(A -> B) by Support(B):

$$\text{Lift}(A \rightarrow B) = \frac{\text{Confidence}(A \rightarrow B)}{\text{Support}(B)}$$

Coming back to our burger and ketchup problem, Lift(Burger -> Ketchup) is:

$$\text{Lift}(\text{Burger} \rightarrow \text{Ketchup}) = \frac{33.3\%}{10\%} = 3.33$$

Lift tells us that a customer who buys a burger is 3.33 times as likely to buy ketchup as a customer picked at random. A lift of 1 means there is no association between products A and B. A lift greater than 1 means A and B are bought together more often than would be expected if they were independent. A lift less than 1 means they are bought together less often than expected.
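The lift calculation and its interpretation can be sketched as follows, again using only the counts from the example. The function names `lift` and `interpret` and the wording of the returned labels are assumptions made for illustration.

```python
from fractions import Fraction


def lift(count_a_and_b, count_a, count_b, total):
    """Lift(A -> B) = Confidence(A -> B) / Support(B)."""
    confidence_ab = Fraction(count_a_and_b, count_a)
    support_b = Fraction(count_b, total)
    return confidence_ab / support_b


def interpret(value):
    """Map a lift value to the three cases described in the text."""
    if value > 1:
        return "bought together more often than expected"
    if value < 1:
        return "bought together less often than expected"
    return "no association"


# 50 baskets with both items, 150 with a burger, 100 with ketchup, 1,000 in total.
burger_ketchup_lift = lift(50, 150, 100, 1000)
print(round(float(burger_ketchup_lift), 2), "->", interpret(burger_ketchup_lift))
```

Unlike confidence, lift is symmetric: swapping the roles of burger and ketchup (`lift(50, 100, 150, 1000)`) gives the same value.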

association_rules

image

Number of rules

image

R

Number of observations and variables.

image

Data structure.

image

Package ‘arules’ Documentation:

Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) doi:10.18637/jss.v014.i15.

https://cran.r-project.org/web/packages/arules/arules.pdf
