Market Basket Analysis in R and Microsoft Azure

Market Basket Analysis helps retailers identify consumer trends and predict customer behavior. In this article, we discuss how we applied the technique to the restaurant industry using Microsoft Azure.

Market Basket Analysis (MBA) is widely used in retail and other industries to derive associations between products by analyzing how frequently they are bought together. The data is used to analyze trends and predict what customers are inclined to buy or like. From there, businesses can use MBA to improve their marketing strategy and focus on specific customer needs.

In our recent hackathon, we took up a business case to see how we could apply Market Basket Analysis with Microsoft Azure to predict patterns for a client in the restaurant industry. The client wants to offer promotions to regular customers, not only as a reward but also to introduce them to new options. The steps under consideration were:

  • Identify loyal customers

  • Find associations using Market Basket Analysis

  • Determine if the customers are buying the ‘extras’ that would be predicted

  • If not, offer a special deal or coupon

  • Check back to see if they converted

To read more about MBA, or to get an introduction to statistical terms used below, visit our blog post!

Goal

Smartbridge has already developed Market Basket Analysis capabilities using TIBCO Spotfire. We wanted to expand our technical skills using newer and more flexible techniques with R and Microsoft Azure, including Machine Learning Studio and Databricks. We also wanted to explore visualizations of the statistical output using Power BI and MicroStrategy.

Approach

The initial step was to get the data set required for the analysis and to set up the environment. For experimental purposes, and to keep the work general, we decided to start with a sample grocery data set.

We built the code in R to transform our data into a market basket format. There are two formats: basket and item. We used the basket format, in which each line of the data file represents one transaction, so no transaction number is needed. In item format, each line holds a single item, and the rows must be grouped by transaction.
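
To make the difference between the two formats concrete, here is a minimal sketch in base R that groups item-format rows into baskets. The data frame and its column names (transaction_id, item) are hypothetical and not taken from our data set. The arules package can also read both layouts directly with read.transactions, using format = "basket" or format = "single".

```r
# Item format: one row per purchased item, keyed by a transaction number.
# Hypothetical sample data, for illustration only.
item_rows <- data.frame(
  transaction_id = c(1, 1, 1, 2, 2, 3),
  item = c("milk", "bread", "butter", "beer", "chips", "milk"),
  stringsAsFactors = FALSE
)

# Group the rows by transaction to get one character vector per basket.
baskets <- split(item_rows$item, item_rows$transaction_id)

# Basket format: one comma-separated line per transaction, no ID needed.
basket_lines <- vapply(baskets, paste, character(1), collapse = ",")
print(unname(basket_lines))
```

Each element of basket_lines corresponds to one line of a basket-format file, which is the layout we fed into the analysis.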

Below are the steps for generating the association rules in R.

  • Install and load the arules package

  • Create a rule set called mba_arules using the apriori function

  • Set the minimum support to 0.001 and the minimum confidence to 0.75

  • Set the maximum length (number of items in one rule) to 3. Output that contained fewer items was easier to explain

Figure: Snapshot of the association rules with R

  • Display the top 10 association rules

Figure: Display association rules
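
The steps listed above can be sketched roughly as follows. This is an illustrative sketch, not our exact script: it assumes the Groceries sample data bundled with arules as a stand-in for our sample grocery data set, and it assumes the "top 10" rules are ranked by lift.

```r
# Assumes the arules package is already installed:
# install.packages("arules")
library(arules)

# Assumption: the Groceries data shipped with arules stands in for
# the sample grocery data set used in the hackathon.
data("Groceries")

# Mine association rules with the thresholds listed above.
mba_arules <- apriori(
  Groceries,
  parameter = list(
    support = 0.001,    # minimum support
    confidence = 0.75,  # minimum confidence
    maxlen = 3,         # at most 3 items per rule
    target = "rules"
  )
)

# Assumption: rank by lift, then display the 10 strongest rules.
top_rules <- head(sort(mba_arules, by = "lift"), 10)
inspect(top_rules)
```

Lowering the support or confidence thresholds yields more rules, while a smaller maxlen keeps each rule short enough to explain to a business audience.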

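The first rule in the screenshot has a lift of about 11: a customer who purchases liquor and red/blush wine is 11 times as likely to buy bottled beer as a customer in general. The rule's confidence was 90%, meaning that 90% of the transactions containing liquor and red/blush wine also contained bottled beer.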
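
For readers who want to see where figures like "11 times as likely" and the confidence come from, the sketch below computes support, confidence, and lift by hand in base R. The baskets are made-up toy data, and the helper names (support, rule_metrics) are hypothetical, so the resulting numbers do not match the screenshot.

```r
# Toy transactions (hypothetical data, for illustration only).
baskets <- list(
  c("liquor", "red wine", "bottled beer"),
  c("liquor", "red wine", "bottled beer"),
  c("liquor", "red wine"),
  c("bottled beer", "bread"),
  c("bread", "milk"),
  c("milk"),
  c("bread"),
  c("milk", "bread"),
  c("red wine"),
  c("liquor")
)

# Support: share of baskets that contain every item in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Metrics for the rule lhs => rhs.
rule_metrics <- function(lhs, rhs, baskets) {
  supp_rule <- support(c(lhs, rhs), baskets)
  conf <- supp_rule / support(lhs, baskets)  # P(rhs | lhs)
  lift <- conf / support(rhs, baskets)       # P(rhs | lhs) / P(rhs)
  c(support = supp_rule, confidence = conf, lift = lift)
}

m <- rule_metrics(c("liquor", "red wine"), "bottled beer", baskets)
print(round(m, 3))
```

A lift above 1 means the items on the left-hand side make the right-hand item more likely than it is across all transactions, and a lift of 1 means the two are independent.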

We also used Azure Databricks to run the R program. Below are the steps:

  • Create an Azure Databricks resource and a cluster

  • Create an R notebook and attach it to the cluster created in the previous step

  • Install the arules library from the CRAN library source

  • Execute the R script
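
As an alternative to installing the library on the cluster, the package can usually be installed from within an R notebook cell. The sketch below is one possible notebook-level equivalent, not the exact setup we used. It assumes the cluster has internet access to the CRAN cloud mirror.

```r
# Install arules from CRAN only if it is not already available.
# Assumption: the cluster can reach https://cloud.r-project.org.
if (!requireNamespace("arules", quietly = TRUE)) {
  install.packages("arules", repos = "https://cloud.r-project.org")
}

# Load the package so that apriori() and inspect() are available.
library(arules)
```

Packages installed this way last only for the current notebook session, so a cluster-level library is the more durable choice for repeated runs.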

Next Steps

Now that we have built the association rules for the grocery data, we plan to apply the same approach to restaurant data. Check out our follow-up blog post, which gives insight into the output using trends from Power BI and visualizations from a MicroStrategy dossier.
