Implementing Association Mining with Weka and R

Parteek Bhatia

doi:10.1017/9781108635592.011

Chapter Objectives

✓ To demonstrate the use of the association mining algorithm

✓ To apply association mining on numeric data

✓ To comprehend the use of class association rules

✓ To compare the decision tree classifier with association mining

✓ To conduct association mining with R language

Association Mining with Weka

Let us consider the ‘to-play-or-not-to-play’ dataset given in Figure 10.1 to get hands-on experience with association mining in Weka. This dataset ships with Weka as a default dataset in the data folder, under the file name weather.nominal.arff.

This dataset has four attributes describing weather conditions and a fifth, the class attribute, which indicates whether play was held under the weather conditions of that day. There are 14 instances, or samples, in this dataset.

It is important to note that in classification, we are interested in assigning the output attribute to play or no play. In association mining, however, we are interested in finding association rules based on the associations among all the attributes that occur together. Thus, in association mining the class attribute is given no special role.

If we compare this dataset with the transaction dataset discussed in the last chapter for market basket analysis, we can find equivalents of the transaction id and of the data items purchased in that transaction.

Here, the instance numbers 1 to 14 act as transaction ids, and the attribute values in the row for a given instance act as the data items for that instance. We are interested in finding associations by observing facts such as Outlook = sunny AND Temperature = hot occurring together more often than Outlook = sunny AND Temperature = cool, as shown in Figure 10.2.
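The mapping from instances to transactions can be sketched in a few lines of Python. This is only an illustration, not part of Weka: the rows are transcribed from the standard weather.nominal.arff file (check them against your own copy), and the names `to_transaction` and `count_together` are hypothetical helpers introduced here.

```python
# Attribute names as they appear in weather.nominal.arff.
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy", "play"]

# Assumption: rows transcribed from the standard weather.nominal.arff;
# the list position (1 to 14) plays the role of the transaction id.
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def to_transaction(row):
    """Turn one instance into a set of 'attribute=value' items."""
    return frozenset(f"{name}={value}" for name, value in zip(ATTRIBUTES, row))


transactions = [to_transaction(row) for row in ROWS]


def count_together(*items):
    """Count the transactions that contain every one of the given items."""
    wanted = set(items)
    return sum(1 for t in transactions if wanted <= t)


sunny_hot = count_together("outlook=sunny", "temperature=hot")
sunny_cool = count_together("outlook=sunny", "temperature=cool")
print(sunny_hot, sunny_cool)
```

On these rows the script prints `2 1`: sunny and hot occur together in two instances, sunny and cool in only one. Note that `play` becomes an item like any other, which matches the way association mining treats the class attribute.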

Weka contains an Associate tab, which is used to apply different association algorithms to find association rules in datasets. One such algorithm is the Predictive Apriori association algorithm, which optimally combines support and confidence to calculate a single value called predictive accuracy, as depicted in Figure 10.3.

The user only needs to specify how many rules they would like the algorithm to generate, and the algorithm takes care of optimizing support and confidence to find the best rules.
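To make the two quantities that Predictive Apriori combines more concrete, the sketch below computes support and confidence for two example rules on the weather data. It does not reproduce Weka's predictive accuracy calculation; it only shows the inputs. The rows are transcribed from the standard weather.nominal.arff file (an assumption to verify against your copy), and the helper names `support` and `confidence` as well as the two rules are chosen here purely for illustration.

```python
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy", "play"]

# Assumption: contents of the standard weather.nominal.arff, one instance per line.
DATA = """\
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
rainy,mild,normal,FALSE,yes
sunny,mild,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes
rainy,mild,high,TRUE,no
"""

# Each instance becomes a set of 'attribute=value' items.
transactions = [
    frozenset(f"{name}={value}" for name, value in zip(ATTRIBUTES, line.split(",")))
    for line in DATA.strip().splitlines()
]


def support(items):
    """Fraction of transactions that contain all of the given items."""
    wanted = set(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)


def confidence(antecedent, consequent):
    """Support of the whole rule divided by support of its antecedent."""
    base = support(antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(set(antecedent) | set(consequent)) / base


# Two hypothetical example rules: (antecedent, consequent).
rules = [
    ({"outlook=overcast"}, {"play=yes"}),
    ({"humidity=high"}, {"play=no"}),
]

for antecedent, consequent in rules:
    s = support(antecedent | consequent)
    c = confidence(antecedent, consequent)
    print(sorted(antecedent), "=>", sorted(consequent),
          f"support={s:.3f} confidence={c:.3f}")
```

Both rules cover 4 of the 14 instances, so they have the same support, yet their confidence differs (4 of 4 versus 4 of 7). A single combined score is useful precisely because neither number alone ranks rules well.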
