Book contents
- Frontmatter
- Dedication
- Contents
- List of Figures
- List of Tables
- Preface
- Acknowledgments
- 1 Beginning with Machine Learning
- 2 Introduction to Data Mining
- 3 Beginning with Weka and R Language
- 4 Data Preprocessing
- 5 Classification
- 6 Implementing Classification in Weka and R
- 7 Cluster Analysis
- 8 Implementing Clustering with Weka and R
- 9 Association Mining
- 10 Implementing Association Mining with Weka and R
- 11 Web Mining and Search Engines
- 12 Data Warehouse
- 13 Data Warehouse Schema
- 14 Online Analytical Processing
- 15 Big Data and NoSQL
- Index
- Colour Plates
10 - Implementing Association Mining with Weka and R
Published online by Cambridge University Press: 26 April 2019
Summary
Chapter Objectives
✓ To demonstrate the use of the association mining algorithm
✓ To apply association mining on numeric data
✓ To comprehend the use of class association rules
✓ To compare the decision tree classifier with association mining
✓ To conduct association mining with the R language
Association Mining with Weka
Let us consider the ‘to-play-or-not-to-play’ dataset given in Figure 10.1 to get hands-on experience with association mining in Weka. This dataset ships with Weka as a default dataset and is found in its data folder under the file name weather.nominal.arff.
This dataset has four attributes describing weather conditions and a fifth, the class attribute, which indicates whether Play was held under the weather conditions of that day. There are 14 instances, or samples, in this dataset.
It is important to note that in classification we are interested in assigning the output attribute to play or no play. In association mining, by contrast, we are interested in finding association rules based on the associations between all the attributes that occur together. Thus, in association mining the class attribute is not singled out as the target.
Comparing this dataset with the transaction dataset used for market basket analysis in the last chapter reveals an equivalence: each instance corresponds to a transaction id, and its attribute values correspond to the data items purchased in that transaction.
Here, instance numbers 1 to 14 act as transaction ids, and the attribute values in each row act as the data items for that instance. We are interested in finding associations by observing facts such as Outlook = sunny AND Temperature = hot occurring together more often than Outlook = sunny AND Temperature = cool, as shown in Figure 10.2.
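The mapping from instances to transactions can be sketched outside Weka in a few lines of Python. This is only an illustration, not part of the Weka workflow: the rows are reproduced from the standard weather.nominal.arff file and are assumed to match Figure 10.1, and the names `to_transaction` and `support_count` are hypothetical helpers introduced here.

```python
# Rows assumed to match the standard weather.nominal.arff shipped with Weka;
# check them against the file in Weka's data folder before relying on them.
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy", "play"]
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def to_transaction(row):
    """Turn one instance into a set of 'attribute=value' items."""
    return frozenset(f"{name}={value}" for name, value in zip(ATTRIBUTES, row))


# Each instance (No. 1 to 14) becomes one transaction.
transactions = [to_transaction(row) for row in ROWS]


def support_count(*items):
    """Number of transactions that contain every one of the given items."""
    wanted = set(items)
    return sum(1 for t in transactions if wanted <= t)


print(support_count("outlook=sunny", "temperature=hot"))   # 2
print(support_count("outlook=sunny", "temperature=cool"))  # 1
```

With these rows, sunny and hot occur together in two instances while sunny and cool occur together in only one, which is the kind of difference association mining picks up.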
Weka contains an Associate tab, which offers several association algorithms for finding association rules in datasets. One such algorithm is Predictive Apriori, which optimally combines support and confidence into a single value called predictive accuracy, as depicted in Figure 10.3.
The user only needs to specify how many rules they would like the algorithm to generate, and the algorithm takes care of optimizing support and confidence to find the best rules.
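The idea of asking for a fixed number of rules can be illustrated with a small Python sketch. It is not Predictive Apriori and does not compute predictive accuracy: as a stand-in, it ranks single-antecedent rules by confidence and then by support count. The rows are assumed to match the standard weather.nominal.arff file, and `top_rules` and its `min_support_count` parameter are hypothetical names introduced for this example.

```python
# Rows assumed to match the standard weather.nominal.arff shipped with Weka.
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy", "play"]
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]

transactions = [
    frozenset(f"{name}={value}" for name, value in zip(ATTRIBUTES, row))
    for row in ROWS
]


def support_count(items):
    """Number of transactions that contain every item in `items`."""
    return sum(1 for t in transactions if items <= t)


def top_rules(n, min_support_count=2):
    """Return the n best rules 'lhs => rhs' as (confidence, support, lhs, rhs).

    Ranking is by confidence, then support count: a simple stand-in,
    NOT the predictive accuracy measure used by Predictive Apriori.
    """
    items = sorted(set().union(*transactions))
    rules = []
    for lhs in items:
        for rhs in items:
            # Two values of the same attribute never occur together.
            if lhs.split("=")[0] == rhs.split("=")[0]:
                continue
            both = support_count({lhs, rhs})
            if both < min_support_count:
                continue
            confidence = both / support_count({lhs})
            rules.append((confidence, both, lhs, rhs))
    # Highest confidence first, then highest support, then alphabetical.
    rules.sort(key=lambda r: (-r[0], -r[1], r[2], r[3]))
    return rules[:n]


for confidence, support, lhs, rhs in top_rules(5):
    print(f"{lhs} => {rhs}  (support={support}, confidence={confidence:.2f})")
```

Under these assumptions the first rule returned is outlook=overcast => play=yes, which holds in all four overcast instances. The only input the caller supplies is the number of rules, mirroring how the rule count is the main setting the user chooses in Weka.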
- Type: Chapter
- Information: Data Mining and Data Warehousing: Principles and Practical Techniques, pp. 319-367. Publisher: Cambridge University Press. Print publication year: 2019