IIIT Hyderabad Publications
Empirical Evaluation of Quantitative Association Rules

Author: Anura Mohania
Date: 2020-05-14
Report no: IIIT/TH/2020/34
Advisor: Kamalakar Karlapalem

Abstract

The goal of data mining is to discover knowledge and reveal new, interesting, and previously unknown information to the user. One of the important data mining techniques is association rule mining. Let I = {i1, i2, i3, …, in} be a set of n binary attributes called items. Let D = {t1, t2, t3, …, tm} be a set of transactions that form a database, where each transaction in D has a unique transaction ID and contains a subset of the items in I. A set of items that occur together is an itemset, and a frequent itemset is an itemset whose support is greater than some user-specified minimum support. For events X and Y, an association rule is a rule of the form X => Y, where X, Y ⊆ I are sets of items such that X ∩ Y = ∅, holding with a certain support and confidence. Support determines how often a rule is applicable to a given dataset, while confidence determines how frequently items in Y appear in transactions that contain X. The classical use of association rules is with market-basket data, resulting in rules such as “People who buy beer also buy diapers with confidence 0.67 and support 0.4”. Association rules discover patterns and correlations that may be buried deep inside a database.

Most association rule mining is based on categorical data, where the notions of items and transactions are very clear. For quantitative data with many such attributes, it is not clear how to define items and transactions. Therefore, an empirical study is conducted to understand how quantitative data can be mapped to items and transactions in order to obtain association rules. The study outlines a range of possible approaches to discovering quantitative association rules. The key concept in this work is to give labels to ranges of quantitative data and to use combinations of those labels to decide on items and transactions.
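The definitions of support and confidence, and the idea of labelling ranges of quantitative values, can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the toy baskets, the attributes (age, income), and the bin boundaries are all hypothetical, chosen only so that the toy rule beer => diapers reproduces the support 0.4 and confidence 0.67 quoted in the abstract.

```python
from typing import Dict, FrozenSet, List, Tuple

Transaction = FrozenSet[str]
Bin = Tuple[float, float, str]  # half-open range [low, high) and its label


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x: FrozenSet[str], y: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """Of the transactions containing X, the fraction that also contain Y."""
    sx = support(x, transactions)
    return support(x | y, transactions) / sx if sx > 0 else 0.0


def label_value(attribute: str, value: float, bins: List[Bin]) -> str:
    """Map a quantitative value to an item such as 'age=young'."""
    for low, high, label in bins:
        if low <= value < high:
            return f"{attribute}={label}"
    raise ValueError(f"no bin covers {attribute}={value}")


def row_to_transaction(row: Dict[str, float],
                       bins_by_attribute: Dict[str, List[Bin]]) -> Transaction:
    """Turn one quantitative record into a transaction of labelled items."""
    return frozenset(
        label_value(attr, value, bins_by_attribute[attr])
        for attr, value in row.items()
    )


# Toy market-basket data (hypothetical): beer appears in 3 of 5 baskets,
# beer together with diapers in 2 of 5.
baskets: List[Transaction] = [
    frozenset({"beer", "diapers"}),
    frozenset({"beer", "diapers", "milk"}),
    frozenset({"beer"}),
    frozenset({"milk"}),
    frozenset({"bread"}),
]
beer, diapers = frozenset({"beer"}), frozenset({"diapers"})

# Hypothetical quantitative records and hand-picked label ranges.
BINS: Dict[str, List[Bin]] = {
    "age": [(0, 30, "young"), (30, 60, "middle"), (60, float("inf"), "senior")],
    "income": [(0, 50_000, "low"), (50_000, float("inf"), "high")],
}
rows = [
    {"age": 25, "income": 30_000},
    {"age": 28, "income": 42_000},
    {"age": 47, "income": 82_000},
    {"age": 52, "income": 39_000},
    {"age": 66, "income": 61_000},
]
labelled = [row_to_transaction(r, BINS) for r in rows]

if __name__ == "__main__":
    print("support(beer, diapers):", support(beer | diapers, baskets))
    print("confidence(beer => diapers):",
          round(confidence(beer, diapers, baskets), 2))
    young, low = frozenset({"age=young"}), frozenset({"income=low"})
    print("confidence(age=young => income=low):",
          confidence(young, low, labelled))
```

Here each record becomes one transaction and each (attribute, range label) pair becomes one item. This is only one possible mapping; other choices of ranges or of label combinations would yield different items, transactions, and rules.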
We conducted a large number of experiments on three different datasets to present varying results and to build an understanding of quantitative association rules and the issues therein.

Full thesis: pdf
Centre for Data Engineering