IIIT Hyderabad Publications
An Improved Framework for Mining Periodic Patterns

Author: Vipul Chhabra 2019121001
Date: 2023-11-29
Report no: IIIT/TH/2023/177
Advisor: P Krishna Reddy

Abstract

Data mining is a collection of algorithms for extracting valuable insights from large amounts of data. Data mining based approaches are being employed to improve the performance of decision support systems in the fields of customer relationship management, inventory management, fraud detection, surveillance, and recommendation systems. Pattern mining is an important data mining task that involves identifying significant associations in transactional databases. These associations reveal trends and relationships that help businesses improve efficiency. The field of pattern mining began with a model for extracting frequent patterns from large transactional data. The frequent pattern model became popular and led to the development of algorithms that improve the performance of several applications, such as recommendation systems. Encouraged by the potential of pattern mining, active research in the literature investigates new pattern mining models such as periodic, utility, coverage, and correlated patterns.

In this thesis, we propose an improved approach to mine periodic patterns from temporal databases. The model of periodic patterns facilitates the discovery of recurring behaviours and trends in temporal databases. The periodic pattern model considered here, which we call the partial periodic pattern (3P) model, captures periodic associations subject to periodic support (PS) and inter-arrival time (IAT) constraints. Here, an IAT value is the time difference between two successive occurrences of a pattern. The percentage of occurrences of a pattern in a database that satisfy the user-given maximum IAT constraint is called its PS. Overall, all patterns that satisfy the user-given maximum IAT (maxIAT) and minimum PS (minPS) are called 3Ps.
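The IAT and PS definitions above can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the function names, the toy database, and the choice to normalise PS by the number of inter-arrival times are all assumptions made for illustration, and the thesis may normalise differently.

```python
from typing import FrozenSet, List, Tuple

# A temporal database as (timestamp, itemset) pairs. Toy data, invented for illustration.
Transaction = Tuple[int, FrozenSet[str]]

toy_db: List[Transaction] = [
    (1, frozenset({"a", "b"})),
    (2, frozenset({"a", "b", "c"})),
    (4, frozenset({"a", "b"})),
    (9, frozenset({"a", "b", "d"})),
    (10, frozenset({"a", "b"})),
    (11, frozenset({"c"})),
]


def occurrence_timestamps(db: List[Transaction], pattern: FrozenSet[str]) -> List[int]:
    """Sorted timestamps of the transactions that contain every item of the pattern."""
    return sorted(ts for ts, items in db if pattern <= items)


def inter_arrival_times(timestamps: List[int]) -> List[int]:
    """Time differences between successive occurrences of a pattern."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def periodic_support(timestamps: List[int], max_iat: int) -> float:
    """Fraction of inter-arrival times that are <= max_iat.

    Assumption: the denominator is the number of inter-arrival times.
    """
    iats = inter_arrival_times(timestamps)
    if not iats:
        return 0.0  # fewer than two occurrences: no IAT to evaluate
    periodic = sum(1 for gap in iats if gap <= max_iat)
    return periodic / len(iats)


def is_partial_periodic(db: List[Transaction], pattern: FrozenSet[str],
                        max_iat: int, min_ps: float) -> bool:
    """True if the pattern meets the maxIAT and minPS constraints in this sketch."""
    return periodic_support(occurrence_timestamps(db, pattern), max_iat) >= min_ps


ab = frozenset({"a", "b"})
ab_timestamps = occurrence_timestamps(toy_db, ab)  # occurrences at 1, 2, 4, 9, 10
ab_iats = inter_arrival_times(ab_timestamps)       # gaps of 1, 2, 5, 1
ab_ps = periodic_support(ab_timestamps, max_iat=2) # 3 of the 4 gaps are <= 2
```

In this toy example the pattern {a, b} has one long gap (between timestamps 4 and 9), so it is periodic only part of the time. Whether it counts as a 3P then depends on the chosen minPS threshold.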
In this thesis, we address the following issue with the 3P model. If the minPS value is set too low, the number of 3Ps explodes. On the other hand, if the minPS value is set too high, several interesting 3Ps are missed. We call this the rare item problem. To address it, we propose an improved model and present an improved depth-first search algorithm to extract 3Ps. In the proposed model, we introduce a new measure called periodic-confidence, which satisfies both the null-invariant and anti-monotonic properties. The proposed depth-first search algorithm uses irregularity pruning to reduce the search space and computational cost while retaining the same information. Through experimental results, we show that the proposed algorithm is efficient and scalable. We illustrate the usefulness of the discovered patterns through case studies on air pollution and traffic congestion databases. Extracting interesting trends from large temporal databases in different domains is an active research area. We hope that the proposed approach can facilitate the development of improved approaches to extract regular trends for fraud detection from sensor (surveillance) databases and for loyalty mining in e-commerce systems.

Full thesis: pdf

Centre for Data Engineering
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.