Application of Association Rule Mining for Discovering Purchasing Patterns in Retail Datasets
Abstract
The rapid growth of transactional data in the retail sector gives businesses an opportunity to optimize operations and understand consumer behavior, but extracting actionable insights from massive raw transaction logs remains a persistent analytical challenge. This paper applies Association Rule Mining (ARM), specifically the Apriori algorithm, to discover purchasing patterns in the UCI Online Retail II dataset, a publicly available benchmark containing over one million transaction records from a UK-based non-store online retailer. Using a Python data mining framework built on the mlxtend library, the study processes a stratified sample of 10,000 customer transactions and evaluates the resulting association rules with the classical metrics of Support, Confidence, and Lift. The methodology consists of a five-stage preprocessing pipeline (cancellation removal, non-product filtering, null elimination, deduplication, and binary transaction encoding), followed by frequent itemset extraction and rule generation. The experiments yield 247 frequent itemsets and 189 association rules that satisfy the configured thresholds (minimum support = 2%, minimum confidence = 75%). Of these, 38 rules have a Lift above 2.0, meaning the items co-occur more than twice as often as would be expected if they were purchased independently. The highest-ranked rule, PARTY BUNTING → PAPER CHAIN KIT 50’S CHRISTMAS, achieves a Lift of 4.21 and a Confidence of 86%; that is, 86% of the sampled transactions containing the antecedent also contain the consequent. These findings support data-driven recommendations for inventory synchronization, strategic product placement, and targeted promotional bundling in retail environments. The study concludes that classical ARM techniques, despite the advent of deep learning approaches, remain highly interpretable and practically relevant tools for retail analytics.
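To make the three metrics concrete, the following is a minimal sketch in plain Python of level-wise (Apriori-style) frequent itemset search and rule scoring. It is not the study's pipeline, which relies on mlxtend. The toy baskets, the function names, and the thresholds used here are hypothetical and chosen only for illustration. For a rule A → B, support is the fraction of transactions containing both A and B, confidence is support(A ∪ B) / support(A), and lift is confidence / support(B).

```python
from itertools import combinations

# Hypothetical toy baskets (assumption): NOT drawn from the Online Retail II dataset.
transactions = [
    {"bunting", "paper chain", "candles"},
    {"bunting", "paper chain"},
    {"bunting", "paper chain", "napkins"},
    {"candles", "napkins"},
    {"bunting", "napkins"},
    {"paper chain", "candles"},
    {"napkins"},
    {"candles"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def frequent_itemsets(baskets, min_support):
    """Level-wise search: size-k candidates are built only from frequent (k-1)-itemsets."""
    items = sorted({item for basket in baskets for item in basket})
    level = [frozenset([item]) for item in items]
    found = {}
    k = 1
    while level:
        # Keep the candidates of the current size that meet the support threshold.
        kept = {}
        for candidate in level:
            s = support(candidate, baskets)
            if s >= min_support:
                kept[candidate] = s
        found.update(kept)
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(sub) in kept for sub in combinations(c, k - 1))
        ]
    return found


def association_rules(found, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift), highest lift first."""
    rules = []
    for itemset, s in found.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                # Subsets of a frequent itemset are frequent, so both lookups succeed.
                confidence = s / found[antecedent]
                if confidence >= min_confidence:
                    lift = confidence / found[consequent]
                    rules.append((antecedent, consequent, s, confidence, lift))
    return sorted(rules, key=lambda rule: -rule[4])


if __name__ == "__main__":
    itemsets = frequent_itemsets(transactions, min_support=0.25)
    for ante, cons, s, conf, lift in association_rules(itemsets, min_confidence=0.7):
        print(f"{set(ante)} -> {set(cons)}: "
              f"support={s:.3f}, confidence={conf:.2f}, lift={lift:.2f}")
```

On this toy data only the bunting and paper chain pair survives both thresholds, in each direction, with confidence 0.75 and lift 1.5. A lift above 1 means the pair appears together more often than independent purchasing would predict, which is the same reading the abstract applies to its Lift > 2.0 rules.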