Implementation of the Apriori Algorithm in Python
I learned about market-basket analysis and the Apriori algorithm in the courses SDSC 3001 (Big Data: The Arts and Science of Scaling) and SDSC 3002 (Data Mining), taught by Dr. Yu Yang at CityU. In the first assignment, we were asked to implement Apriori on a transaction dataset. A basic overview of the task is as follows:
First Task: Implement the Apriori algorithm to find all frequent patterns under different settings of the minimum frequency (the minimum support divided by the number of transactions). Vary the minimum frequency minFreq over 0.0001, 0.0002, 0.0003, 0.0004 and 0.0005. For each setting of minFreq, report the total number of frequent patterns, as well as the number of size-k frequent patterns for every size k that has at least one frequent pattern.
Second Task: Try to optimize your algorithm using acceleration techniques, so that it finishes the computation for the first task within 10 minutes. Explain each specific acceleration you adopt by providing a running time comparison between adopting the acceleration and not adopting it.
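To make the first task concrete, here is a minimal sketch of plain Apriori with no acceleration, run on a small made-up dataset. The function names apriori and count_by_size and the toy transactions are my own illustrative assumptions, not part of the assignment; the sketch only shows how minFreq is turned into a count threshold and how the per-size totals can be reported.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_freq):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Hypothetical helper for illustration; unoptimized on purpose.
    """
    transactions = [frozenset(t) for t in transactions]
    # minFreq is a fraction of the number of transactions; convert to a count.
    min_count = min_freq * len(transactions)

    # Level 1: count single items and keep the frequent ones.
    item_counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]): c for i, c in item_counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in current for s in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step: scan every transaction for every surviving candidate.
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent


def count_by_size(frequent):
    """Return {k: number of frequent itemsets of size k}, sorted by k."""
    return dict(sorted(Counter(len(s) for s in frequent).items()))


# Toy data (invented for this example, not the assignment dataset).
toy = [
    ["bread", "milk"],
    ["bread", "diaper", "beer", "eggs"],
    ["milk", "diaper", "beer", "cola"],
    ["bread", "milk", "diaper", "beer"],
    ["bread", "milk", "diaper", "cola"],
]
result = apriori(toy, min_freq=0.5)
print(len(result), count_by_size(result))  # 8 {1: 4, 2: 4}
```

The count step here scans every transaction against every candidate, which is the part that becomes slow at minFreq values as small as 0.0001, and is therefore the natural place to look when comparing running times with and without an acceleration.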