Frequent-Itemsets

This folder contains implementations of the Apriori algorithm and the SON algorithm, which are used to discover all frequent itemsets.

Part I - Apriori:

[Input]
The input file is a single line containing a nested JSON array. Within that array, each basket is a JSON array of integers representing item numbers. A sample file is as follows:

[[1, 2], [1, 2, 3], [1, 3, 4], [2, 3, 4], [3, 4]] # 5 baskets
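As a minimal sketch of how such a line could be read, the snippet below parses the nested array with the standard `json` module and checks its shape. The function name `parse_baskets` is hypothetical and not taken from `apriori.py`. It also assumes the trailing `# 5 baskets` note in the sample is an annotation and not part of the actual file, since JSON has no comment syntax.

```python
import json

def parse_baskets(line):
    """Parse one line of nested JSON into a list of baskets (lists of ints).

    Hypothetical helper for illustration; the real scripts may read input differently.
    """
    baskets = json.loads(line)
    if not isinstance(baskets, list):
        raise ValueError("expected a JSON array of baskets")
    for basket in baskets:
        # Each basket must itself be an array of integer item numbers.
        if not isinstance(basket, list) or not all(isinstance(i, int) for i in basket):
            raise ValueError(f"basket is not an array of integers: {basket!r}")
    return baskets

if __name__ == "__main__":
    sample = "[[1, 2], [1, 2, 3], [1, 3, 4], [2, 3, 4], [3, 4]]"
    print(len(parse_baskets(sample)), "baskets")
```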

[Output]
Print the candidate itemsets C(k) and the frequent itemsets L(k) found in each pass k, one per line as a SORTED list, until C(k) or L(k) is empty. An example is as follows:

C1: [[1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12]]
L1: [[1], [2], [3], [4], [5], [6]]
C2: [[1, 2], [1, 3], [1, 4], [1, 5], [1, 6], [2, 3], [2, 4], [2, 5], [2, 6], [3, 4], [3, 5], [3, 6], [4, 5], [4, 6], [5, 6]]
L2: [[1, 2], [1, 3], [2, 3], [2, 6], [3, 5], [4, 5]]
C3: [[1, 2, 3]]
L3: []
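The pass structure above can be sketched roughly as follows. This is an illustrative outline, not the code in `apriori.py`: the function name `apriori_passes` and the `min_support` parameter (an absolute basket count) are assumptions, since this README does not say how the support threshold is supplied.

```python
from itertools import combinations

def apriori_passes(baskets, min_support):
    """Yield (label, itemsets) pairs C1, L1, C2, L2, ... until one is empty.

    min_support is an assumed absolute count; the README does not specify it.
    """
    baskets = [frozenset(b) for b in baskets]
    # C1: every distinct item is a candidate singleton.
    candidates = sorted([item] for item in set().union(*baskets))
    k = 1
    while True:
        yield f"C{k}", candidates
        if not candidates:
            return
        # Count each candidate and keep those that meet the threshold.
        frequent = [c for c in candidates
                    if sum(1 for b in baskets if b.issuperset(c)) >= min_support]
        yield f"L{k}", frequent
        if not frequent:
            return
        # C(k+1): every (k+1)-set whose k-subsets are all frequent.
        freq_set = {tuple(f) for f in frequent}
        items = sorted({i for f in frequent for i in f})
        k += 1
        candidates = [list(c) for c in combinations(items, k)
                      if all(s in freq_set for s in combinations(c, k - 1))]

if __name__ == "__main__":
    sample = [[1, 2], [1, 2, 3], [1, 3, 4], [2, 3, 4], [3, 4]]
    for label, itemsets in apriori_passes(sample, min_support=2):
        print(f"{label}: {itemsets}")
```

Generating candidates from all combinations of the surviving items is simple but not the most efficient join; it is used here only to keep the sketch short.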

Part II - SON Algorithm:

[Input]
The input to this phase has two parts: chunks of baskets in the same format as the phase 1 input, and the candidate itemsets found in phase 1.

[Output]
One line per global frequent itemset, together with its global count, in the following format:
[[1, 2], 18]
[[1, 3], 20]
[[1], 31]
[[2], 22]
[[3], 22]
[[4], 12]
[[6], 15]
[[2, 3], 13]
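A rough sketch of this counting step is shown below: each chunk counts how often every phase 1 candidate occurs, the per-chunk counts are summed, and only candidates that reach the global threshold are printed. The names `count_candidates` and `global_frequent`, and the `min_support` parameter, are hypothetical and not taken from `son-phase2.py`. The output order here is simply sorted by itemset and may differ from the order the real script produces.

```python
import json
from collections import Counter

def count_candidates(chunk, candidates):
    """Count, within one chunk, the baskets that contain each candidate itemset."""
    counts = Counter()
    for basket in chunk:
        items = set(basket)
        for cand in candidates:
            if items.issuperset(cand):
                counts[tuple(cand)] += 1
    return counts

def global_frequent(chunks, candidates, min_support):
    """Sum per-chunk counts and keep candidates with a global count >= min_support.

    min_support is an assumed absolute count; the README does not specify it.
    """
    total = Counter()
    for chunk in chunks:
        total.update(count_candidates(chunk, candidates))
    return [[list(c), n] for c, n in sorted(total.items()) if n >= min_support]

if __name__ == "__main__":
    chunks = [[[1, 2], [1, 2, 3]], [[1, 3, 4], [2, 3, 4], [3, 4]]]
    candidates = [[1], [2], [1, 2], [3, 4], [1, 4]]
    for row in global_frequent(chunks, candidates, min_support=2):
        print(json.dumps(row))  # one line per itemset, in the [[items], count] shape
```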
