frequent itemset generation

Association_rules

This step aims to generate all possible with a $sup \gt threshold$.

brute-force approach

For each possible item-set (called candidate) compute it’s $sup$ by scanning the database.

This, approach which have a complexity of $\mathcal{o}(NMW)$ where:

$N$ -> total number of the transactions
$W$ -> average number of items within a transaction
$M$ -> number of frequent item-set candidate.

it’s extremely computational expensive.

There are other strategies that aims to reduce the computational cost of this operation such as:

reducing the number of candidates by pruning (apriori algorithm)
reducing the number of comparisons $NM$

brute-force approach

The brute-force approach generates each item-set in the graph above. Then, it computes the sup and conf indexes values for every association rule generated by every item-set.

Complexity (EXPENSIVE): O(NWM), where

frequent item-set generation strategies

Reduce the number of candidates M (Apriori Algorithm)
- Complete Search: $M=2^D$
- Use pruning techniques to reduce M
Reduce the number of comparisons NM
- Use efficient data structures to store the candidates or transactions
- No need to match every candidate against every transaction

previous next

frequent itemset generation

brute-force approach

brute-force approach

frequent item-set generation strategies

Link map