association rules
Association rules describe situations in which the presence of an element $A$, or of a combination of elements $\{A,B\}$, suggests the presence of a further element $C$. They are derived from statistics over observed transactions, so they express likelihood, not certainty.
definitions
- ITEMSET –> A collection of one or more items.
- K-ITEMSET –> An itemset that contains k items.
- SUPPORT COUNT $\sigma$ –> Number of transactions that contain an itemset (its absolute frequency).
- SUPPORT –> Fraction of transactions that contain an itemset.
- FREQUENT ITEMSET –> An itemset whose support is greater than or equal to a minsup threshold.
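The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a mining algorithm: the transactions are made-up toy data, and the helper names `support_count`, `support` and `frequent_itemsets` are hypothetical. The frequent-itemset search is brute force over all k-itemsets, which is only viable for tiny examples.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support_count(itemset, transactions):
    """sigma(itemset): number of transactions containing every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def frequent_itemsets(transactions, k, minsup):
    """All k-itemsets whose support is >= minsup (brute-force enumeration)."""
    items = sorted(set().union(*transactions))
    return [
        set(candidate)
        for candidate in combinations(items, k)
        if support(candidate, transactions) >= minsup
    ]


print(support_count({"milk", "diapers"}, transactions))  # 3
print(support({"milk", "diapers"}, transactions))        # 0.6
print(frequent_itemsets(transactions, k=2, minsup=0.6))  # four 2-itemsets
```

With this toy data, four 2-itemsets reach the 0.6 threshold: {beer, diapers}, {bread, diapers}, {bread, milk} and {diapers, milk}.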
Association rules can be written in the form
$$ A \rightarrow C \quad \text{where } A \text{ and } C \text{ are itemsets and } A \cap C = \emptyset $$
$A$ is called the antecedent and $C$ is called the consequent.
metrics
support $sup$
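the fraction of transactions that contain both $A$ and $C$
$$ sup = \frac{\sigma(A,C)}{N} $$
where $\sigma(A,C)$ is the support count of $A$ and $C$ together and $N$ is the total number of transactions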
confidence $conf$
the fraction of the transactions containing $A$ that also contain $C$
$$ conf = \frac{\sigma(A,C)}{\sigma(A)} $$
where $\sigma$ denotes the support count
confidence from support
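confidence can also be computed from supports, by dividing the numerator and the denominator by the number of transactions $N$:
$$ conf = \frac{\sigma(A,C)}{\sigma(A)} = \frac{\frac{\sigma(A,C)}{N}}{\frac{\sigma(A)}{N}} = \frac{sup(A,C)}{sup(A)} $$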
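As a quick check of the identity above, the sketch below computes confidence both ways, from support counts and from supports, and confirms they agree. The transactions are made-up toy data and the function names are hypothetical. Both functions assume the antecedent occurs in at least one transaction, since the ratio is otherwise undefined.

```python
import math

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support_count(itemset, transactions):
    """sigma(itemset): number of transactions containing every item of the itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def confidence_from_counts(antecedent, consequent, transactions):
    """conf(A -> C) = sigma(A, C) / sigma(A)."""
    both = set(antecedent) | set(consequent)
    return support_count(both, transactions) / support_count(antecedent, transactions)


def confidence_from_supports(antecedent, consequent, transactions):
    """conf(A -> C) = sup(A, C) / sup(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


A, C = {"milk", "diapers"}, {"beer"}
by_counts = confidence_from_counts(A, C, transactions)      # 2 / 3
by_supports = confidence_from_supports(A, C, transactions)  # 0.4 / 0.6

# The two routes agree up to floating-point rounding.
print(by_counts, by_supports, math.isclose(by_counts, by_supports))
```

Working from counts avoids the extra floating-point divisions, so it is the more natural choice when the counts are available. The support form is useful when only supports have been stored.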
support measures how widely a rule is represented in the data: there must be enough transactions in which it occurs. A rule with low support may simply be the result of chance co-occurrence
confidence measures how often a rule holds among the transactions that contain its antecedent
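To see why both metrics are needed, the sketch below compares two rules on made-up toy data (the transactions and the helper name `rule_metrics` are hypothetical). One rule has perfect confidence but rests on a single transaction. The other has lower confidence but is backed by most of the data.

```python
# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence) of the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    a = set(antecedent)
    both = a | set(consequent)
    n_a = sum(1 for t in transactions if a <= t)        # sigma(A)
    n_both = sum(1 for t in transactions if both <= t)  # sigma(A, C)
    return n_both / len(transactions), n_both / n_a


# {eggs} -> {beer}: always "true", but observed only once.
print(rule_metrics({"eggs"}, {"beer"}, transactions))     # (0.2, 1.0)

# {diapers} -> {beer}: holds less often, but is seen in 3 of 5 transactions.
print(rule_metrics({"diapers"}, {"beer"}, transactions))  # (0.6, 0.75)
```

The first rule would pass any confidence threshold, yet a single transaction is weak evidence. Filtering on support first removes this kind of rule.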