distances

properties of distances

euclidean distance

$$ \sqrt{\sum_{d=1}^{D}{(p_{d}-q_{d})^2}} $$

Where $D$ is the number of dimensions (attributes) and $p_{d}$ and $q_{d}$ are the $d$-th attributes (components) of data objects $p$ and $q$, respectively. If the attributes have different scales, standardize or rescale them first; otherwise the attribute with the largest range dominates the distance.
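A minimal sketch of the formula above in plain Python. The function names `euclidean_distance` and `standardize` are illustrative choices, and z-score standardization (zero mean, unit standard deviation per attribute) is only one possible rescaling, assumed here for the example.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two equal-length sequences of numbers."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of dimensions")
    return math.sqrt(sum((pd - qd) ** 2 for pd, qd in zip(p, q)))

def standardize(rows):
    """Rescale every attribute (column) to zero mean and unit standard deviation.

    Assumption: no attribute is constant, so every standard deviation is non-zero.
    """
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col))
        for col, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Two attributes on very different scales (made-up values).
data = [[1.0, 1000.0], [2.0, 2000.0], [3.0, 3000.0]]
scaled = standardize(data)

raw = euclidean_distance(data[0], data[1])        # dominated by the second attribute
fair = euclidean_distance(scaled[0], scaled[1])   # both attributes contribute equally
```

On the raw data the second attribute accounts for almost all of the distance; after standardization both attributes contribute the same amount.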

minkowski distance $l_{r}$

Generalization of the Euclidean distance.

$$ \left(\sum_{d=1}^{D}{|p_{d}-q_{d}|^r}\right)^{\frac{1}{r}} $$

Where $D$ is the number of dimensions (attributes) and $p_{d}$ and $q_{d}$ are the $d$-th attributes (components) of data objects $p$ and $q$, respectively. As with the Euclidean distance, standardization or rescaling is necessary if the scales differ. $r$ is a parameter chosen according to the data set and the application.
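A minimal sketch of the Minkowski formula in plain Python. The name `minkowski_distance` is illustrative. Restricting $r$ to $r \geq 1$ and treating $r = \infty$ as the largest per-attribute difference are assumptions of this sketch, not something the notes specify. With $r = 1$ the formula gives the city-block (Manhattan) distance and with $r = 2$ the Euclidean distance.

```python
import math

def minkowski_distance(p, q, r):
    """Minkowski distance l_r between two equal-length sequences of numbers."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of dimensions")
    if r < 1:
        # Assumption of this sketch: for r < 1 the triangle inequality fails.
        raise ValueError("r must be >= 1")
    diffs = [abs(pd - qd) for pd, qd in zip(p, q)]
    if math.isinf(r):
        # Limit for r -> infinity: the largest per-attribute difference.
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1.0 / r)

p, q = [0.0, 0.0], [3.0, 4.0]
d1 = minkowski_distance(p, q, 1)            # sum of absolute differences
d2 = minkowski_distance(p, q, 2)            # Euclidean distance
dinf = minkowski_distance(p, q, math.inf)   # largest absolute difference
```

For a fixed pair of points the value does not increase as $r$ grows, so here `d1 >= d2 >= dinf`.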

cases

mahalanobis distance

For a fixed Euclidean distance, the Mahalanobis distance between two points $p$ and $q$ is smaller when the segment connecting them lies along a direction in which the data vary more. The distribution of the data is described by the covariance matrix $\Sigma$ of the data set

$$ \sqrt{(p-q)\,\Sigma^{-1}\,(p-q)^T} $$

Where $p$ and $q$ are written as row vectors and $\Sigma^{-1}$ is the inverse of the covariance matrix.
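A minimal sketch of the Mahalanobis distance in plain Python, restricted to two attributes so that the covariance matrix can be inverted by hand. The names `covariance_2d` and `mahalanobis_2d`, the use of the sample covariance (division by $n-1$), and the data values are assumptions made for illustration only.

```python
import math

def covariance_2d(rows):
    """Sample covariance matrix (divides by n - 1) of a list of (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def mahalanobis_2d(p, q, cov):
    """Mahalanobis distance between 2-D points p and q for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx, dy = p[0] - q[0], p[1] - q[1]
    # Quadratic form (p - q) * inverse covariance * (p - q)^T.
    quad = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(quad)

# Made-up data that vary more along x than along y.
data = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
cov = covariance_2d(data)

# Both pairs are at Euclidean distance 2 from each other.
along_x = mahalanobis_2d((0.0, 0.0), (2.0, 0.0), cov)  # direction of greater variation
along_y = mahalanobis_2d((0.0, 0.0), (0.0, 2.0), cov)  # direction of smaller variation
```

Here `along_x` is smaller than `along_y` even though the Euclidean distances are equal. With the identity matrix as covariance, the function returns the ordinary Euclidean distance.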