DISTANCES
PROPERTIES OF DISTANCES
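- Symmetry: $\mathrm{dist}(p, q) = \mathrm{dist}(q, p)$
- Positive definiteness: $\mathrm{dist}(p, q) \ge 0$, and $\mathrm{dist}(p, q) = 0$ if and only if $p = q$
- Triangle inequality: $\mathrm{dist}(p, q) \le \mathrm{dist}(p, s) + \mathrm{dist}(s, q)$ for every object $s$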
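The three properties can be checked numerically on a small sample of points. The following is only a sketch: the helper name `check_metric_properties`, the tolerance, and the sample points are illustrative assumptions, and passing the check on a finite sample does not prove that a function is a distance.

```python
import math
from itertools import product


def euclidean(p, q):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def squared_euclidean(p, q):
    """Squared Euclidean distance (not a true distance, see below)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def check_metric_properties(dist, points, tol=1e-12):
    """Return True if dist satisfies the three properties on the given points.

    Hypothetical helper for illustration; it tests a finite sample only.
    """
    for p, q in product(points, repeat=2):
        d = dist(p, q)
        if abs(d - dist(q, p)) > tol:      # symmetry
            return False
        if d < 0:                          # non-negativity
            return False
        if (d <= tol) != (p == q):         # zero distance only for identical points
            return False
    for p, q, s in product(points, repeat=3):
        if dist(p, q) > dist(p, s) + dist(s, q) + tol:   # triangle inequality
            return False
    return True


points = [(0.0, 0.0), (3.0, 4.0), (1.0, -2.0), (-5.0, 2.5)]
euclidean_ok = check_metric_properties(euclidean, points)

# On the line, the squared distance from 0 to 2 is 4, but going through 1 costs 1 + 1 = 2.
squared_ok = check_metric_properties(squared_euclidean, [(0.0,), (1.0,), (2.0,)])
```

In this sketch the Euclidean distance passes, while the squared Euclidean distance fails the triangle inequality.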
EUCLIDEAN DISTANCE
$\mathrm{dist}(p, q) = \sqrt{\sum_{d=1}^{D} (p_d - q_d)^2}$
where $D$ is the number of dimensions (attributes) and $p_d$ and $q_d$ are, respectively, the d-th attributes (components) of data objects p and q. Standardization/rescaling is necessary if the scales of the attributes differ.
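A small sketch of why rescaling matters, using made-up data (age in years, income in euros) and z-score standardization as one possible rescaling. The helper names and the data are assumptions for illustration, and the code assumes no attribute is constant (otherwise the standard deviation would be zero).

```python
import math
import statistics


def euclidean(p, q):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def standardize(rows):
    """Z-score each attribute: subtract the column mean, divide by the column std."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]   # population standard deviation
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds)) for row in rows]


# Hypothetical objects: (age in years, income in euros)
data = [(25, 20000), (45, 21000), (30, 60000)]

# On the raw data the income attribute dominates because of its larger scale.
raw_01 = euclidean(data[0], data[1])
raw_02 = euclidean(data[0], data[2])

# After standardization both attributes contribute on a comparable scale.
z = standardize(data)
std_01 = euclidean(z[0], z[1])
std_02 = euclidean(z[0], z[2])
```

With these numbers, object 0 is closer to object 1 than to object 2 on the raw data, and the ordering is reversed after standardization, because the 20-year age gap is no longer swamped by the income scale.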
MINKOWSKI DISTANCE
Generalization of the Euclidean distance
$\mathrm{dist}(p, q) = \left( \sum_{d=1}^{D} |p_d - q_d|^r \right)^{1/r}$
where $D$ is the number of dimensions (attributes) and $p_d$ and $q_d$ are, respectively, the d-th attributes (components) of data objects p and q. Standardization/rescaling is necessary if the scales of the attributes differ. $r$ is a parameter which is chosen depending on the data set and the application.
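Cases
- $r = 1$: Manhattan distance ($L_1$ norm); best at discriminating between zero distance and near-zero distance
- $r = 2$: Euclidean distance ($L_2$ norm)
- $r = \infty$: Chebyshev (supremum) distance, $L_{max}$ norm, $L_\infty$ norm, equal to $\max_d |p_d - q_d|$; considers only the feature with the maximum difference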
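The three cases can be sketched with a single function. The function name `minkowski` and the example points are illustrative assumptions; the infinite case is handled separately as the maximum absolute difference.

```python
import math


def minkowski(p, q, r):
    """Minkowski distance of order r between two equal-length tuples."""
    diffs = [abs(a - b) for a, b in zip(p, q)]
    if math.isinf(r):
        # Limit for r -> infinity: only the largest difference matters.
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1.0 / r)


p, q = (0.0, 0.0), (3.0, 4.0)

manhattan = minkowski(p, q, 1)          # 3 + 4 = 7.0
euclid = minkowski(p, q, 2)             # sqrt(9 + 16) = 5.0
chebyshev = minkowski(p, q, math.inf)   # max(3, 4) = 4.0
```

For the same pair of points the value does not increase as $r$ grows, ending at the single largest attribute difference.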
MAHALANOBIS DISTANCE
$\mathrm{dist}(p, q) = \sqrt{(p - q)^T \Sigma^{-1} (p - q)}$
where $\Sigma$ is the covariance matrix of the data set, which describes the distribution of the data. For a fixed Euclidean distance, the Mahalanobis distance between two points p and q decreases when the segment connecting the points lies along a direction of greater variation of the data.
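A minimal sketch of this behaviour, restricted to two-dimensional data so that the covariance matrix can be inverted by hand. The helper names, the use of the population covariance (dividing by n), and the sample data are assumptions for illustration.

```python
import math


def covariance_2d(rows):
    """Population covariance of 2-D data, returned as (sxx, sxy, syy)."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows) / n
    syy = sum((y - my) ** 2 for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows) / n
    return sxx, sxy, syy


def mahalanobis_2d(p, q, cov):
    """Mahalanobis distance between 2-D points for a non-singular covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - q[0], p[1] - q[1]
    # Quadratic form with the inverse covariance (1/det) * [[syy, -sxy], [-sxy, sxx]]
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(quad)


# Hypothetical data set that varies much more along x than along y.
data = [(-4.0, 0.0), (4.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
cov = covariance_2d(data)

origin = (0.0, 0.0)
along_x = mahalanobis_2d(origin, (2.0, 0.0), cov)   # segment along the high-variance axis
along_y = mahalanobis_2d(origin, (0.0, 2.0), cov)   # segment along the low-variance axis
```

Both segments have Euclidean length 2, yet `along_x` is smaller than `along_y`, because the data varies more along x.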