DISTANCES

PROPERTIES OF DISTANCES

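  • symmetry: $\mathrm{dist}(p, q) = \mathrm{dist}(q, p)$
  • positive definiteness: $\mathrm{dist}(p, q) \ge 0$, with $\mathrm{dist}(p, q) = 0$ only if $p = q$
  • triangle inequality: $\mathrm{dist}(p, s) \le \mathrm{dist}(p, q) + \mathrm{dist}(q, s)$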
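
The three properties can be checked numerically on a small sample of points. The sketch below is only an illustration: the helper name `is_metric_on`, the sample points and the tolerance are assumptions, and passing the check on a sample does not prove that a function is a distance. The squared Euclidean distance is included as a function that satisfies the first two properties but violates the triangle inequality on this sample.

```python
import itertools
import math


def is_metric_on(points, dist, tol=1e-12):
    """Check the three distance properties on every pair/triple of `points`.

    Hypothetical helper for illustration; `tol` absorbs floating-point error.
    """
    for p, q in itertools.product(points, repeat=2):
        d = dist(p, q)
        # symmetry: dist(p, q) == dist(q, p)
        if abs(d - dist(q, p)) > tol:
            return False
        # positive definiteness: dist >= 0, and zero exactly when p == q
        if d < 0 or (d <= tol) != (p == q):
            return False
    for p, q, s in itertools.product(points, repeat=3):
        # triangle inequality: dist(p, s) <= dist(p, q) + dist(q, s)
        if dist(p, s) > dist(p, q) + dist(q, s) + tol:
            return False
    return True


def squared_euclidean(p, q):
    """Squared Euclidean distance, used here only as a counterexample."""
    return math.dist(p, q) ** 2


points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (-2.0, 5.0)]

euclid_ok = is_metric_on(points, math.dist)           # all three properties hold
squared_ok = is_metric_on(points, squared_euclidean)  # triangle inequality fails
```

For the squared version the failing triple is (0, 0), (1, 1), (3, 4): the direct value is 25, while the detour through (1, 1) gives 2 + 13 = 15.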

EUCLIDEAN DISTANCE

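$$\mathrm{dist}(p, q) = \sqrt{\sum_{d=1}^{D} (p_d - q_d)^2}$$

where $D$ is the number of dimensions (attributes) and $p_d$ and $q_d$ are, respectively, the d-th attributes (components) of data objects p and q. Standardization or rescaling is necessary if the scales of the attributes differ.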
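
A minimal sketch of why rescaling matters, assuming z-score standardization (subtract the attribute mean, divide by its standard deviation) as the rescaling method; the notes do not prescribe one. The function names and the toy rows (height in metres, income in currency units) are hypothetical.

```python
import math
import statistics


def euclidean(p, q):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((pd - qd) ** 2 for pd, qd in zip(p, q)))


def standardize(rows):
    """Z-score each attribute (column) using its mean and population std."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [
        tuple((x - m) / s for x, m, s in zip(row, means, stds))
        for row in rows
    ]


# Hypothetical data: (height in m, income); the income scale is far larger.
rows = [(1.60, 30000.0), (1.90, 30500.0), (1.62, 42000.0)]

# On raw data the income attribute dominates: the large height gap
# between rows 0 and 1 contributes almost nothing to the distance.
raw_01 = euclidean(rows[0], rows[1])
raw_02 = euclidean(rows[0], rows[2])

# After standardization both attributes contribute on a comparable scale.
standardized = standardize(rows)
std_01 = euclidean(standardized[0], standardized[1])
std_02 = euclidean(standardized[0], standardized[2])
```

On the raw rows, `raw_02` is many times larger than `raw_01` purely because of the income scale; after standardization the two distances are of similar size.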

MINKOWSKI DISTANCE

A generalization of the Euclidean distance:

$$\mathrm{dist}(p, q) = \left( \sum_{d=1}^{D} |p_d - q_d|^r \right)^{1/r}$$

where $D$ is the number of dimensions (attributes) and $p_d$ and $q_d$ are, respectively, the d-th attributes (components) of data objects p and q. Standardization or rescaling is necessary if the scales of the attributes differ. $r$ is a parameter chosen depending on the data set and the application.

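Special cases:

  • $r = 1$: Manhattan distance ($L_1$ norm); best at discriminating between zero distance and near-zero distance

  • $r = 2$: Euclidean distance ($L_2$ norm)

  • $r \to \infty$: Chebyshev distance (supremum, $L_{\max}$ norm, $L_\infty$ norm); considers only the feature with the maximum difference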
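
The three cases can be reproduced with a single function. This is a sketch under the usual definition of the Minkowski distance; the name `minkowski` and the convention of passing `math.inf` for the limiting case are assumptions made for illustration.

```python
import math


def minkowski(p, q, r):
    """Minkowski distance with parameter r (r >= 1, or math.inf)."""
    diffs = [abs(pd - qd) for pd, qd in zip(p, q)]
    if math.isinf(r):
        # Limiting case: only the largest per-attribute difference counts.
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1.0 / r)


p, q = (0.0, 0.0), (3.0, 4.0)

manhattan = minkowski(p, q, 1)         # |3| + |4|
euclid = minkowski(p, q, 2)            # sqrt(3^2 + 4^2)
chebyshev = minkowski(p, q, math.inf)  # max(|3|, |4|)

# A large finite r is already very close to the Chebyshev value.
large_r = minkowski(p, q, 50)
```

For these two points the per-attribute differences are 3 and 4, so the results are 7, 5 and 4 respectively, and increasing r moves the value from the sum of the differences towards the largest single difference.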

MAHALANOBIS DISTANCE

The Mahalanobis distance between two points p and q decreases if, keeping the same Euclidean distance, the segment connecting the points lies along a direction of greater variation of the data. The distribution of the data is described by the covariance matrix of the data set.
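
The effect can be illustrated in two dimensions. The sketch below assumes the common form $\sqrt{(p - q)^T \Sigma^{-1} (p - q)}$, where $\Sigma$ is the sample covariance matrix (divisor n - 1); some texts omit the square root. The function names and the toy data set, which varies much more along x than along y, are hypothetical.

```python
import math


def covariance_2d(data):
    """Sample covariance of 2-D points, returned as (sxx, sxy, syy)."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_2d(p, q, cov):
    """sqrt((p - q)^T Sigma^-1 (p - q)) using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - q[0], p[1] - q[1]
    # Sigma^-1 = (1 / det) * [[syy, -sxy], [-sxy, sxx]]
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


# Hypothetical data: wide spread along x, narrow spread along y.
data = [(-4.0, 0.0), (4.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
cov = covariance_2d(data)

origin = (0.0, 0.0)
# Both segments have Euclidean length 2, but different orientations.
along_x = mahalanobis_2d(origin, (2.0, 0.0), cov)  # direction of high variance
along_y = mahalanobis_2d(origin, (0.0, 2.0), cov)  # direction of low variance
```

Here `along_x` is smaller than `along_y`: the same Euclidean displacement counts for less when it follows the direction in which the data already vary the most.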
