SIMILARITY AND DISSIMILARITY

SIMILARITY

numerical measure of how alike two data objects are; higher when objects are more alike (often falls in the range [0, 1])

DISSIMILARITY

numerical measure of how different two data objects are; lower when objects are more alike (minimum value is often 0; upper bound varies)

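ATTRIBUTE TYPE    DISSIMILARITY                        SIMILARITY
NOMINAL           d = 0 if x = y, d = 1 if x ≠ y       s = 1 if x = y, s = 0 if x ≠ y
ORDINAL
INTERVAL

(x and y denote the values of the attribute for the two objects being compared)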
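
The per-attribute measures can be sketched in code. Only the nominal row survives in the table above; the ordinal and interval forms below are the usual textbook definitions and are an assumption here, not something these notes state. All function names are illustrative.

```python
def nominal_dissimilarity(x, y):
    """Nominal attribute: 0 if the values are equal, 1 otherwise."""
    return 0 if x == y else 1


def nominal_similarity(x, y):
    """Nominal attribute: 1 if the values are equal, 0 otherwise."""
    return 1 - nominal_dissimilarity(x, y)


def ordinal_dissimilarity(x, y, n):
    """Assumed form: values are ranks 0..n-1, distance scaled to [0, 1]."""
    if n < 2:
        raise ValueError("an ordinal attribute needs at least two levels")
    return abs(x - y) / (n - 1)


def interval_dissimilarity(x, y):
    """Assumed form: absolute difference between the two values."""
    return abs(x - y)


def interval_similarity(x, y):
    """One common choice (assumed): s = 1 / (1 + d), which lies in (0, 1]."""
    return 1 / (1 + interval_dissimilarity(x, y))


print(nominal_dissimilarity("red", "blue"))  # different categories
print(ordinal_dissimilarity(0, 2, 3))        # lowest vs highest of 3 levels
print(interval_similarity(1.0, 2.0))         # values one unit apart
```

For ordinal attributes a similarity is usually taken as 1 minus the scaled dissimilarity; for interval attributes several conversions are in use, and 1 / (1 + d) is only one of them.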

PROPERTIES OF SIMILARITY

SIMILARITY BETWEEN VECTORS

SIMPLE MATCHING COEFFICIENT

the ratio between the number of matching attributes (both 1-1 and 0-0 matches) and the total number of attributes
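
A minimal sketch of the simple matching coefficient, assuming two equal-length vectors of binary attributes. The function name is illustrative.

```python
def simple_matching_coefficient(x, y):
    """Fraction of positions where the two vectors hold the same value."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if len(x) == 0:
        raise ValueError("vectors must not be empty")
    matches = sum(1 for a, b in zip(x, y) if a == b)
    return matches / len(x)


x = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]
print(simple_matching_coefficient(x, y))  # 7 matches out of 10 attributes
```

Note that 0-0 matches count here, so two sparse vectors score high even when they share no 1s.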

JACCARD COEFFICIENT

the ratio between the number of 1-1 matches and the number of attributes that are not both zero
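
A minimal sketch of the Jaccard coefficient for binary vectors: 1-1 matches divided by the number of positions where at least one vector is non-zero. Returning 0.0 when both vectors are all zeros is a convention assumed here; the function name is illustrative.

```python
def jaccard_coefficient(x, y):
    """1-1 matches divided by the count of positions that are not both zero."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    both_one = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    not_both_zero = sum(1 for a, b in zip(x, y) if a != 0 or b != 0)
    if not_both_zero == 0:
        return 0.0  # assumed convention: undefined case mapped to 0
    return both_one / not_both_zero


x = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]
print(jaccard_coefficient(x, y))  # no shared 1s, three non-zero positions
```

Unlike simple matching, 0-0 matches are ignored, which suits sparse (asymmetric) binary data.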

COSINE COEFFICIENT

the cosine of the angle between the two vectors
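
A minimal sketch of the cosine coefficient, computed as the dot product divided by the product of the Euclidean norms. The function name and the sample vectors are illustrative.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between x and y: (x . y) / (||x|| * ||y||)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_x * norm_y)


d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(cosine_similarity(d1, d2), 4))  # dot = 5, norms sqrt(42) and sqrt(6)
```

The result depends only on direction, not on vector length, so scaling either vector leaves it unchanged.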

EXTENDED JACCARD COEFFICIENT (TANIMOTO)

the Jaccard coefficient extended to continuous attributes
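
These notes do not give the formula, so the sketch below assumes the commonly used definition T(x, y) = (x . y) / (||x||^2 + ||y||^2 - x . y). The function name is illustrative.

```python
def tanimoto_coefficient(x, y):
    """Assumed definition: (x . y) / (||x||^2 + ||y||^2 - x . y)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    denominator = sum(a * a for a in x) + sum(b * b for b in y) - dot
    if denominator == 0:
        raise ValueError("coefficient is undefined when both vectors are zero")
    return dot / denominator


# On binary vectors the value coincides with the Jaccard coefficient:
# one 1-1 match over three positions that are not both zero.
print(tanimoto_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
# Works on continuous values as well.
print(tanimoto_coefficient([0.5, 2.0], [1.0, 1.5]))
```

Under this definition the coefficient is 1 for identical non-zero vectors and 0 for orthogonal ones.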
