DBSCAN (DENSITY BASED SPATIAL CLUSTERING OF APPLICATION WITH NOISE)
DEFINITIONS
Define as the radius of an hypersphere and a threshold value
-
CORE
a point with points inside is hypersphere of radius
-
BORDER
a point with points inside is hypersphere of radius
-
NEIGHBORHOOD
two points are in neighborhood with each other when they are inside each hypersphere
-
DIRECT DENSITY REACHABILITY
a point is in direct density reachable with a point when is core and is in neighborhood
-
DENSITY REACHABILITY
a point is in density reachable with a point when and are connected by a series of direct density reachable points
-
DENSITY CONNECTION
a point is density connected to point if there is a point such that and are density reachable from
ALGORITHM
Algorithm 1 DBSCAN
Require: SetOfPoints: UNCLASSIfIED points
Require: Eps, MinPts
ClusterId <- nextId(NOISE);
for i = 1 to SetOfPoints.size do
Point <- SetOfPoints.get(i)
if Point.ClId = UNCLASSIfIED then
if ExpandCluster(SetOfPoints, Point, ClusterId, Eps, MinPts) then
ClusterId <- nextId(ClusterId)
Ensure: SetOfPoints
ExpandCluster(SetOfPoints, Point, ClId, Eps, MinPts) : Boolean;
seeds:=SetOfPoints.regionQuery(Point,Eps);
If seeds.size<MinPts THEN # no core point
SetOfPoint.changeClId(Point,NOISE);
RETURN False;
ELSE
# all points in seeds are density-reachable from Point
SetOfPoints.changeClIds(seeds,ClId);
seeds.delete(Point);
WHILE seeds <> Empty DO
currentP := seeds.first();
result := SetOfPoints.regionQuery(currentP,Eps);
If result.size >= MinPts THEN
For i FROM 1 TO result.size DO
resultP := result.get(i);
If resultP.ClId IN {UNCLASSIfIED, NOISE} THEN
If resultP.ClId = UNCLASSIfIED THEN seeds.append(resultP);
END If;
SetOfPoints.changeClId(resultP,ClId);
END If; # UNCLASSIfIED or NOISE
END For;
END If; # result.size >= MinPts
seeds.delete(currentP);
END WHILE; # seeds <> Empty
RETURN True;
END If
END; # ExpandCluster
PARAMETERS TO TUNE
and are the parameter that need to be tuned, a good value for can be where is the number of dimensions