DBSCAN

Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a density-based clustering algorithm. It assigns items to clusters according to the density of neighboring items rather than their distance to a centroid or other single reference point.


It requires two parameters:

  • minPts: the minimum number of samples that must lie close together for an area to be considered high-density.
  • eps (ε): the maximum distance at which two data points are considered to be in the same neighborhood.
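
How the two parameters interact can be seen in a minimal pure-Python sketch of the algorithm. The names `eps` (the distance), `min_pts` (the minimum number of samples), `region_query`, and the sample points are assumptions chosen for this illustration; the sketch also assumes Euclidean distance and counts a point as part of its own neighborhood. It uses a brute-force neighbor search, so it is meant for reading rather than for large datasets.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)      # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1                   # points[i] is a core point: start a cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels

# Two tight groups of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point at (5, 5) has no neighbors within `eps`, so it is labeled as noise instead of being forced into a cluster.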

Advantages:

  • DBSCAN is good at handling outliers and is robust to noise.
  • It can create arbitrarily shaped clusters (non-linear and oddly shaped data).
  • It can work well even if the shape and number of clusters are unknown.
  • It can separate dense clusters from sparser surrounding regions, provided the clusters themselves have similar densities.

Disadvantages:

  • It’s poor at handling lower-density data and at detecting meaningful clusters in it; OPTICS is suggested as an alternative.
  • It’s poor at identifying clusters of varying density; HDBSCAN is suggested as an alternative.
  • It requires fine-tuning of its initial parameters to work well.
  • It's not good at dealing with high-dimensional data.
  • It's more complex and computationally expensive compared to simpler models such as K-Means.
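
The parameter sensitivity noted above is often approached with a k-distance heuristic: compute each point's distance to its k-th nearest neighbor, sort those distances, and look for a sharp jump as a candidate for the distance parameter. The sketch below is one possible illustration of that idea, not a definitive tuning procedure; the function name `k_distances`, the choice of `k`, and the sample points are assumptions made for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+

def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest other point."""
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        d = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(d[k - 1])
    return sorted(result)

# Two tight groups of four points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
kd = k_distances(points, k=2)
for value in kd:
    print(round(value, 3))
```

In this toy data the eight clustered points share a 2nd-nearest-neighbor distance of 1.0, while the isolated point's value is above 6. A distance threshold anywhere in that gap would keep the two groups intact and leave the isolated point as noise.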