Clustering is the task of grouping unlabeled data items into clusters based on how similar the items are to one another.
Clustering Methods
Centroid-based Clustering: A non-hierarchical clustering method in which a centroid is defined for each of a specified number of clusters, and each data item is assigned to the cluster whose centroid is nearest.
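The centroid idea can be sketched with a small k-means loop. k-means is one common centroid-based algorithm, chosen here for illustration; the function names (`assign`, `update`, `kmeans`), the sample points and the starting centroids are all assumptions made for this example.

```python
import math

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            new_centroids.append(old)  # empty cluster: leave its centroid in place
        else:
            new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return new_centroids

def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
# labels == [0, 0, 0, 1, 1, 1]
```

The number of clusters is fixed up front by the number of starting centroids, which is the defining trait of this family of methods.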
Distribution-based Clustering: This method assumes the data is drawn from a mixture of probability distributions; the farther an item lies from a distribution's center, the lower the probability that it belongs to that distribution.
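As a rough illustration of how distance from a distribution's center turns into a membership probability, the sketch below assumes two one-dimensional Gaussian distributions whose weights, means and standard deviations are already known. Those parameters, and the names `gaussian_pdf` and `membership`, are assumptions for this example; in practice the parameters would have to be estimated from the data.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

def membership(x, components):
    """Probability that x belongs to each component.

    components is a list of (weight, mean, std) tuples.
    """
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    return [v / total for v in weighted]

# Hypothetical mixture: two equally weighted Gaussians centered at 0 and 5.
components = [(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)]

near_first = membership(0.5, components)  # close to the first center
midpoint = membership(2.5, components)    # equally far from both centers
```

A point near the first center gets a probability close to 1 for the first distribution, while a point halfway between the two centers is split evenly between them.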
Density Method: It identifies regions with a high concentration of data points and groups the points in each region together, assuming that they are more similar to one another than to points in lower-density regions.
This method can take advantage of Kernel Density Estimation (KDE), a technique for estimating the Probability Density Function (PDF) of the data, to approximate the underlying distribution.
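A minimal sketch of KDE in one dimension, assuming a Gaussian kernel and a hand-picked bandwidth. The function name `kde_1d`, the sample values and the bandwidth of 0.3 are illustrative assumptions, not values taken from the text.

```python
import math

def kde_1d(samples, bandwidth):
    """Return a function that estimates the density at x from 1-D samples."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Each sample contributes a Gaussian bump centered on itself.
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density

# Hypothetical samples: one group near 1 and another near 5.
samples = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9]
density = kde_1d(samples, bandwidth=0.3)

dense = density(1.0)   # inside a crowded region
sparse = density(3.0)  # in the gap between the two groups
```

The estimated density is high where samples crowd together and close to zero in the gap, which is the signal a density-based method uses to separate groups.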
✔️ This method generally has good accuracy
✔️ It can merge clusters
✔️ It finds arbitrarily shaped clusters in dense areas
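To show how a density-based method can trace an arbitrarily shaped cluster, here is a compact sketch in the style of DBSCAN. DBSCAN is one widely used density-based algorithm; the notes above do not name it, so treat it as an illustrative choice. The names `region_query` and `dbscan`, the parameters `eps` and `min_pts`, and the sample points are assumptions for this example.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or NOISE if it is isolated."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense itself
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, so keep expanding
    return labels

# Hypothetical data: an L-shaped chain, a small compact group, and one outlier.
points = [
    (0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2),
    (10, 10), (10, 11), (11, 10),
    (20, 0),
]
labels = dbscan(points, eps=1.5, min_pts=3)
# labels == [0, 0, 0, 0, 0, 0, 1, 1, 1, -1]
```

The L-shaped chain ends up as a single cluster because each dense neighborhood links to the next, and the isolated point is left as noise rather than forced into a cluster.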