tags:
  - AI/Tasks/Clustering
  - AI/ML/UnSupervisedLearning

Clustering

Clustering is a set of techniques for partitioning unlabeled data into a number of groups, or clusters, based on the similarity of the data items.
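To make the definition concrete, here is a minimal sketch of one such technique, plain K-Means (Lloyd's algorithm), in pure Python. It assumes Euclidean distance as the similarity measure and, for determinism, naively takes the first `k` points as initial centroids. The function name `kmeans` and the toy points are illustrative only, not part of any library.

```python
import math

def kmeans(points, k, iterations=20):
    """Partition points into k clusters with Lloyd's algorithm (plain K-Means)."""
    # Assumption: the first k points serve as initial centroids (deterministic but naive).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Toy, made-up 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.1, 7.9), (7.8, 8.2)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 0, 1, 1]
```

No labels are supplied at any point; the grouping emerges only from distances between items, which is what makes this an unsupervised task. Real implementations typically use smarter initialisation (for example K-Means++) and a convergence check instead of a fixed iteration count.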

Info

Clustering is an Unsupervised Learning task.

Clustering helps identify two qualities of data:

Meaningfulness: Identifying groups of similar items and examining the relations among them can expand domain knowledge. E.g. discovering relationships among the features of patients in a cluster with high stroke risk may point to causes or symptoms.
Usefulness: Finding patterns among the items in a cluster allows that cluster to be targeted with specific actions. E.g. targeted advertisements for specific groups of customers.
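Both qualities usually come from inspecting what the members of each cluster have in common. A minimal sketch of such a per-cluster profile follows, assuming cluster labels have already been produced by some clustering algorithm. The helper `profile_clusters`, the feature names, and the customer rows are all hypothetical and made up for illustration.

```python
from collections import defaultdict

def profile_clusters(rows, labels, feature_names):
    """Return the mean of every feature, computed separately for each cluster."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: {
            name: sum(r[i] for r in members) / len(members)
            for i, name in enumerate(feature_names)
        }
        for label, members in grouped.items()
    }

# Toy, made-up customer data: (age, monthly_spend).
rows = [(22, 40.0), (25, 60.0), (61, 200.0), (59, 220.0)]
# Assumed output of a clustering step; not computed here.
labels = [0, 0, 1, 1]

profiles = profile_clusters(rows, labels, ["age", "monthly_spend"])
print(profiles[0])  # {'age': 23.5, 'monthly_spend': 50.0}
print(profiles[1])  # {'age': 60.0, 'monthly_spend': 210.0}
```

A summary like this is only a starting point: differences between cluster profiles suggest relationships worth investigating, or segments worth targeting, but they do not establish causes on their own.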

Clustering Methods:

Algorithms:

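k-Nearest Neighbors (KNN) (note: a supervised classification/regression method rather than a clustering algorithm; often confused with K-Means)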
K-Means
DBSCAN
K-Means++
K-Medoids
Mini-Batch K-Means
Gaussian Mixture Model (GMM)
OPTICS
Mean-Shift
BIRCH
CURE
Agglomerative Hierarchical Clustering (AHC)
DIvisive ANAlysis Clustering (DIANA)
CLARANS
CLIQUE
STING
Spectral Clustering
Affinity Propagation

Resources:

Google Machine Learning: Clustering