Clustering is a set of technique used for partitioning groups of unlabeled data into a number of groups, or clusters based on similarity of data items.


Clustering is an Unsupervised Machine Learning task.

Clustering helps identify two qualities of data:

  • Meaningfulness: By identifying groups of similar items and using it to extract similar relations, they can expand domain knowledge. E.g. discovering relationship in features related to patients in cluster with high stroke risk, indicating causes or symptoms.
  • Usefulness: By finding patterns of items in a cluster, clusters can be targeted for specific actions. E.g. Targeted advertisements for specific groups of customers.

Clustering Methods: