tags: [AI/Tasks/Clustering, AI/Algorithms/K-Means]
K-Means is an unsupervised machine learning algorithm for clustering, which groups unlabeled data into K clusters.
K-Means is trained on a dataset of unlabeled data points.
K-Means assumes that the closer two data points are, the more similar they are. Euclidean distance is often used to measure this closeness.
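As a small illustration of the distance measure mentioned above, here is a minimal sketch of Euclidean distance between two points. The function name `euclidean_distance` and the tuple representation of points are assumptions made for this example only.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

A smaller value means the two points are treated as more similar.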
Algorithm:
Select the number of clusters, K.
Initialize the K cluster centroids.
Repeat until convergence (when the centroids do not move between two consecutive iterations) or until a maximum number of iterations has been reached:
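Measure the distance between each data point and each centroid.
Assign each data item to the cluster of its nearest centroid.
Identify the center point of each cluster by averaging the data items assigned to it.
ℹ️ This will move each cluster centroid to the center of its cluster.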
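The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `k_means`, its parameters, the choice to initialize centroids by sampling K data points at random, and the handling of empty clusters are all assumptions made for this example.

```python
import math
import random


def k_means(points, k, max_iters=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into `k` groups."""
    rng = random.Random(seed)
    # Assumption: initialize centroids as k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point goes to the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Convergence: stop when no centroid moved in this iteration.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(data, k=2)
print(centroids, labels)
```

On this toy dataset the first three points and the last three points should end up in separate clusters. Which cluster gets index 0 depends on the random initialization.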
Advantages:
Disadvantages:
Applications:
To identify the optimum number of clusters, the following methods are used:
Resources: