K-Means groups datapoints into distinct groups, called clusters. Each cluster is defined by its centroid . We optimize the clusters using ๐ Expectation Maximization where we treat the cluster assignment as our hidden state .
We implicitly minimize the loss function
where when belongs in cluster ; in other words, we minimize the distance between each point and the centroid of the cluster it belongs to. This can also be interpreted as reconstruction loss if we approximate points by their centroid.
Info
For a softer clustering, we can turn to ๐ผ Gaussian Mixture Models that model the probability of belonging to cluster rather than a simple or .
E-step finds the best for each point, assigning each point to its closest centroid. M-step finds the best for each cluster, recalculating the centroid to be the average of the clusterโs points.