I like to think of Gaussian Mixture Models as a more generalized version of KMeans.
While we have covered all the necessary conceptual and practical details here: Gaussian Mixture Models…
…let me tell you about some limitations of KMeans that you might not be aware of.
Limitations of KMeans
To begin:
It can only produce globular clusters. For instance, as shown below, even when the underlying data has non-circular clusters, it still produces round ones.
It performs a hard assignment. There are no probabilistic estimates of each data point belonging to each cluster.
It only relies on distance-based measures to assign data points to clusters.
To understand this better, consider two clusters in 2D, A and B, where cluster A has a higher spread than B.
Now consider the line that lies midway between the centroids of A and B.
Although A has a higher spread, a point even slightly to the right of the midline will get assigned to cluster B, simply because it is closer to B's centroid.
Ideally, however, cluster A should have a larger area of influence.
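To see this concretely, here is a minimal sketch of the scenario above. The centers, spreads, and the test point are hypothetical values chosen to make the contrast obvious:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Two hypothetical clusters: A is wide (std = 3), B is tight (std = 0.5)
A = rng.normal(loc=[0.0, 0.0], scale=3.0, size=(500, 2))
B = rng.normal(loc=[6.0, 0.0], scale=0.5, size=(500, 2))
X = np.vstack([A, B])

# A point slightly to the right of the midline between the true centers (x = 3)
point = np.array([[3.3, 0.0]])

# KMeans performs a hard, purely distance-based assignment
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("KMeans label:", km.predict(point))

# A GMM produces soft, likelihood-based posterior probabilities
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print("GMM posteriors:", gmm.predict_proba(point).round(3))
```

On this toy data, KMeans assigns the point to the nearer centroid (the tight cluster B), while the fitted GMM's posterior should place nearly all of its probability mass on the wide cluster A.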
Gaussian Mixture Models
These limitations often make KMeans a non-ideal choice for clustering.
Gaussian Mixture Models are often a superior algorithm in this respect.
As the name suggests, they model a dataset as a mixture of several Gaussian distributions.
They can be thought of as a more flexible twin of KMeans.
The primary difference is that:
KMeans learns centroids.
Gaussian Mixture Models learn a full distribution for each cluster (a mean, a covariance, and a mixing weight).
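More precisely, a GMM models the overall density of the data as a weighted sum of K Gaussians, and fitting it means learning all the means, covariances, and mixing weights:

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1$$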
For instance, in 2 dimensions:
KMeans can only create circular clusters.
GMMs can create oval-shaped clusters.
This is illustrated in the animation below:
The effectiveness of GMMs over KMeans is evident from the image below.
KMeans relies only on distance and ignores the distribution of each cluster.
GMM learns the distribution of each cluster and produces better clustering.
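Here is a rough sketch of that difference on synthetic data. The three-cluster setup and the stretching transformation below are illustrative assumptions, not taken from the original comparison:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Hypothetical dataset: stretch round blobs with a linear map to get oval clusters
X, _ = make_blobs(n_samples=600, centers=3, random_state=7)
X = X @ np.array([[0.6, -0.6], [-0.4, 0.8]])

# KMeans only learns centroids, so it implicitly assumes round clusters
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# A GMM with a full covariance matrix per component can fit oval clusters
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0).fit(X)
gmm_labels = gmm.predict(X)

# Each component's learned 2x2 covariance describes its shape and orientation
print(gmm.covariances_[0].round(2))
```

Because each component carries its own full covariance matrix, the GMM can wrap elongated, rotated clusters, whereas KMeans can only carve the space into regions around its centroids.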
If you want to get into more detail, I have covered GMMs in depth here: Gaussian Mixture Models.
It covers:
What is the motivation and intuition behind GMMs?
The end-to-end mathematical formulation of GMMs.
How to use Expectation-Maximization to model data using GMMs?
Coding a GMM from scratch (no sklearn).
Comparing results of GMMs with KMeans.
How to determine the optimal number of clusters for GMMs?
Some practical use cases of GMMs.
Takeaways.
👉 Over to you: What are some other shortcomings of KMeans?
Thanks for reading!
Whenever you are ready, here’s one more way I can help you:
Every week, I publish 1-2 deep dives (typically 20+ mins long). Here are some of the latest ones that you will surely like:
[FREE] A Beginner-friendly and Comprehensive Deep Dive on Vector Databases.
A Detailed and Beginner-Friendly Introduction to PyTorch Lightning: The Supercharged PyTorch.
You Are Probably Building Inconsistent Classification Models Without Even Realizing.
Why Sklearn’s Logistic Regression Has no Learning Rate Hyperparameter?
PyTorch Models Are Not Deployment-Friendly! Supercharge Them With TorchScript.
Federated Learning: A Critical Step Towards Privacy-Preserving Machine Learning.
You Cannot Build Large Data Projects Until You Learn Data Version Control!
To receive all full articles and support the Daily Dose of Data Science, consider subscribing:
👉 If you love reading this newsletter, feel free to share it with friends!
👉 Tell the world what makes this newsletter special for you by leaving a review here :)