How Does MiniBatchKMeans Work?

Jul 6, 2024

A step-by-step visual guide to mini-batch KMeans.

1 Comment

Thank you for sharing this algorithm, which I have found is rarely covered on the internet. I like your valuable content. However, I found a mistake in your algorithm: the count and the sum vector should be cumulative and not reset after each iteration. I found this in the original mini-batch k-means paper, published in 2010. You can read it at this link: https://ra.ethz.ch/cdstore/www2010/www/p1177.pdf

If we reset the sum vector in each iteration, the new mini-batch might significantly shift the centroid location computed from the previous mini-batch. The values should therefore be cumulative, so that each additional iteration changes the centroid location only slightly.

BTW, there is also a learning rate term, which is the reciprocal of the count value.
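
To make the point about cumulative counts concrete, here is a minimal sketch of the update as I understand it from the 2010 paper, in plain Python. The function names (`nearest_centroid`, `minibatch_step`, `fit`) and the defaults for `batch_size`, `n_iters`, and `seed` are my own assumptions for illustration; they are not taken from the post or from any library.

```python
import random


def nearest_centroid(x, centroids):
    """Index of the centroid closest to x (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[j])),
    )


def minibatch_step(centroids, counts, batch):
    """Apply one mini-batch update in place; counts persist across calls."""
    # Cache assignments first so every sample in the batch is assigned
    # against the same centroid positions.
    assigned = [nearest_centroid(x, centroids) for x in batch]
    for x, j in zip(batch, assigned):
        counts[j] += 1          # cumulative count, never reset between batches
        eta = 1.0 / counts[j]   # per-centroid learning rate = 1 / count
        centroids[j] = [(1.0 - eta) * c + eta * xi
                        for c, xi in zip(centroids[j], x)]


def fit(points, k, batch_size=16, n_iters=200, seed=0):
    """Hypothetical driver: random initial centroids, then repeated updates."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k            # created once, outside the loop
    for _ in range(n_iters):
        minibatch_step(centroids, counts, rng.sample(points, batch_size))
    return centroids, counts
```

Because `counts` is created once and only ever incremented, the learning rate for a centroid shrinks as it absorbs more samples, so later mini-batches nudge it less and less. With counts starting at zero, each centroid ends up as the running mean of every sample ever assigned to it, which is the same thing as keeping a cumulative sum vector and dividing by the cumulative count.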


Daily Dose of Data Science
