3 Comments
Jul 9, 2023 · Liked by Avi Chawla

Well, of course density-based clustering validation (DBCV) gives a higher score to a density-based clustering algorithm. For this example, we only know that DBCV favors the better clustering because it's obvious from plotting the data. In ten dimensions, how do you know which algorithm and which metric will give the best result?

author

Of course, visualization is infeasible in such cases, so we mostly fall back on dimensionality reduction with techniques like t-SNE. And there are no general guidelines on which algorithm or metric will work better. When labels are missing, one has to rely on intrinsic measures, and in such cases the evaluation is entirely subjective.
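
To make "intrinsic measure" concrete, here is a minimal sketch of the silhouette coefficient computed directly on ten-dimensional points, with no labels and no plotting involved. Everything in it is an illustrative assumption: the toy data (two synthetic Gaussian blobs), the helper name `silhouette`, and the two candidate labellings. In practice a library routine such as `sklearn.metrics.silhouette_score` would be the usual choice; the pure-Python version below only shows what the number measures.

```python
import math
import random


def silhouette(points, labels):
    """Mean silhouette coefficient with Euclidean distance.

    Hypothetical helper for illustration; needs at least two clusters.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Assumed toy data: two well-separated Gaussian blobs in 10 dimensions.
random.seed(0)
dim = 10
blob_a = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(30)]
blob_b = [[random.gauss(8.0, 1.0) for _ in range(dim)] for _ in range(30)]
points = blob_a + blob_b

good_labels = [0] * 30 + [1] * 30        # follows the two blobs
bad_labels = [i % 2 for i in range(60)]  # arbitrary split that ignores them

good_score = silhouette(points, good_labels)
bad_score = silhouette(points, bad_labels)
print(f"blob-aligned labelling: {good_score:.3f}")
print(f"arbitrary labelling:    {bad_score:.3f}")
```

The score can rank candidate labellings without any ground truth, but only under the metric's own notion of a good cluster, which is why the choice of metric remains a judgment call.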


Hi Jean, where did you find out about this disadvantage of the Silhouette score? Which article is it from? I need to cite it in my dissertation.
