What Makes Box Plots a Misleading Choice for Data Analysis?
...and here's how to prevent being misled by them.
Box plots are quite common in data analysis.
Yet, they can be highly misleading at times.
To begin, a box plot is a graphical representation of just five numbers:
min
first quartile
median
third quartile
max
Thus, entirely different distributions that share the same five values will have identical box plots.
This is evident from the image below:
Three different datasets have the same box plots.
Thus, solely looking at a box plot may lead to incorrect or misleading conclusions.
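To check this numerically, here is a minimal sketch using only Python's standard library. The three datasets are hypothetical toy values (not the ones in the image), chosen with 9 points each so that the quartiles fall exactly on data points. The `five_number_summary` helper is an illustrative name, and it assumes the "inclusive" quantile method, which is one of several conventions in use.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers.

    Assumption: quartiles use the 'inclusive' method, which interpolates
    linearly between sorted data points.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

# Hypothetical toy datasets: clearly different shapes, 9 points each.
datasets = {
    "spread_out":   [0, 1, 2, 3, 5, 7, 8, 9, 10],
    "clustered":    [0, 2, 2, 2, 5, 8, 8, 8, 10],
    "center_heavy": [0, 0, 2, 5, 5, 5, 8, 10, 10],
}

summaries = {name: five_number_summary(data) for name, data in datasets.items()}

for name, summary in summaries.items():
    print(f"{name:>12}: {summary}")

# True when every dataset produces the same five numbers.
all_identical = len(set(summaries.values())) == 1
print("Identical five-number summaries:", all_identical)
```

All three datasets print the same five numbers even though the values in between are arranged very differently, so a box plot drawn from those five numbers cannot tell them apart.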
Here, the takeaway is not that box plots should not be used. Instead, it’s similar to what we saw in one of the earlier posts about correlation:
Whenever you generate any summary statistic, you lose essential information.
Thus, it is always important to look at the underlying data distribution.
For instance, whenever I create a box plot, I create a violin (or KDE) plot too. This lets me check whether the summary statistics are consistent with the underlying data distribution.
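One way to set up this side-by-side check is sketched below, assuming Matplotlib and NumPy are available. The `plot_box_and_violin` helper and the synthetic unimodal and bimodal samples are illustrative assumptions, not taken from the image in this post.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_box_and_violin(datasets, labels):
    """Draw a box plot and a violin plot of the same datasets side by side."""
    fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(9, 4), sharey=True)

    ax_box.boxplot(datasets)
    ax_box.set_title("Box plot")

    # The violin plot shows a kernel density estimate for each dataset.
    ax_violin.violinplot(datasets, showmedians=True)
    ax_violin.set_title("Violin plot")

    # Both plot types place the datasets at x = 1, 2, ... by default.
    positions = list(range(1, len(datasets) + 1))
    for ax in (ax_box, ax_violin):
        ax.set_xticks(positions)
        ax.set_xticklabels(labels)

    return fig, (ax_box, ax_violin)

# Synthetic example data (an assumption for illustration only).
rng = np.random.default_rng(0)
unimodal = rng.normal(0.0, 1.0, 500)
bimodal = np.concatenate([rng.normal(-1.0, 0.4, 250), rng.normal(1.0, 0.4, 250)])

fig, axes = plot_box_and_violin([unimodal, bimodal], ["unimodal", "bimodal"])
```

Call `plt.show()` or `fig.savefig(...)` to view the result. The two peaks of the bimodal sample should be visible in the violin panel, whereas the box panel only shows its quartiles and whiskers.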
👉 Over to you: What other measures do you take when using summary statistics?
Thanks for reading!