Exploring and analyzing data is a fundamental aspect of data science.

Here, visualizations play a crucial role in understanding complex patterns and relationships.

They offer a concise way to:

- understand the intricacies of statistical models,
- validate model assumptions,
- evaluate model performance, and much more.

The visual above depicts nine must-know plots in data science.

**KS Plot**: It compares the cumulative distribution function (CDF) of a dataset with that of a theoretical distribution, or the CDFs of two datasets, to assess distributional differences.

**SHAP Plot**: It summarizes how much each feature contributes to a model’s predictions, taking interactions/dependencies between features into account.

**QQ Plot**: It is used to assess the distributional similarity between observed data and a theoretical distribution. Here, we plot the quantiles of the two distributions against each other.

Deviations from the straight line indicate a departure from the assumed distribution.
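To make the KS and QQ ideas concrete, here is a minimal sketch that computes the numbers behind both plots using only the standard library. The helper names `ks_statistic` and `qq_points`, the plotting positions `(i - 0.5) / n`, and the toy data are illustrative assumptions, not something from the original visual.

```python
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so check the gap on both sides of the jump.
        gap = max(gap, i / n - f, f - (i - 1) / n)
    return gap


def qq_points(sample, dist):
    """Pair theoretical quantiles of `dist` with the sorted sample values."""
    xs = sorted(sample)
    n = len(xs)
    # Plotting positions (i - 0.5) / n avoid the infinite quantiles at 0 and 1.
    return [(dist.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]


normal = NormalDist(mu=0.0, sigma=1.0)
# Illustrative data: evenly spaced values, which are uniform rather than normal.
data = [-1.5 + 3.0 * i / 19 for i in range(20)]
print("KS statistic vs N(0, 1):", round(ks_statistic(data, normal.cdf), 3))
for theoretical, observed in qq_points(data, normal)[:3]:
    print(round(theoretical, 3), round(observed, 3))
```

Plotting the `qq_points` pairs as a scatter gives the QQ plot. The KS statistic is the single largest gap you would see between the two CDF curves in a KS plot.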

**Cumulative Explained Variance Plot**: It shows the fraction of total variance retained as more principal components are kept, which helps decide how many dimensions to keep when using PCA. I covered this in a detailed post before: How Many Dimensions Should You Reduce Your Data To When Using PCA?

**Gini-Impurity vs. Entropy**: Both are used to measure the impurity or disorder of a node or split in a decision tree. The plot compares Gini impurity and entropy across different splits, which provides insight into the tradeoff between these measures.
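For the Gini impurity vs. entropy comparison, a small sketch of the two measures for a binary node may help. It assumes a two-class problem where `p` is the fraction of positive samples in the node. The function names are illustrative.

```python
import math


def gini_impurity(p):
    """Gini impurity of a binary node whose positive-class fraction is p."""
    return 1.0 - p ** 2 - (1.0 - p) ** 2


def entropy(p):
    """Shannon entropy (in bits) of the same binary node."""
    # By convention 0 * log2(0) is treated as 0, so skip zero-probability terms.
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)


# Tabulate both measures over a grid of class fractions, as the plot would.
for step in range(0, 11):
    p = step / 10
    print(f"p={p:.1f}  gini={gini_impurity(p):.3f}  entropy={entropy(p):.3f}")
```

Both measures are zero for a pure node and peak at `p = 0.5`, where Gini impurity reaches 0.5 and entropy reaches 1 bit. The curves have a similar shape, which is why the two criteria often lead to similar splits.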

**Bias-Variance Tradeoff**: It is used to find the right balance between the bias and the variance of a model as its complexity changes.

**ROC Curve**: It depicts the trade-off between the true positive rate (TPR) and the false positive rate (FPR) across different classification thresholds.

**Precision-Recall Curve**: It depicts the trade-off between precision and recall across different classification thresholds.

**Elbow Curve**: The plot helps identify the optimal number of clusters for the k-means algorithm.
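For the ROC and precision-recall entries above, the sketch below shows how the points on both curves come from sweeping one classification threshold over the model's scores. The function name `threshold_sweep`, the labels, and the scores are all made up for illustration. In practice a library routine would typically be used instead.

```python
def threshold_sweep(y_true, scores):
    """Return (threshold, tpr, fpr, precision) for each distinct score, high to low."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    rows = []
    for t in sorted(set(scores), reverse=True):
        # Predict positive whenever the score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        tpr = tp / positives          # recall is the same quantity
        fpr = fp / negatives
        precision = tp / (tp + fp)    # tp + fp >= 1 because t is one of the scores
        rows.append((t, tpr, fpr, precision))
    return rows


# Made-up labels and scores, purely for illustration.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
for t, tpr, fpr, precision in threshold_sweep(labels, scores):
    print(f"t={t:.1f}  TPR={tpr:.2f}  FPR={fpr:.2f}  precision={precision:.2f}")
```

Plotting TPR against FPR gives the ROC curve. Plotting precision against TPR (which is recall) gives the precision-recall curve.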

Over to you: What other plots would you include here?



Thanks a lot, this is really very useful.

This summary is really awesome, so many useful ways to understand ML!

However, I would advise against the elbow method, as many articles have shown how wrong it can be. Here is a link to an excellent recent article, but there are many more:

https://towardsdatascience.com/are-you-still-using-the-elbow-method-5d271b3063bd

Thanks again for your excellent work 😊