Become A Bilingual Data Scientist With These Pandas to SQL Translations
Most common Pandas operations and their SQL translations in one frame.
SQL and Pandas are both powerful tools for data scientists to work with data.
Together, SQL and Pandas can be used to clean, transform, and analyze large datasets, and to create complex data pipelines and models.
Proficiency in both tools is therefore extremely valuable to data scientists.
This visual depicts a few common operations in Pandas and their corresponding translations in SQL.
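To make a few of these translations concrete, here is a minimal sketch that runs the same operation in both tools and compares the results. The `employees` table, its columns, and its values are made up for illustration, and an in-memory SQLite database stands in for whatever SQL engine you actually use.

```python
import sqlite3

import pandas as pd

# Hypothetical toy data; table and column names are assumptions for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "dept": ["eng", "eng", "ops", "ops"],
    "salary": [120, 100, 90, 80],
})

# Load the same rows into an in-memory SQLite table so both sides see identical data.
conn = sqlite3.connect(":memory:")
df.to_sql("employees", conn, index=False)

# Filtering: boolean mask  <->  WHERE
pandas_filtered = df[df["salary"] > 90]
sql_filtered = pd.read_sql_query(
    "SELECT * FROM employees WHERE salary > 90", conn
)

# Aggregation: groupby + mean  <->  GROUP BY + AVG
pandas_grouped = df.groupby("dept", as_index=False)["salary"].mean()
sql_grouped = pd.read_sql_query(
    "SELECT dept, AVG(salary) AS salary FROM employees GROUP BY dept ORDER BY dept",
    conn,
)

# Top-N: sort_values + head  <->  ORDER BY + LIMIT
pandas_top = df.sort_values("salary", ascending=False).head(2)
sql_top = pd.read_sql_query(
    "SELECT * FROM employees ORDER BY salary DESC LIMIT 2", conn
)
```

Each pair should hold the same rows. The SQL side needs an explicit ORDER BY to guarantee row order, whereas `groupby` sorts the group keys by default.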
I have a detailed blog on Pandas to SQL translations with many more examples. Read it here: Pandas to SQL blog.
Over to you: What other Pandas to SQL translations would you include here?
Find the code for my tips here: GitHub.
I like to explore, experiment and write about data science concepts and tools. You can read my articles on Medium. Also, you can connect with me on LinkedIn and Twitter.
Thanks Avi, but one point: df.shape returns both the row count and the column count, while COUNT(*) only returns the number of rows in a table, so it's not the full equivalent of df.shape.
For the number of columns one can use:
SELECT COUNT(*)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_catalog = 'database_name' -- the database (in MySQL, filter on table_schema instead)
  AND table_name = 'table_name';
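
As a rough sketch of putting the two counts together into a df.shape-style tuple, the snippet below uses Python's built-in sqlite3 module. SQLite has no INFORMATION_SCHEMA, so this example swaps in `PRAGMA table_info` for the column count. The `employees` table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical table, created only so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "eng", 120), ("Bob", "eng", 100), ("Cid", "ops", 90), ("Dee", "ops", 80)],
)

# Row count: the COUNT(*) half of df.shape.
n_rows = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

# Column count: PRAGMA table_info returns one row per column of the table.
# (SQLite-specific stand-in for the INFORMATION_SCHEMA.COLUMNS query.)
n_cols = len(conn.execute("PRAGMA table_info(employees)").fetchall())

# Same (rows, columns) layout as df.shape.
shape = (n_rows, n_cols)
print(shape)
```

On engines that do expose INFORMATION_SCHEMA, the column count would come from the query above instead of the PRAGMA.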