Generate Your Own Fake Data In Seconds

Nov 14, 2022

To run or test a data pipeline, we usually need to feed it some dummy data.

Python's built-in random module can generate random strings, floats, and integers. Being purely random, though, its output is not meaningful data such as people's names, city names, or email addresses.
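To make the limitation concrete, here is a minimal sketch that builds a record using only the standard library. The field names (name, age, balance) and the value ranges are made up for illustration.

```python
import random
import string

random.seed(42)  # fixed seed so repeated runs produce the same record


def random_record():
    """Build one dummy record using only the standard library."""
    return {
        # Eight random lowercase letters: a valid string, but not a real name.
        "name": "".join(random.choices(string.ascii_lowercase, k=8)),
        "age": random.randint(18, 90),
        "balance": round(random.uniform(0, 10_000), 2),
    }


record = random_record()
print(record)
```

The numeric fields are usable as they are, but the "name" is just a jumble of letters, which is the gap described above.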

Searching for open-source datasets instead can be time-consuming. Moreover, the dataset you find may not fit your requirements well.

The Faker library for Python is a good solution to this. Faker lets you quickly generate highly customized fake (yet meaningful) data. What's more, you can generate data specific to a demographic by choosing a locale, so that names and addresses look typical of a given country or language.

Daily Dose of Data Science
