Imagine you have a classification dataset. If you use PCA to reduce dimensions, you implicitly assume that the structure in your data is linear, i.e., that the classes can be separated along linear directions.
But that is not always the case, and PCA will fail on such data.
If you wish to read how PCA works, I would highly recommend reading one of my previous posts: A Visual and Overly Simplified Guide to PCA.
To resolve this, we use the kernel trick (KernelPCA). The idea is to:
Project the data to another space using a kernel function, where the data becomes linearly separable.
Apply the standard PCA algorithm to the transformed data.
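The two steps above can be sketched in plain NumPy. This is a minimal illustration, not Sklearn's implementation: the function name rbf_kernel_pca, the RBF kernel choice, the gamma value, and the toy two-ring dataset are all assumptions made for this example. Note that the projection is never built explicitly; the kernel matrix stands in for it.

```python
import numpy as np

def rbf_kernel_pca(X, gamma, n_components):
    """Hypothetical helper: kernel PCA with an RBF kernel on the training data."""
    # Step 1: implicit projection via the kernel matrix K[i, j] = exp(-gamma * ||xi - xj||^2)
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * X @ X.T
    K = np.exp(-gamma * sq_dists)

    # Center the kernel matrix (equivalent to mean-centering in the projected space)
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # Step 2: standard PCA machinery, i.e. an eigendecomposition
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[top], eigvecs[:, top]

    # Coordinates of the training points along the top components
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

# Toy data (assumed for illustration): two concentric rings of radius 1 and 3
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 200)
radii = np.where(np.arange(200) < 100, 1.0, 3.0)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

Z = rbf_kernel_pca(X, gamma=0.5, n_components=2)
print(Z.shape)
```

How well the rings separate in Z depends on the kernel and on gamma, so both are worth tuning on your own data.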
For instance, in the image below, the original data is linearly inseparable. Using PCA directly does not produce any desirable results.
But as mentioned above, KernelPCA first transforms the data to a linearly separable space and then applies PCA, resulting in a linearly separable dataset.
Sklearn provides a KernelPCA wrapper, supporting many popularly used kernel functions. You can find more details here: Sklearn Docs.
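Here is a short sketch of how the Sklearn wrapper might be used, and one rough way to check whether the result is closer to linearly separable: fit a linear classifier on the reduced features and compare accuracies. The dataset (make_circles), the RBF kernel, gamma=10, and the helper name linear_accuracy are assumptions chosen for this example, not settings from the post.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression

# Two concentric circles: a classic linearly inseparable dataset
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Standard PCA vs. KernelPCA with an RBF kernel (gamma is an assumed value)
X_pca = PCA(n_components=2).fit_transform(X)
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

def linear_accuracy(features, labels):
    # Training accuracy of a linear classifier, used here as a rough separability check
    clf = LogisticRegression().fit(features, labels)
    return clf.score(features, labels)

acc_pca = linear_accuracy(X_pca, y)
acc_kpca = linear_accuracy(X_kpca, y)
print(f"Linear accuracy after PCA:       {acc_pca:.2f}")
print(f"Linear accuracy after KernelPCA: {acc_kpca:.2f}")
```

A linear model scoring much higher on the KernelPCA features than on the PCA features suggests the kernel made the classes easier to separate linearly. For a fairer estimate, score on a held-out split instead of the training data.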
Having said that, it is also worth noting that the run-time of PCA is cubic in relation to the number of dimensions of the data.
When we use a KernelPCA, typically, the original data (in n dimensions) is projected to a new higher dimensional space (in m dimensions; m>n). Therefore, it increases the overall run-time of PCA.
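One practical detail to keep in mind: implementations such as Sklearn's do not build the higher dimensional space explicitly. They eigendecompose the kernel matrix instead, which has one row and one column per sample, so the cost grows with the number of samples and not with the number of original features. The sketch below only illustrates the size of that matrix; the sample counts and gamma are arbitrary values picked for this example.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)

# Same number of features (3), different numbers of samples
X_small = rng.normal(size=(100, 3))
X_large = rng.normal(size=(500, 3))

# The kernel matrix is (n_samples x n_samples), regardless of the feature count
K_small = rbf_kernel(X_small, gamma=1.0)
K_large = rbf_kernel(X_large, gamma=1.0)

print(K_small.shape, K_large.shape)
```

With 5x more samples, the kernel matrix has 25x more entries, which is why KernelPCA can become expensive on large datasets.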
Over to you: What are some other limitations of PCA that you know of? Let me know :)
Your biggest fan,
Avi
Find the code for my tips here: GitHub.
A limitation is a derivative of what you mentioned: how to pick a proper kernel to make the data linearly separable? How do we check we have made our data set linearly separable?
You put in admirable efforts to offer a new angle on data science, each day! Well done