There is a chance (however small) that the random feature will have moderate or high importance and cause us to drop useful features. I would want to run the process 100+ times and drop the features that are identified the most often. But this would require many model fits. Overall, I would prefer to calculate the permutation importances once and see if there's a clear threshold for features worth keeping.
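For what it's worth, here is a minimal sketch of that one-shot version, assuming a scikit-learn workflow; the estimator, the placeholder data, and the "beat the probe" threshold are all illustrative, not the post's exact code:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                        # placeholder feature matrix
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500)    # placeholder target

# Append the random "probe" feature as the last column.
X_probe = np.column_stack([X, rng.normal(size=len(X))])

X_train, X_val, y_train, y_val = train_test_split(X_probe, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Permutation importances computed once, on held-out data.
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
probe_importance = result.importances_mean[-1]

# Keep only the real features whose importance beats the random probe.
keep = np.where(result.importances_mean[:-1] > probe_importance)[0]
print("features to keep:", keep)
```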
Thank you for this post. How do you draw these nice animated arrows? Is there a library for Excalidraw, or do you use some other tool?
I used draw.io, Abhishek.
This is like a very bare-bones (and worse) version of the Boruta algorithm.
Is there a scikit-learn compatible package that will do this for us?
I don't think so. But all you need to do is add a new feature and run the ML algorithm, so it's not that much of an effort :)
I will share if I find any plug-and-play library for this.
Great article. Any recommendations for creating the "random feature" to ensure the Probe Method gets the best result?
I would use scipy.stats.norm.rvs - https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html
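For example, a tiny sketch of drawing a standard-normal probe column with scipy and appending it; `X` here is assumed to be your pandas feature DataFrame, not code from the post:

```python
from scipy.stats import norm

# One draw per row of the data; random_state makes the probe reproducible.
X["random_probe"] = norm.rvs(size=len(X), random_state=42)
```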
Thanks.