9 Comments
Joe Corliss:

There is a chance (however small) that the random feature will have moderate or high importance and cause us to drop useful features. I would want to run the process 100+ times and drop the features that are identified the most often. But this would require many model fits. Overall, I would prefer to calculate the permutation importances once and see if there's a clear threshold for features worth keeping.
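The repeated-run idea above can be sketched as follows. This is a minimal illustration, not the post's exact method: the dataset, model, and run count are placeholder assumptions, and it counts how often each feature's permutation importance falls below a fresh random probe.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

n_runs = 10  # use 100+ in practice, as suggested above
flagged = np.zeros(X.shape[1], dtype=int)  # times a feature scored below the probe

for run in range(n_runs):
    probe = rng.normal(size=(X.shape[0], 1))          # fresh random "probe" feature
    X_probe = np.hstack([X, probe])
    model = RandomForestClassifier(n_estimators=50, random_state=run).fit(X_probe, y)
    result = permutation_importance(model, X_probe, y, n_repeats=5, random_state=run)
    imp = result.importances_mean
    flagged += (imp[:-1] < imp[-1]).astype(int)       # below the probe's importance?

# Drop only features flagged in the majority of runs
drop = np.where(flagged > n_runs / 2)[0]
```

Features that beat the probe in most runs are kept; an occasional unlucky probe no longer decides their fate.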

Kumar Abhishek:

Thank you for this post. How do you draw these nice animated arrows? Is there a library for Excalidraw, or do you use some other tool?

morgul:

This is like a very bare-bones, and worse, version of the Boruta algorithm.

Kyle Gilde:

Is there a scikit-learn-compatible package that will do this for us?

Avi Chawla:

I don't think so. But all you need to do is add a new feature and run the ML algorithm, so it's not that much of an effort :)

I will share if I find any plug-and-play library for this.
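"Add a new feature and run the ML algorithm" can be done in a few lines. A minimal sketch, assuming a tree-based scikit-learn estimator with `feature_importances_`; the dataset and model here are placeholders, not anything from the post:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X, y = make_regression(n_samples=200, n_features=4, n_informative=4,
                       noise=0.1, random_state=42)

# Append a random "probe" column and refit the model
X_probe = np.hstack([X, rng.normal(size=(X.shape[0], 1))])
model = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_probe, y)

# Keep only features whose importance beats the random probe's
probe_importance = model.feature_importances_[-1]
keep = model.feature_importances_[:-1] > probe_importance
```

For models without built-in importances, `sklearn.inspection.permutation_importance` can supply the scores instead.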

George Carter:

Great article. Any recommendations for creating the "random feature" to ensure the Probe Method gets the best result?