9 Comments

There is a chance (however small) that the random feature will have moderate or high importance and cause us to drop useful features. I would want to run the process 100+ times and drop the features that are identified the most often. But this would require many model fits. Overall, I would prefer to calculate the permutation importances once and see if there's a clear threshold for features worth keeping.
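The repeated-probe idea described above can be sketched roughly as follows. This is not the post's exact code, just a minimal illustration: the random-forest model, the synthetic dataset, and the 80% vote threshold are all assumptions for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           random_state=0)

n_trials = 25  # the comment suggests 100+; kept small here for speed
drop_counts = np.zeros(X.shape[1], dtype=int)

for trial in range(n_trials):
    probe = rng.normal(size=(X.shape[0], 1))  # fresh random probe each run
    X_aug = np.hstack([X, probe])
    model = RandomForestClassifier(n_estimators=25,
                                   random_state=trial).fit(X_aug, y)
    importances = model.feature_importances_
    # flag every real feature that scored below the probe this run
    drop_counts += (importances[:-1] < importances[-1]).astype(int)

# drop candidates: features that fell below the probe in most runs
candidates = np.where(drop_counts > 0.8 * n_trials)[0]
print(drop_counts, candidates)
```

Counting votes across runs is what guards against the single-run fluke the comment worries about: one lucky probe can't drop a feature on its own.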

Thank you for this post. How do you draw these nice animated arrows? Is there a library for Excalidraw, or do you use some other tool?

I used draw.io, Abhishek.

This is like a very bare-bones, and worse, version of the Boruta algorithm.

Is there a scikit learn compatible package that will do this for us?

I don't think so. But all you need to do is add a new feature and run the ML algorithm, so it's not that much of an effort :)

I will share if I find any plug-and-play library for this.
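A single run really is as small as the reply suggests. The sketch below is one illustrative way to do it (the random-forest model and synthetic dataset are assumptions, not the post's code): append one random column, fit, and keep only the features that out-score it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           random_state=42)

# append one random "probe" column, then fit as usual
X_aug = np.hstack([X, rng.normal(size=(X.shape[0], 1))])
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_aug, y)

importances = model.feature_importances_
probe_importance = importances[-1]   # the probe is the last column
keep = np.where(importances[:-1] > probe_importance)[0]
print(f"probe importance: {probe_importance:.4f}, keep features: {keep}")
```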

Great article. Any recommendations for creating the "random feature" to ensure the Probe Method gets the best result?
