There is a chance (however small) that the random feature will have moderate or high importance and cause us to drop useful features. I would want to run the process 100+ times and drop the features that are identified the most often. But this would require many model fits. Overall, I would prefer to calculate the permutation importances once and see if there's a clear threshold for features worth keeping.
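For what it's worth, here is a minimal sketch of that one-shot version, assuming a scikit-learn workflow; the estimator, the placeholder data, and the "beat the probe" threshold are all illustrative, not the post's exact code:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                        # placeholder feature matrix
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500)    # placeholder target

# Append the random "probe" feature as the last column.
X_probe = np.column_stack([X, rng.normal(size=len(X))])

X_train, X_val, y_train, y_val = train_test_split(X_probe, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Permutation importances computed once, on held-out data.
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
probe_importance = result.importances_mean[-1]

# Keep only the real features whose importance beats the random probe.
keep = np.where(result.importances_mean[:-1] > probe_importance)[0]
print("features to keep:", keep)
```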
Thank you for this post. How do you draw these nice animated arrows? Is there a library for Excalidraw, or do you use some other tool?
I used draw.io, Abhishek.
This is like a very bare-bones (and worse) version of the Boruta algorithm.
Is there a scikit-learn compatible package that will do this for us?
I don't think so. But all you need to do is add a new feature and run the ML algorithm, so it's not that much of an effort :)
I will share if I find any plug-and-play library for this.
Great article. Any recommendations for creating the "random feature" to ensure the Probe Method gets the best result?
I would use scipy.stats.norm.rvs - https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html
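For example, a tiny sketch of drawing a standard-normal probe column with scipy and appending it; `X` here is assumed to be your pandas feature DataFrame, not code from the post:

```python
from scipy.stats import norm

# One draw per row of the data; random_state makes the probe reproducible.
X["random_probe"] = norm.rvs(size=len(X), random_state=42)
```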
Thanks.