Discussion about this post

Sergey Skripko

A couple of ideas:

1. Is it possible that those top-k trees will be highly correlated with each other? I mean, will their top predictors (the root-node splits) look similar? If so, wouldn't it be more effective to pick trees from the ranking with some step, like every 3rd one, to reduce this effect? Have you checked it?

2. After picking the top-k trees, we could improve the metric even further by calculating the residuals left by those top-k trees and fitting XGBoost (or any other boosting model) to the residuals.
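
A rough sketch of both ideas, in plain Python so it runs without any libraries. Everything here is an assumption for illustration: the helper names `pick_with_stride` and `residuals` are hypothetical, trees are assumed to be ranked by a validation score where higher is better, and the chosen trees are assumed to be combined by simple averaging, which may differ from what the post actually does.

```python
def pick_with_stride(scores, k, step=3):
    """Rank trees by score (higher is better), then keep every `step`-th tree, up to k."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[::step][:k]


def residuals(y, tree_preds, chosen):
    """Target minus the averaged prediction of the chosen trees, row by row.

    tree_preds[t][row] is the prediction of tree t for that row (assumed layout).
    """
    out = []
    for row, target in enumerate(y):
        avg = sum(tree_preds[t][row] for t in chosen) / len(chosen)
        out.append(target - avg)
    return out


# Toy data: validation scores for 9 hypothetical trees.
scores = [0.91, 0.90, 0.89, 0.85, 0.84, 0.80, 0.79, 0.70, 0.65]
chosen = pick_with_stride(scores, k=3, step=3)  # indices 0, 3, 6

# Toy predictions: tree t predicts the constant t on each of 3 rows.
tree_preds = [[float(t)] * 3 for t in range(9)]
y = [3.0, 4.0, 5.0]
res = residuals(y, tree_preds, chosen)
```

The values in `res` would then be the training target for the second-stage booster, whose predictions get added to the averaged output of the chosen trees.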
