3 Comments

“In this case, the dataset overlap between any two trees is expected to be huge compared to the typical random forest.”

Is this a typo, or did I misunderstand? In a batching context, isn’t the batch size normally much smaller than the whole dataset? Wouldn’t that imply minimal overlap between the datasets of any two trees, compared to a typical random forest? I agree, though, that this would aid the bagging objective and reduce bias.

author

I am sorry, I made a mistake there, Joseph. I wanted to write "is NOT expected to be huge."

Thanks so much for pointing that out. Correcting it right away.


That makes sense, glad to help! Keep up the great writing!
