Great share, Avi bro. I think we can do the same using statistical methods:
for example, the Kolmogorov-Smirnov test helps us detect whether the two distributions are the same or not.
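
To make the suggestion concrete, here is a minimal sketch of a two-sample Kolmogorov-Smirnov check in plain Python. The function names, the synthetic Gaussian samples, and the 0.05 significance level are assumptions for illustration only, not anything from the original post. The rejection rule uses the standard large-sample critical value, so treat it as approximate for small samples.

```python
import bisect
import math
import random


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The largest gap is always reached at one of the observed values.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def distributions_differ(sample_a, sample_b, c_alpha=1.358):
    """Large-sample rejection rule; c_alpha = 1.358 corresponds to alpha = 0.05."""
    n, m = len(sample_a), len(sample_b)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(sample_a, sample_b) > critical


# Hypothetical data: a training feature and a shifted "live" version of it.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(500)]
live_shifted = [rng.gauss(1.0, 1.0) for _ in range(500)]
print(distributions_differ(train, live_shifted))
```

In practice, `scipy.stats.ks_2samp` computes the same statistic together with a p-value, so a hand-rolled version like this is mainly useful for seeing what the test measures. The check would be run per feature, comparing training values against recent production values.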
Interesting technique. I have two questions:
- How do you quantitatively separate high-importance features from low-importance ones after training the random forest? Do you use a threshold, or do you cluster the importance values?
- In a real-time ML system, how often do you apply proxy-labeling techniques? I'm curious whether you have thought about an architecture for this.