Better still is not to impute anything, but rather to leave it up to each model how to treat missing values. For example, "distance functions" can be customised and in some cases made asymmetric (e.g. to reflect some aspect of the application domain). Preprocessing the data presupposes a downstream purpose (which might change over time).
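For what it's worth, here's a minimal sketch of the "don't impute" idea, assuming scikit-learn: HistGradientBoostingRegressor accepts NaNs directly, and nan_euclidean_distances is one example of a distance function that simply skips missing coordinates. Not the only way to do it, just an illustration.

```python
# Sketch: skip imputation entirely (assumes scikit-learn).
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.metrics.pairwise import nan_euclidean_distances

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)
X[rng.random(X.shape) < 0.2] = np.nan  # ~20% missing, left unimputed

# Tree-based model that treats NaN as a first-class value.
model = HistGradientBoostingRegressor().fit(X, y)
print(model.score(X, y))

# Pairwise distances that ignore missing coordinates (and rescale).
D = nan_euclidean_distances(X[:5], X[:5])
print(D.round(2))
```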
An outlier-tolerant statistic, e.g. the median, would be better than the mean - though in your example that would just place the spike in a "better" place.
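A quick illustration of the difference (assuming scikit-learn's SimpleImputer): the mean fill value gets dragged toward the outlier, the median fill value doesn't.

```python
# Mean vs. median imputation in the presence of an outlier.
import numpy as np
from sklearn.impute import SimpleImputer

x = np.array([[1.0], [2.0], [3.0], [1000.0], [np.nan]])  # one outlier, one gap
print(SimpleImputer(strategy="mean").fit_transform(x)[-1])    # [251.5]
print(SimpleImputer(strategy="median").fit_transform(x)[-1])  # [2.5]
```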
I've always wanted to try something like this. The iterative approach is interesting. I wonder how much the iteration improves on a single step with a model trained on the non-missing values.
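One way to probe that question, sketched with scikit-learn's IterativeImputer plus a random-forest estimator as a stand-in for MissForest (an assumption on my part, not the post's setup): cap max_iter at 1 versus letting it iterate, and compare the imputed values against held-out ground truth.

```python
# Sketch: single imputation pass vs. the full iterative loop.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_true = rng.normal(size=(300, 5))
X_true[:, 0] += 2 * X_true[:, 1]      # correlated columns, so imputation has signal to use
X = X_true.copy()
mask = rng.random(X.shape) < 0.2      # knock out ~20% of entries
X[mask] = np.nan

def rmse(max_iter):
    imp = IterativeImputer(estimator=RandomForestRegressor(n_estimators=50),
                           max_iter=max_iter, random_state=0)
    X_filled = imp.fit_transform(X)
    return np.sqrt(np.mean((X_filled[mask] - X_true[mask]) ** 2))

print("single pass:", rmse(1))
print("iterated:   ", rmse(10))
```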
Also, if every feature has some missing data, I guess you can't use MissForest without some modification. Maybe there's another algorithm for this case.