Oct 24, 2023 · edited Oct 24, 2023 · Liked by Avi Chawla

If dataset 1 is stored in the matrix X (rows = observations, columns = variables) and dataset 2 is stored in Y, we could calculate X^T*X and Y^T*Y. Both are (variables x variables) matrices, so they have the same shape whatever the sample sizes are, and they do not depend on the order of the rows. Then we could calculate the Frobenius norm of their difference, ||X^T*X - Y^T*Y||_F. Not sure if it means anything though, just a thought.
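
To make the idea concrete, here is a minimal sketch in Python with NumPy. The function name `gram_frobenius_distance` and the `normalize` flag are my own assumptions, not part of the original suggestion: with `normalize=False` it computes exactly ||X^T*X - Y^T*Y||_F, and with `normalize=True` it first divides each product by its number of rows, since the raw entries of X^T*X grow with the sample size.

```python
import numpy as np


def gram_frobenius_distance(X, Y, normalize=False):
    """Frobenius norm of X^T X - Y^T Y for two datasets with the same columns.

    `normalize` is an assumption added for illustration: when True, each
    product is divided by its number of rows before comparing.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim != 2 or Y.ndim != 2 or X.shape[1] != Y.shape[1]:
        raise ValueError("X and Y must be 2-D with the same number of columns")

    gx = X.T @ X  # shape: (variables, variables)
    gy = Y.T @ Y
    if normalize:
        gx = gx / X.shape[0]
        gy = gy / Y.shape[0]
    return float(np.linalg.norm(gx - gy, ord="fro"))


# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                  # reference dataset
Y_same = rng.normal(size=(300, 3))             # same distribution, fewer rows
Y_shift = rng.normal(loc=1.0, size=(300, 3))   # mean-shifted dataset

d_same = gram_frobenius_distance(X, Y_same, normalize=True)
d_shift = gram_frobenius_distance(X, Y_shift, normalize=True)
print(d_same, d_shift)
```

On this synthetic data the shifted dataset should give the larger distance. Without the per-row scaling, two samples from the same distribution but with different row counts would also be far apart, so the scaling (or some similar adjustment) is probably needed before the number is useful.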

Hi Avi Chawla, Hi Everyone!

This is an important topic that I face a lot in my daily work. For me, the big question is: how can we use the previous data and model to adapt to this covariate shift?

In a regression (or classification) framework, most of the time the new data that arrives is different from the data used to build the model, but we do not have targets/labels for this new data. Thus, (re)training a (new) model on the new data is not feasible. Which adaptation approach can we follow to deal with this?

Thank you,