Is covariate shift the same as data drift?

Bassel · November 8, 2022, 9:15pm

Is covariate shift another name for data drift?

balaji.ambresh · November 9, 2022, 5:46am

Dataset shift and covariate shift are different.

Dataset shift means that the joint distribution of the targets and the input features differs between the training and serving environments.

Covariate shift happens when the conditional distribution of y given x is the same across the training and serving environments, but the marginal distribution of x is different (i.e. the distribution of inputs to the model differs between training and serving).
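
In symbols, the two definitions above can be sketched as follows. The notation is an assumption made for illustration: P_train and P_serve denote the distributions in the training and serving environments, x the input features and y the target.

```latex
\begin{align*}
\text{Dataset shift:} \quad
  & P_{\text{train}}(x, y) \neq P_{\text{serve}}(x, y) \\
\text{Covariate shift:} \quad
  & P_{\text{train}}(y \mid x) = P_{\text{serve}}(y \mid x),
    \quad P_{\text{train}}(x) \neq P_{\text{serve}}(x)
\end{align*}
```

Since P(x, y) = P(y | x) P(x), covariate shift is one particular way in which the joint distribution can change, which is why it is usually treated as a special case of dataset shift.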

Alessio_Molinari · March 12, 2023, 5:40pm

Actually I also have the same question, and I believe the reply from @balaji.ambresh does not answer what was asked.

Could we also conclude that Concept shift is another name for Concept drift?

In this blog post from Matthew Stewart PhD, Postdoc in ML at Harvard, Concept shift and Concept drift are used interchangeably. Is it a mistake?

He also mentions that Covariate shift and Concept shift are types of Dataset shift (I am not sure if Robert Crowe is saying the same thing)

There are multiple manifestations of dataset shift that we will examine:

Covariate shift

Prior probability shift

Concept shift [then referred to as Concept drift]

Internal covariate shift (an important subtype of covariate shift)

Assuming that this is correct, Robert Crowe also says in the lesson “Detecting data issues” that Dataset shift happens “when the data has shifted over time”. But this is also the definition of drift (from the same lesson), so I honestly can’t understand the distinction between drift and skew anymore. Dataset, Covariate, and Concept shift are all types of distribution skew, or did I misunderstand this?

Juan_Olano · March 12, 2023, 6:21pm

Regarding covariate shift and data drift, this is my understanding:

I understand that Covariate Shift happens when the distribution of the features changes between datasets (train, val, test). This can happen, for example, when the source of the datasets is different. Example: generating training data with one camera, and test data with another camera that produces images with different characteristics.
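
A minimal sketch of the two-camera example, using only the Python standard library. Everything here is hypothetical and chosen for illustration: the "brightness" feature, the Beta distributions standing in for the two cameras, and the labeling rule. The point is only that the feature distribution differs between the sets while the rule that maps features to labels stays the same.

```python
import random

rng = random.Random(42)

def label(x):
    # The labeling rule is identical for both sources (assumed for this sketch):
    # P(y=1 | x) is 0.9 for bright images (x > 0.5) and 0.1 otherwise.
    p_positive = 0.9 if x > 0.5 else 0.1
    return 1 if rng.random() < p_positive else 0

def mean(values):
    return sum(values) / len(values)

def positive_rate_when_bright(pairs):
    # Empirical estimate of P(y=1 | x > 0.5)
    return mean([y for x, y in pairs if x > 0.5])

# Hypothetical brightness feature in [0, 1] from two cameras.
# Camera A (training data) produces mostly dark images,
# camera B (test data) produces mostly bright images.
train = [(x, label(x)) for x in (rng.betavariate(2, 5) for _ in range(5000))]
test = [(x, label(x)) for x in (rng.betavariate(5, 2) for _ in range(5000))]

train_mean_x = mean([x for x, _ in train])
test_mean_x = mean([x for x, _ in test])
train_rate = positive_rate_when_bright(train)
test_rate = positive_rate_when_bright(test)

# The marginal distribution of x differs between the two sets ...
print(f"mean brightness: train={train_mean_x:.2f}, test={test_mean_x:.2f}")
# ... while the conditional P(y | x) is (approximately) unchanged.
print(f"P(y=1 | bright): train={train_rate:.2f}, test={test_rate:.2f}")
```

A model fitted on the training set would see very few bright images, so it could perform poorly on the test set even though the relationship between brightness and label never changed.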

I understand that Data Drift happens when the distribution of the features changes over time. An example I use to explain this to myself is Consumer Behavior. I could train a model with data acquired at a certain point in time, and at that time the model predicts properly, but as time passes, Consumer Behavior changes, so the ‘data drifts’ and the model will not be as good at predicting with the new distribution of the features.
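
One way to picture this is to compare a reference sample taken at training time with later batches of the same feature. The sketch below uses a hand-written two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs) as the drift score. The feature values, the monthly batches and the way their mean moves are all made up for illustration, and this is not the method used in the course.

```python
import random
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    # The largest gap is always attained at one of the observed values.
    for value in a + b:
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap

rng = random.Random(0)

# Hypothetical feature (e.g. basket value) observed at training time.
reference = [rng.gauss(50.0, 10.0) for _ in range(2000)]

# Hypothetical monthly serving batches whose mean slowly moves away.
monthly_means = [50.0, 52.0, 55.0, 60.0]
batches = [[rng.gauss(m, 10.0) for _ in range(2000)] for m in monthly_means]

# Score each batch against the fixed training-time reference.
drift_scores = [ks_statistic(reference, batch) for batch in batches]
for month, score in enumerate(drift_scores):
    print(f"month {month}: KS statistic = {score:.3f}")
```

The score stays near zero while the batch resembles the reference and grows as consumer behavior moves away from it. In practice a threshold or a significance test would be needed to decide when the drift is large enough to act on.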

What do you think?

Alessio_Molinari · March 14, 2023, 4:28pm

Thanks a lot! Your reply is really helpful! I still have some doubts though:

To detect distribution skew, we compare train set and serving set, but if I understand correctly, the serving set is simply made of the queries received by the ML system in production, so how can we consider that static? I would definitely agree that we can apply the concept of distribution skew for train and test sets, which are typically fixed at training time.

Even Robert says that distribution skew manifests itself through dataset shift (which can be either covariate shift or concept shift), that is, when “data has shifted over time”. So he brings the time component into a concept that was previously defined as static. I see a bit of a contradiction here, and I would love to hear other people’s opinions about it.

But the difference between skew and drift is clearer to me now, thank you. To make it even more understandable, would you agree that:

data drift is essentially covariate shift in time, and
concept drift is essentially concept shift in time?
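
Written with an explicit time index, these two readings would look as follows. The notation P_t for the distribution at time t (with t_0 < t_1) is an assumption made for illustration, and this reflects one common convention rather than a definition given in the course.

```latex
\begin{align*}
\text{Data drift (covariate shift over time):} \quad
  & P_{t_0}(x) \neq P_{t_1}(x),
    \quad P_{t_0}(y \mid x) = P_{t_1}(y \mid x) \\
\text{Concept drift (concept shift over time):} \quad
  & P_{t_0}(y \mid x) \neq P_{t_1}(y \mid x)
\end{align*}
```

Some authors additionally require P(x) to stay unchanged for concept shift. That condition is left out here.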

Juan_Olano · March 14, 2023, 11:16pm

Hi @Alessio_Molinari,

Thank you for your reply and insights.

I would agree that data drift = concept drift (represented by a change in the relationship between the features and the labels), and that covariate shift is a particular case of data drift (represented by a change in the distribution between training features and new features).
