Extrapolation vs. covariate shift

In Course 2, Andrew gives a good example of how batch norm can help a model become more general: when we train on images of black cats, it can still generalize to recognize colored cats. Yet in the machine learning course offered by Stanford University, Andrew also points out that we need to be cautious about extrapolation, where we try to predict house prices that fall outside the range of our training data. An extreme example: if our training examples are house prices in Boston, it seems weird to use them to predict house prices in London. My question is, how can we differentiate between these two cases? To what extent can we be confident that we are using batch norm on a “proper new type of example” and not extrapolating?

I hope members/mentors can offer some insight.
Many thanks.
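
For reference, here is a minimal sketch of the normalization step that batch norm applies, assuming NumPy and leaving out the learnable scale and shift parameters. The function name `batch_norm` and the synthetic data are illustrative only. It shows just that a feature whose values are shifted and rescaled looks almost the same after normalization; it does not by itself show that a model trained on black cats will recognize colored cats.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature (column) of a mini-batch to zero mean, unit variance."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
batch = rng.normal(size=(64, 3))   # synthetic activations: 64 examples, 3 features
shifted = 3.0 * batch + 5.0        # same activations with a different scale and offset

# After normalization the two batches are nearly identical.
max_gap = np.abs(batch_norm(batch) - batch_norm(shifted)).max()
```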

Hi @jackchan.hk,

That’s a good question. Off the top of my head, these are the reasons why these two problems differ massively.

  • The cat problem is one of classification (1/0, yes/no), whereas house pricing is one of prediction, or regression if you wanna call it that, where the output is not a yes or no but a numerical value.
  • The cat problem is pretty straightforward and more generalised. What I mean by that is, a cat is a cat, anywhere in the world. UK, US… no external factors affect the classification of that animal. House pricing, on the other hand, is not a generalised problem: in different parts of the world, and even in different cities of the same country, housing prices depend on several different factors.

Thank you so much, @Mubsi. I have thought about this for days, and the point about “external factors” really sounds right. Given a cat picture, SUFFICIENT information is captured in a black cat image. By sufficient, I mean that we are confident a normal human could draw similarities between a black cat and a colored cat, so we can learn from black cat images to identify a colored cat. Therefore, we can use batch normalization. For house price prediction, however, the house prices in one area are certainly not sufficient to generalize to the house prices in another, and we consider that extrapolation.
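
The house-price side of this can be sketched with synthetic numbers (not real housing data). Assuming NumPy, the snippet below fits a straight line to points sampled from a curve on a narrow range, then compares the error on a query inside that range with one far outside it. The helper `in_training_range` is a hypothetical, deliberately crude check on a single feature.

```python
import numpy as np

def in_training_range(x_new, x_train):
    """Crude one-feature check: does x_new lie within the span of the training inputs?"""
    return bool(x_train.min() <= x_new <= x_train.max())

# Synthetic training data: the true relationship is y = x**2, sampled only on [1, 2].
x_train = np.linspace(1.0, 2.0, 50)
y_train = x_train ** 2

# Fit a straight line y ≈ slope * x + intercept by least squares.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def predict(x):
    return slope * x + intercept

def abs_error(x):
    # Absolute gap between the fitted line and the true curve at x.
    return abs(predict(x) - x ** 2)

inside, outside = 1.5, 10.0
err_inside = abs_error(inside)     # query within the training range
err_outside = abs_error(outside)   # query far beyond the training range
```

Here the line is a decent approximation inside [1, 2] and badly wrong at 10, even though nothing in the fit itself signals the difference. A range check like this only covers features the model actually sees; it says nothing about external factors, such as the city, that are missing from the inputs.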