Week 2 Feature Scaling - Question

In week 2, the course covered the feature scaling topic. My question is: when we apply normalization to the input features, why don't we also apply normalization to the target / output? Also, will the weights learned by gradient descent on the normalized data change drastically compared to the weights learned from the original data?

Thank you!

Does this help?

The choice of feature scaling depends on the algorithm in use. For neural networks, feature scaling is always done. For tree-based algorithms, feature scaling is not needed.

Feature scaling is never done for target labels.
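Here is a minimal sketch of what that looks like in practice, using z-score normalization on made-up data (the numbers are only illustrative): the input features are scaled column by column, while the labels are left untouched.

```python
import numpy as np

# Hypothetical training data: two features on very different scales.
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [852.0,  2.0]])
y = np.array([460.0, 232.0, 178.0])  # target labels left unscaled here

# Z-score normalization: subtract the mean and divide by the std of each column.
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_norm = (X - mu) / sigma

print(X_norm)  # every column now has mean ~0 and std ~1
```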


Yes, thanks. If feature scaling is never done on the target labels, then is it the weights that gradient descent adjusts to compensate for the scaling? (In the lecture, the optimal value for a weight was 0.1 for the original data, and it became about 20 for the normalized data.)

I'd like to clarify one detail: the target label refers to the label of a classification problem. For a regression problem, the target value can also be normalized.
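As a small illustration of that point (a sketch with made-up prices, not from the course): if you do normalize a regression target, the model then predicts in the scaled space, so you have to invert the scaling before reporting predictions.

```python
import numpy as np

y = np.array([460.0, 232.0, 315.0, 178.0])  # hypothetical house prices

# Scale the regression target the same way as a feature.
y_mu, y_sigma = y.mean(), y.std()
y_norm = (y - y_mu) / y_sigma

# A model trained against y_norm outputs scaled predictions,
# so they must be mapped back to the original units.
y_pred_norm = np.array([0.8, -0.5])    # example model outputs in scaled space
y_pred = y_pred_norm * y_sigma + y_mu  # undo the scaling
```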

Please specify the lecture / timestamp you are referring to.


In this lecture https://www.coursera.org/learn/machine-learning/lecture/KMDV3/feature-scaling-part-1 (Feature Scaling Part 1, at 2:15), the estimated weights for the original data are w1 = 0.1, w2 = 50, and b = 50. But for normalized data, will the weights change by a large amount in order to make the same prediction (since the features are between -1 and 1)?

I’m sorry, Mohammed. The link you gave is related to MLS and not MLEP. I don’t have access to that course. Please change your post to the correct topic.

Yes, the learned weights will be different.
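To see why, here is a minimal sketch with a single hypothetical feature (house size) and an exactly linear target; the numbers are assumptions, not the lecture's dataset. When the feature is divided by its maximum to bring it into roughly [-1, 1], the learned weight grows by the same factor so the predictions stay identical.

```python
import numpy as np

# Hypothetical data: one feature (size in sq ft) and a linear target.
x = np.array([852.0, 1416.0, 2104.0, 3000.0])
y = 0.1 * x + 50.0                     # true relation: w = 0.1, b = 50

def fit_line(x, y):
    # Ordinary least squares for a single feature plus intercept.
    A = np.column_stack([x, np.ones_like(x)])
    w, b = np.linalg.lstsq(A, y, rcond=None)[0]
    return w, b

print(fit_line(x, y))          # ~ (0.1, 50.0) on the original scale

# Rescale the feature to roughly [0, 1] by dividing by its max.
x_scaled = x / x.max()
print(fit_line(x_scaled, y))   # weight grows to ~0.1 * 3000 = 300, b stays ~50
```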

Oh, OK sir. I still understood and found your explanations useful. Thank you!

OK, got it. Thank you!