Near the end of the lab C1_W1_Lab04_Cost_function_Soln, there is a discussion of the convexity of the cost function and how squared-error cost functions must have a minimum. This allows the gradient of J to be used to find the minimum of J, but that is difficult to do if the parameters (w and b) are on very different scales. My question is: are there ways to choose parameters that are scale invariant, so that finding the minimum is easy?

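With normalized data (for example, each input feature rescaled to zero mean and unit standard deviation), the parameters end up on comparable scales, and finding the minimum cost with gradient descent is quite easy.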
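
To make the effect of normalized data concrete, here is a minimal sketch in NumPy. It is not taken from the lab: the helper names `zscore_normalize` and `gradient_descent`, the synthetic "size" and "rooms" features, and the learning rates are all assumptions chosen for illustration. It runs the same number of gradient descent steps on raw and on z-score normalized features and compares the final squared-error cost.

```python
import numpy as np

def zscore_normalize(X):
    """Rescale each feature column to zero mean and unit standard deviation."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def gradient_descent(X, y, alpha, num_iters):
    """Batch gradient descent on the squared-error cost J(w, b)."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(num_iters):
        err = X @ w + b - y            # prediction error, shape (m,)
        w -= alpha * (X.T @ err) / m   # dJ/dw
        b -= alpha * err.mean()        # dJ/db
    cost = np.mean((X @ w + b - y) ** 2) / 2
    return w, b, cost

# Hypothetical, noiseless data: two features on very different scales.
rng = np.random.default_rng(0)
size = rng.uniform(500, 3000, 200)     # values in the hundreds to thousands
rooms = rng.uniform(1, 5, 200)         # values in single digits
X = np.column_stack([size, rooms])
y = 0.1 * size + 20 * rooms + 50

X_norm, mu, sigma = zscore_normalize(X)

# Raw features force a tiny learning rate to keep the updates stable,
# so the small-scale feature and the bias barely move.
w_raw, b_raw, cost_raw = gradient_descent(X, y, alpha=1e-7, num_iters=1000)

# Normalized features tolerate a much larger learning rate.
w_norm, b_norm, cost_norm = gradient_descent(X_norm, y, alpha=0.1, num_iters=1000)

print(f"final cost, raw features:        {cost_raw:.4g}")
print(f"final cost, normalized features: {cost_norm:.4g}")
```

In this sketch the normalized run should end with a far smaller cost than the raw run for the same number of iterations. To predict on new inputs after training on normalized data, apply the same `mu` and `sigma` to them first.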