TF1 W1: the first exercise is a super simple model to predict house price from number of rooms. The ys values are initially provided in units of dollars, so the house price is order of magnitude 10^6. Meanwhile, the number of rooms is order of magnitude 10^0. I humbly suggest that Spectral Normalization is not the optimal approach or appropriate technology for this level of problem (nor for a learner on the first week of the first TF class). The simple solution is to divide the ys training inputs by 10^6 before passing them into the fit() function so the xs and ys values have the same scale (order of magnitude).
*** EDIT below for additional context and clarity ***
I wrote above “divide…by 10^6…” where it should have been 10^5.
What I didn’t address was the ‘How do I do something like that in TensorFlow’ part.
The one word answer is ‘broadcast’.
What that means is NumPy (and TensorFlow) does the algebra magic for you when it can. It can take a math operator and a scalar and apply them to every element of a multi-dimensional array. So
import numpy as np
unscaled_prices = np.array([100000, 150000, 200000])  # dollars
ten_to_the_fifth = np.power(10, 5)
scaled_prices = unscaled_prices / ten_to_the_fifth  # hundred thousand dollars
print(scaled_prices)
[1.  1.5 2. ]
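To make the ‘broadcast’ point a bit more concrete, here is a minimal sketch using NumPy only. The 2-D array `price_table` and its values are made up for illustration and are not part of the exercise. The point is that the elementwise division comes from NumPy; dividing a plain Python list by a plain Python number raises a TypeError instead.

```python
import numpy as np

# Hypothetical 2-D table of prices in dollars (values invented for illustration).
price_table = np.array([[100000, 150000],
                        [200000, 250000]])

# The scalar 1e5 is broadcast across every element of the array.
scaled_table = price_table / 1e5
print(scaled_table)

# A plain Python list does not support elementwise division by a plain number.
try:
    [100000, 150000, 200000] / 100000
    list_division_worked = True
except TypeError:
    list_division_worked = False
print(list_division_worked)
```

So if the prices start life as a Python list, wrapping them in `np.array(...)` before dividing is the safer habit.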
You can see the impact on training by building, training, and predicting on two models that differ only by whether or not the prices input is scaled…
model_unscaled_inputs.fit(xs, unscaled_prices, epochs=200)
unscaled = model_unscaled_inputs.predict([7.0])
...
model_scaled_inputs.fit(xs, scaled_prices, epochs=200)
scaled = model_scaled_inputs.predict([7.0])
print("Unscaled: " + str(unscaled) + " Scaled: " + str(scaled))
Unscaled: [[411107.25]] Scaled: [[4.006471]]
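The two predictions are in different units, so it can help to put them on a common footing before comparing. The sketch below is plain arithmetic on the numbers printed above. It assumes the three example prices belong to 1, 2 and 3 rooms, so that the underlying line is 50000 + 50000 * rooms and the expected price for 7 rooms is 400000 dollars; that pairing is my assumption, not something stated in this post.

```python
# Numbers copied from the printed output above.
unscaled_prediction = 411107.25  # dollars
scaled_prediction = 4.006471     # hundreds of thousands of dollars

# Assumption: prices follow 50000 + 50000 * rooms (rooms 1, 2, 3 paired with
# 100000, 150000, 200000), which gives 400000 dollars for 7 rooms.
expected_dollars = 50000 + 50000 * 7

# Convert the scaled prediction back to dollars for a like-for-like comparison.
scaled_prediction_dollars = scaled_prediction * 1e5

unscaled_rel_error = abs(unscaled_prediction - expected_dollars) / expected_dollars
scaled_rel_error = abs(scaled_prediction_dollars - expected_dollars) / expected_dollars

print(f"unscaled: {unscaled_rel_error:.2%}, scaled: {scaled_rel_error:.2%}")
```

Under that assumption the unscaled model is off by roughly 2.8% and the scaled model by roughly 0.16%.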
Not only is the scaled version expressed in the units expected by the grader, it is more accurate. This is because the gradients are less steep when both the dependent and independent variables of the model are at least approximately the same scale. HTH
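For anyone who wants to see the gradient difference in numbers, here is a rough sketch in NumPy. It is a hand-written gradient, not what Keras does internally, and it rests on assumptions that are mine: a linear model y = w*x + b, mean squared error as the loss, weights starting at zero, and rooms 1, 2, 3 paired with the three example prices. The helper `mse_gradient` is hypothetical, written just for this illustration.

```python
import numpy as np

# Assumption: rooms 1, 2, 3 pair with the three example prices.
xs = np.array([1.0, 2.0, 3.0])
unscaled_prices = np.array([100000.0, 150000.0, 200000.0])  # dollars
scaled_prices = unscaled_prices / 1e5  # hundreds of thousands of dollars


def mse_gradient(xs, ys, w=0.0, b=0.0):
    """Gradient of mean((w*x + b - y)**2) with respect to w and b."""
    residual = w * xs + b - ys
    grad_w = 2.0 * np.mean(residual * xs)
    grad_b = 2.0 * np.mean(residual)
    return grad_w, grad_b


# Gradients at the assumed starting point w = 0, b = 0.
grad_unscaled = mse_gradient(xs, unscaled_prices)
grad_scaled = mse_gradient(xs, scaled_prices)
print("unscaled:", grad_unscaled)
print("scaled:  ", grad_scaled)
```

At that starting point the gradient on the dollar targets is 10^5 times larger than on the scaled targets, so the same learning rate takes proportionally larger steps.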