Course 4 - Week 4 - Assignment 2 - Exercise 3 - Neural Style Transfer - compute_layer_style_cost - Wrong Value

In Assignment 2,
I’ve executed all previous exercises and they all tested OK.

In Exercise 3, compute_layer_style_cost:

    J_style_layer_GG == 0.0           <-- OK 
    J_style_layer_SG > 0              <-- OK
    J_style_layer_SG == 4613.86084    <-- WRONG VALUE

The tensor shapes all look OK:

     A_G, A_S:  (1, 4, 4, 3) --> (3, 16)
     GS, GG:    (3, 3)
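
For reference, one common way to get that `(n_C, n_H*n_W)` unrolling is a transpose followed by a reshape. Below is a generic numpy sketch of the shape mechanics only, not the graded TF code; note that transposing before reshaping and reshaping before transposing produce *different* unrollings, and which order the grader expects is deliberately not shown here:

```python
import numpy as np

# Generic sketch (numpy, not the graded TF code): unroll a
# (1, n_H, n_W, n_C) activation into (n_C, n_H * n_W), one channel per row.
a = np.arange(48, dtype=np.float32).reshape(1, 4, 4, 3)  # (1, n_H, n_W, n_C)
a_unrolled = a.transpose(0, 3, 1, 2).reshape(3, 16)      # (n_C, n_H * n_W)
print(a_unrolled.shape)  # (3, 16)
```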

The first test case goes OK, so

    compute_layer_style_cost(a_G, a_G) == 0.0 

The second one fails:

    compute_layer_style_cost(a_S, a_G) == 4613.86084

These are the intermediate tensor values I get:

a_S_reshaped:

[[ -3.404881   -2.51863146   1.35123134  ...  0.19823879  1.3253293  -0.415262461 ]
 [  7.18300676 -3.89868879   0.186958492 ...  7.45242786  6.35230541  0.583993    ]
 [  2.53457594 -2.92448449  -1.23262477  ...  3.71869111  5.7392211  -2.00458574  ]]

a_G_reshaped:

[[  2.61235142   6.34622669 -2.05688524 ... -3.84423065  4.48628378 -3.23159838 ]
 [ -3.35208321   3.84704041 -3.14899445 ...  6.22979736 -2.21240115 -5.59646845 ]
 [  0.747618556 -0.95714581 -4.00773525 ...  2.82066154 -4.48117828  3.43387413 ]]

GS:

[[  112.321625    37.0738297  -1.95214272 ]
 [   37.0738297  352.194031  116.885025   ]
 [   -1.95214272 116.885025  136.679352   ]] 

GG:

[[ 277.302155     1.89707661  84.0645676  ]
 [   1.89707661 139.898712    -1.18903875 ]
 [  84.0645676   -1.18903875 245.052292   ]] 
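
For what it's worth, both matrices at least satisfy the basic sanity checks any Gram matrix G = A·Aᵀ must satisfy: symmetry and a non-negative diagonal (the diagonal entries are the squared row norms). A quick generic numpy sketch of that check, on random data rather than the assignment's tensors:

```python
import numpy as np

# Any Gram matrix G = A @ A.T (A of shape (n_C, n_H*n_W)) must be
# symmetric with a non-negative diagonal -- as GS and GG above are.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 16))
G = A @ A.T
assert np.allclose(G, G.T)       # symmetric
assert np.all(np.diag(G) >= 0)   # diagonal = squared row norms
print(G.shape)  # (3, 3)
```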

And finally, I get the wrong result value:

J_style_layer_SG == 4613.86084

The formula for the style cost also seems fine, and the honor code asks me not to post the code here.

I can’t tell whether the matrices above are correct or not, but all test cases pass up to that point, and the shapes are OK.

Can someone help me find where the issue is?

I printed out a couple of those intermediate values and mine agree with yours:

a_S [[-3.404881   -2.5186315   1.3512313  -1.8821764  -0.39341784  5.434381
  -0.2787553   3.5750656  -2.616547    1.2345984   0.25895953 -2.933355
  -5.2402277   0.19823879  1.3253293  -0.41526246]
 [ 7.183007   -3.8986888   0.18695849 -1.5039697  -0.34587932  6.118635
   2.493302    9.585232    6.5795145  -0.9685255  -0.5711074   2.555351
   0.36834985  7.452428    6.3523054   0.583993  ]
 [ 2.534576   -2.9244845  -1.2326248  -1.8601038   1.730303    0.91409665
   2.0111642  -2.3005989   5.8995004  -2.2799122  -1.6340902  -3.1489797
  -0.42677724  3.718691    5.739221   -2.0045857 ]]
shape(GS) [3 3]
GS [[112.321625   37.07383    -1.9521427]
 [ 37.07383   352.19403   116.885025 ]
 [ -1.9521427 116.885025  136.67935  ]]
J_style_layer_GG 0.0
J_style_layer_SG 14.017805099487305
J_style_layer = tf.Tensor(14.017805, shape=(), dtype=float32)
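
In fact, plugging the printed GS and GG into the layer style cost by hand reproduces that value. This is a plain-numpy check of the numbers above, not the assignment's TF code; the scaling factor assumes n_C = 3 and n_H = n_W = 4 from the test case:

```python
import numpy as np

# Recompute J_style_layer from the Gram matrices printed above.
GS = np.array([[112.321625,    37.0738297,  -1.95214272],
               [ 37.0738297,  352.194031,  116.885025  ],
               [ -1.95214272, 116.885025,  136.679352  ]])
GG = np.array([[277.302155,     1.89707661, 84.0645676 ],
               [  1.89707661, 139.898712,   -1.18903875],
               [ 84.0645676,   -1.18903875,245.052292  ]])
J = np.sum((GS - GG) ** 2) / (2 * 3 * 4 * 4) ** 2  # n_C=3, n_H=n_W=4
print(J)  # ~14.0178, matching J_style_layer_SG above
```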

I think what this means is that the issue is in the final computation. Did you use TF primitives to compute the scalar factors? If so, that is probably the source of the issue: TF's type coercion rules for mixing integers and floats differ from Python's, and that can produce wildly wrong values for this term:
$$\frac{1}{(2 \cdot n_C \cdot n_H \cdot n_W)^2}$$
Try using numpy functions or plain Python scalar arithmetic for that part of the formula and see if that helps. Also be careful about order-of-operations issues when computing that term. Try running the following code to see one example of such an issue:

    m = 5.
    x = 1. / 2. * m    # evaluates left to right: (1. / 2.) * m  ->  2.5
    y = 1. / (2. * m)  # ->  0.1

If you’re expecting x and y to have the same value, you’re in for a surprise. And not the fun kind of surprise. :nerd_face:


Thanks a lot, Paulin, you were right.

The issue was in the scalar part: not in the math formulation or the parentheses, but in the functions used.

I changed that expression and everything worked just fine.

Those subtleties are hard to spot!

It’s great that you found the solution! Thanks for confirming.
