Course 4, week 4 Neural Style Transfer: train_step "Unexpected cost for epoch 0"

I’m getting an error. I double-checked the outputs of my previous functions and they all seem to pass. I also ran the autograder to see if it would catch something in the previous functions, but it only finds an issue with train_step.

tf.Tensor(-128153970.0, shape=(), dtype=float32)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-54-e8adb96bb73f> in <module>
      5 print(J1)
      6 assert type(J1) == EagerTensor, f"Wrong type {type(J1)} != {EagerTensor}"
----> 7 assert np.isclose(J1, 10221.168), f"Unexpected cost for epoch 0: {J1} != {10221.168}"
      8 
      9 J2 = train_step(generated_image)

AssertionError: Unexpected cost for epoch 0: -128153968.0 != 10221.168

When I look at the output of 5.5.2 - Compute Style Cost it seems WAY too big:

# Assign the input of the model to be the "style" image 
preprocessed_style =  tf.Variable(tf.image.convert_image_dtype(style_image, tf.float32))
a_S = vgg_model_outputs(preprocessed_style)
# Compute the style cost
J_style = compute_style_cost(a_S, a_G)
print(J_style)

:: tf.Tensor(-7559850.0, shape=(), dtype=float32)

There must be something fundamentally wrong: cost values should always be positive, right? All the cost values here are the “sum of squares”, so they are positive by definition. You need to carefully re-examine your implementation and compare the code you wrote to the mathematical formulas given for the two costs (style and content).
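To make the "sum of squares" point concrete, here is a minimal NumPy sketch of a per-layer style cost. It is not the assignment's solution: the function names, the missing batch dimension, and the normalization constant (the usual 4 * n_C^2 * (n_H * n_W)^2) are assumptions for illustration. The point is only that, with the constant kept in floating point, the result cannot be negative.

```python
import numpy as np

def gram_matrix(A):
    # A has shape (n_C, n_H * n_W); the Gram matrix is A times its transpose.
    return A @ A.T

def layer_style_cost(a_S, a_G):
    # Hypothetical helper: activations have shape (n_H, n_W, n_C), no batch axis.
    n_H, n_W, n_C = a_S.shape
    # Unroll each activation volume to shape (n_C, n_H * n_W).
    S = a_S.reshape(n_H * n_W, n_C).T
    G = a_G.reshape(n_H * n_W, n_C).T
    GS, GG = gram_matrix(S), gram_matrix(G)
    # Keep the normalization constant as a float so it cannot overflow.
    norm = 4.0 * (n_C ** 2) * ((n_H * n_W) ** 2)
    return float(np.sum((GS - GG) ** 2) / norm)

rng = np.random.default_rng(0)
a_S = rng.normal(size=(4, 4, 3))
a_G = rng.normal(size=(4, 4, 3))

cost = layer_style_cost(a_S, a_G)
same_image_cost = layer_style_cost(a_S, a_S)
print(cost, same_image_cost)
```

A negative value from a function shaped like this would point at the normalization term rather than at the sum of squares itself.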

I went through and changed tf.square to **2. It shouldn’t have made a difference, but now it works. I must have had a typo or something that I fixed in the process. It’s weird that the unit tests and autograder would pass when the later results for the cost were so off.

Wow! This just solved my problem as well. Very strange indeed :confused:

Hey, could you tell me how you are using the optimizer line in the train_step function? I am still getting the same error despite converting tf.square to **2.

Hi Ash, I did not update the optimizer (it is already defined).

In the ‘train_step’ function, the only things I updated were a_G, J_style, J_content and J.

Hope this helps.

Hey Agestau, thanks for your reply, but it seems I am still stuck on the same error as you guys were, even after I changed the squares to ** 2 :no_mouth:

Hey Ash (team mystic here)

I’m tied up with childcare this afternoon, but I can help you trace through the expected output of the non-graded code blocks. Maybe that will help you find where the issue is. I can jump on around 7:30 or 8 pm Eastern.

Yeah me too, it’s getting pretty late here (it’s past 10 pm) so I have to log off.

However, please note that the error I was getting was not the same as Anomy was getting. Idk how the forum works, but I think you could try and find my entry. It’s just that my error was solved using the same method.

I would suggest copying and pasting your error message into the search, looking through the entries, and trying different things. Another piece of advice would be to log off for a few hours. This always gives me a new perspective, and I am able to notice if I’ve done something incorrectly. Good luck, and sorry I cannot be of more help.

Very good advice. Back in the day at my first job, when I was a game dev, my boss told me: “If you have a great idea at midnight, write it down and go to bed. It probably won’t be good in the morning.”

Welp I went to experiment with different images before I submitted, and I’m no longer getting a correct response. Will ping again when I figure out what it is.

Ok - so in the end it was because, after experimenting with my own images, I didn’t change the images back to the original ones. Section 6 “Test With Your Own Image” lists what the original files are called.

I’m glad I looked here for help. I had the exact same problem. Does anyone know why tf.square did not work but **2 did?

This is strange. I was stuck for an hour, then saw this post and replaced tf.square with **2 and all worked perfectly (this is with all unit tests working using tf.square).

After submitting, I went back to try and understand the problem and put back tf.square everywhere I had **2. This time, all worked with tf.square. I cannot reproduce the issue, but I know for certain that is the only change I made between it not working and working.

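This result comes from the difference between how Python and TensorFlow compute the square.
Python integers have arbitrary precision, but TensorFlow tensors have a fixed dtype. So (2 * n_H * n_W * n_C)**2 is computed as a Python int and cannot overflow, whereas tf.square first converts the Python int to an int32 tensor and squares it in int32. In UNQ_C3 the test dimensions are small, so it doesn’t matter.
But in 5.5.2 - Compute Style Cost the dimensions are large, and that is where the problem shows up.

The squared value is out of range for int32.
This is integer overflow: the result wraps around, sometimes to a negative number. You can see it in the log below, where the tf.square result is cast to float32 only after the overflow has already happened.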

In UNQ_C3,
J_style_layer = …
print(f'**2 : {type((2 * n_H * n_W * n_C)**2)}, value : {((2 * n_H * n_W * n_C)**2)}')
print(f'tf.square : {tf.cast(tf.square((2 * n_H * n_W * n_C)), tf.float32).dtype}, value : {tf.cast(tf.square((2 * n_H * n_W * n_C)), tf.float32)}')

Result in 5.5.2 - Compute Style Cost:
**2 : <class 'int'>, value : 104857600000000
tf.square : <dtype: 'float32'>, value : 268435456.0
**2 : <class 'int'>, value : 26214400000000
tf.square : <dtype: 'float32'>, value : -2080374784.0
**2 : <class 'int'>, value : 6553600000000
tf.square : <dtype: 'float32'>, value : -520093696.0
**2 : <class 'int'>, value : 409600000000
tf.square : <dtype: 'float32'>, value : 1578106880.0
**2 : <class 'int'>, value : 409600000000
tf.square : <dtype: 'float32'>, value : 1578106880.0
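The wrapped values in this log can be reproduced without TensorFlow. The sketch below uses NumPy's int32 as a stand-in (an assumption: it mirrors fixed-width 32-bit arithmetic, not TensorFlow's exact code path), with the factors taken as the square roots of the **2 values above. The helper names are made up for illustration.

```python
import numpy as np

# Values of 2 * n_H * n_W * n_C implied by the log above
# (each is the square root of the corresponding "**2" value).
factors = [10240000, 5120000, 2560000, 640000]

def square_as_python_int(x):
    """Square with Python's arbitrary-precision int: cannot overflow."""
    return x ** 2

def square_as_int32(x):
    """Square inside a fixed-width int32 array: wraps around on overflow."""
    return int(np.square(np.array([x], dtype=np.int32))[0])

def square_as_float(x):
    """Cast to float before squaring: keeps the right sign and magnitude."""
    return float(np.float64(x) ** 2)

for f in factors:
    print(f, square_as_python_int(f), square_as_int32(f), square_as_float(f))
```

The int32 column matches the tf.square values in the log, which suggests the fix is either to square in Python (**2) or to cast to a float dtype before squaring, not after.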

For what it’s worth, I had the same issue.

I fixed this by restarting the kernel and running all cells. With no changes, it just worked…

Hi,
I had the same issue and was stuck on it for more than an hour. Like @orokusaki said, I just restarted the kernel and ran all cells again. It worked like a charm!

Most likely that means you had changed the code, but had not actually executed the changed cell by directly clicking “Shift-Enter” to run it. Just typing new code and then calling the function again runs the old code. You can easily demonstrate this behavior to yourself: enter a syntax error in a function that passed its test case and then call it again. It still works. Now click “Shift-Enter” on the newly broken cell and call it again. Kabooom.
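The same effect can be simulated outside a notebook. In this rough sketch a plain dict stands in for the kernel's namespace and exec stands in for pressing Shift-Enter; both are illustrative assumptions, and toy_cost is a made-up function, not one from the assignment.

```python
# A dict stands in for the kernel's namespace (an illustrative assumption).
kernel_namespace = {}

# The cell as first written and executed: the cost has the wrong sign.
cell_source = "def toy_cost(x):\n    return -(x ** 2)\n"
exec(cell_source, kernel_namespace)

# The cell text is edited to fix the bug, but the cell is NOT executed.
cell_source = "def toy_cost(x):\n    return x ** 2\n"
stale_result = kernel_namespace["toy_cost"](3)  # still the old definition

# Executing the edited cell replaces the definition in the namespace.
exec(cell_source, kernel_namespace)
fresh_result = kernel_namespace["toy_cost"](3)

print(stale_result, fresh_result)
```

Restarting the kernel and running all cells has the same end result as the second exec: every definition in memory is rebuilt from the current cell text.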

No, I did hit Shift + Enter to run the code again (I saw the execution index of both the function cell and the test cell increase) @paulinpaloalto

I’m glad to hear that you were aware of the way “compiling” cells works. But then my theory is just that the change that you forgot to “execute” must have been in some other cell than the particular one you were concerned about. Note that the call graph of the style cost includes several of the earlier functions. This is Science, right? What’s your theory for how doing a “Kernel → Restart and Clear Output” fixed your bugs?