CNN Week 4 Assignment 2, Exercise 6, train_step

Good afternoon. I am almost done with Week 4 Assignment 2, Exercise 6, train_step, but I am getting the following error:

AssertionError Traceback (most recent call last)
5 print(J1)
6 assert type(J1) == EagerTensor, f"Wrong type {type(J1)} != {EagerTensor}"
----> 7 assert np.isclose(J1, 10221.168), f"Unexpected cost for epoch 0: {J1} != {10221.168}"
9 J2 = train_step(generated_image)

AssertionError: Unexpected cost for epoch 0: -128130720.0 != 10221.168

Now, I recognize that this means there’s some kind of numerical error, right? The functions run; they just produce incorrect numbers. I tried to follow the instructions, but thought it was simply a matter of copying and pasting work we had already done or been given in the previous slides. Perhaps it is more complicated than that? Is there something special we need to do other than copy/paste a_G, J_style, J_content, and J_total_cost from the previous problems and the cells immediately above?

Thank you,

Cost cannot be a negative number. So you have a math error.

Did anyone fix that? I am still getting this error, and it is weird since my implementation of compute_layer_style_cost() passed the test without giving any negative number. It seems that compute_layer_cost(), which is already implemented in the notebook and is not subject to any test, is the one causing the error.

Here’s a thread which explains one way to get negative cost values in this assignment. But you are saying you don’t get a negative cost, so maybe that is not relevant.

But in that case, what do you mean by “this error”? Please show us the exception trace that you are actually getting.

I found out what was wrong. I got exactly the same error that ChrisML described in the first post. The error came from the function compute_layer_style_cost(), in the lines that calculate the variable J_style_layer. To calculate one of the terms involved, I used tf.math.square(n_H * n_W), but this gives wrong values when the square of (n_H * n_W) is too large to fit in an int32 dtype: the result overflows and wraps around, possibly to a negative number. So, changing n_H and n_W (and also n_C, which needs to be squared for calculating the style cost) to int64 fixed it.
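For anyone who wants to see the overflow in isolation, here is a minimal sketch. It uses NumPy rather than TensorFlow so it runs anywhere, on the assumption that int32 wraparound behaves the same way in both. The dimensions are illustrative values I picked, not numbers taken from the notebook.

```python
import numpy as np

# Illustrative dimensions (an assumption, not taken from the notebook).
n_H, n_W = 400, 400

# Squaring in int32: 160000**2 = 25_600_000_000 is larger than 2**31 - 1,
# so the result silently wraps around instead of raising an error.
area_32 = np.array([n_H * n_W], dtype=np.int32)
squared_32 = np.square(area_32)

# Squaring in int64: the same value fits comfortably.
area_64 = area_32.astype(np.int64)
squared_64 = np.square(area_64)

print("int32 result:", squared_32[0])
print("int64 result:", squared_64[0])
```

The int32 result comes out negative, and dividing by it flips the sign of the cost, which would explain a value like the one in the assertion message. Casting the dimensions to a float type before squaring should avoid the problem as well.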


Thanks for explaining what happened! It is good to know that there is more than one way to get negative values in the style cost implementation.