C4 W4 Assignment 2 Exercise 6

Hi there,
When running the train_step function I’m getting this error:

<ipython-input-64-474817c30aec>:34 train_step  *
        J = total_cost(J_content, J_style, 10, 40)
    <ipython-input-15-373dbbefb38e>:20 total_cost  *
        J = alpha * J_content + beta * J_style
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:1141 binary_op_wrapper
        raise e
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:1125 binary_op_wrapper
        return func(x, y, name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:1447 _add_dispatch
        return gen_math_ops.add_v2(x, y, name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_math_ops.py:496 add_v2
        "AddV2", x=x, y=y, name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:506 _apply_op_helper

    TypeError: Input 'y' of 'AddV2' Op has type float64 that does not match type float32 of argument 'x'.

It looks like a type error occurs when running J = total_cost(J_content, J_style, 10, 40).
When I check the types of J_content and J_style, I see the following:

Tensor("mul_20:0", shape=(), dtype=float32)
Tensor("add_4:0", shape=(), dtype=float64)

I don’t think that’s what I should be expecting. From the earlier cells (5.5.1 and 5.5.2), the returned tensor represents the scalar content cost and the scalar style cost respectively, e.g.:
tf.Tensor(0.008070096, shape=(), dtype=float32)
tf.Tensor(-7424263.731670428, shape=(), dtype=float64)

I can’t understand why, when I calculate the content and style costs, I’m getting these strange tensor values. It might be that I’ve misinterpreted how to handle the generated_image input to train_step:
a_G = vgg_model_outputs(generated_image)
or how to call compute_style_cost():
J_style = compute_style_cost(a_S, a_G)

Help appreciated! I’ve been stuck for a while…

Debugging this assignment is really tricky, because most of the values you’re able to inspect within the notebook are only tensor layer definitions. Most of the printable concrete values only appear in the final cost calculations.

Your tf.Tensor for J_style seems incorrect. Here’s what I have (in Section 5.5.2):
tf.Tensor(598.82825, shape=(), dtype=float32)

So there may be a problem in your compute_layer_style_cost() function.

I say this because cost values are (by definition) always non-negative, since they are sums of squared differences, yet your J_style unit test value is a negative number.

In the display of your “Tensor” types:

  • In my notebook, I don’t have “mul_20:0”; I have “truediv_5:0”. That’s not necessarily a problem, though, since the name depends entirely on how you implemented the content cost code for this equation:
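(For reference, the content cost equation from the assignment, quoting from memory, so double-check it against your notebook:)

J_content(C, G) = (1 / (4 × n_H × n_W × n_C)) × Σ (a^(C) − a^(G))²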

  • The “add_4:0” name is correct, but the dtype should be float32. So there may be some small difference in your code for compute_layer_style_cost().
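As an aside on where a float64 can sneak in: plain Python floats and numpy’s default arrays are float64, and in numpy a single float64 operand promotes the whole expression. If a value like that flows into your style cost, the resulting tensor ends up float64, and TF then refuses to add it to the float32 content cost (the AddV2 TypeError above). A quick numpy sketch (illustrative only, not the assignment code):

```python
import numpy as np

a32 = np.array([1.0, 2.0], dtype=np.float32)  # like a float32 activation
b64 = np.array([0.5, 0.5])                    # numpy defaults to float64

# One float64 operand promotes the whole product to float64.
result = a32 * b64
print(a32.dtype, b64.dtype, result.dtype)  # float32 float64 float64
```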

The default name given to a tensor comes from the last operation in the calculation that produced it.

If you get negative values for the cost on this one, it’s likely a type coercion problem caused by mixing integers with floats in TF math operations. Here’s a thread which shows at least one way to get negative values.
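For example, here is one general way integer math can produce a negative “cost” (a sketch of the mechanism, not necessarily the exact bug in this thread): squared differences computed in a 32-bit integer type can overflow and wrap to a negative value, while the same math in float32 behaves as expected:

```python
import numpy as np

# Squared differences should always be >= 0, but 32-bit integer
# arithmetic wraps on overflow instead of widening the type.
diff = np.array([50000], dtype=np.int32)
squared = diff * diff  # 2,500,000,000 exceeds int32 max (~2.1e9)
print(squared)         # a negative number

# The same math in float32 stays positive:
print(np.square(diff.astype(np.float32)))
```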

Since the absolute value of your result is also much larger than Tom’s, I suggest you also look for “order of operations” issues in that fractional factor. Try this and watch what happens:

m = 5.
x = 1. / 2. * m      # 2.5, evaluates left to right as (1. / 2.) * m
y = 1. / (2. * m)    # 0.1

The fractional expression in question is pretty complicated and it’s easy to get the parentheses wrong.
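To make that concrete for the style-layer normalization factor 1 / (4 × n_C² × (n_H × n_W)²), here is a sketch with made-up layer dimensions (hypothetical values, not the assignment’s) showing how misplaced parentheses change the factor enormously:

```python
n_C, n_H, n_W = 3, 4, 4  # hypothetical layer dimensions

correct = 1 / (4 * n_C**2 * (n_H * n_W)**2)  # 1 / 9216
wrong = 1 / 4 * n_C**2 * (n_H * n_W)**2      # evaluates as (1/4) * 9 * 256 = 576.0

print(correct, wrong)  # the two differ by a factor of about 5.3 million
```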


Hi @TMosh and @paulinpaloalto,
Thank you for taking the time to reply.
@TMosh you’re right, debugging is very difficult, especially within the notebook.
@paulinpaloalto the link you shared is spot on. That’s precisely the problem I was experiencing in the function that calculates the style cost: the division was using integers.

I put that down to my lack of programming / Python experience.
I’m relieved that the problem wasn’t to do with elementary “order of operations” :slight_smile: though I might have had more chance of finding that sooner :slight_smile: