Even though my calculation function looks correct, I am getting an ‘incorrect value’ error. This is the function:
J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4 * (n_H * n_W)**2 * n_C**2)
Can you write your formula in code or LaTeX math expression format? Your current formula renders ** as bold text instead of as a power operator.
I tried this formula: J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4 * (n_H * n_W)^2 * n_C^2). The result is the same: 9.216.
The test function expects the value 14.01649, but our calculation gives 2.9225328. I also tried the following denominator: (4 * (n_H * n_W) * (n_H * n_W) * (n_C * n_C)).
Is there an error in the test function's expected result?
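As a side note on the denominators tried above, here is a minimal sketch of how Python evaluates each spelling. It assumes n_H, n_W and n_C are plain Python integers, and the values used are hypothetical, chosen only for illustration. In Python, ** is exponentiation while ^ is bitwise XOR, so the ^ version is not a power at all.

```python
# Hypothetical dimensions, chosen only for illustration.
n_H, n_W, n_C = 4, 4, 3

# ** is Python's exponentiation operator.
power = 4 * (n_H * n_W) ** 2 * n_C ** 2

# The same quantity with the squares written out as products.
expanded = 4 * (n_H * n_W) * (n_H * n_W) * (n_C * n_C)

# ^ is bitwise XOR and binds more loosely than *, so this is
# (4 * n_H * n_W) XOR (2 * n_C) XOR 2, not a power.
caret = 4 * (n_H * n_W) ^ 2 * n_C ^ 2

print(power == expanded)  # the two correct spellings agree
print(caret == power)     # the XOR spelling does not
```

So the ** version and the written-out product are interchangeable, and neither is the source of a wrong value.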
Hello, @metin_erturkler,
Your formula looks fine, and I assure you that the expected value is correct.
The problem likely lies in the code that computes a_S and a_G. In fact, you must use both tf.transpose and tf.reshape to do the job instead of just the latter, because the job requires you to reorder the dimensions. While reshape can change the shape of the tensor, reordering requires transpose. Mentor Paul has explained it in a link in this post.
Cheers,
Raymond
@metin_erturkler, to elaborate a bit further, the job was to produce a tensor of shape (n_C, n_H * n_W) from a tensor of shape (1, n_H, n_W, n_C).
If we examine the shapes carefully, we see that it is a two-step operation.
The first would be to change it from (1, n_H, n_W, n_C) to (n_H * n_W, n_C). Here we see that the ordering does not change: it is still Height first, followed by Width and then Channel.
The second is from (n_H * n_W, n_C) to (n_C, n_H * n_W). In this step, we reorder it from a so-called “channel-last” tensor to a “channel-first” one because, after the change, we have Channel first, followed by Height and then Width.
While reshape can take care of a shape change that does not change the order, transpose is needed for a change that involves reordering.
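The two steps can be sketched as follows. This is only an illustration, not the assignment's solution: it uses NumPy as a stand-in (NumPy's reshape and transpose read values in the same row-major order as tf.reshape and tf.transpose), and the dimensions and the tensor `a` are hypothetical.

```python
import numpy as np

# Hypothetical small dimensions, chosen only for illustration.
n_H, n_W, n_C = 2, 3, 4

# A "channel-last" activation tensor of shape (1, n_H, n_W, n_C).
a = np.arange(n_H * n_W * n_C).reshape(1, n_H, n_W, n_C)

# Step 1: reshape merges Height and Width without reordering any values.
step1 = a.reshape(n_H * n_W, n_C)

# Step 2: transpose swaps the two axes, giving "channel-first".
unrolled = step1.T  # shape (n_C, n_H * n_W)

# Reshaping straight to the target shape gives the right shape,
# but the values end up in the wrong places.
reshape_only = a.reshape(n_C, n_H * n_W)

print(unrolled.shape, reshape_only.shape)       # identical shapes
print(np.array_equal(unrolled, reshape_only))   # different contents

# Each row of `unrolled` holds exactly one channel of `a`.
print(np.array_equal(unrolled[1], a[0, :, :, 1].ravel()))
```

Both results have the same shape, which is why a reshape-only version runs without complaint yet produces a different Gram matrix and therefore a different style cost.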
To see this, consider a simple $2 \times 3$ matrix like
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$$
To change it into a $3 \times 2$ matrix, we can use either reshape or transpose, but they have different effects.
reshape gives us $B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$, and if we read the numbers in $A$ and $B$ row by row (left to right, then top to bottom), we find that the ordering of the numbers is the same: in both it is 1, 2, 3, 4, 5 and 6.
However, with transpose, $A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$, the order changes to 1, 4, 2, 5, 3 and 6, because if we consider $A$ to be row-value-first, $A^T$ is column-value-first. transpose has changed the order of the dimensions.
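The same matrix example can be checked in a few lines. This sketch uses NumPy as a stand-in for the TensorFlow calls, on the assumption that you only want to see the ordering; tf.reshape and tf.transpose arrange the values the same way.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

B = A.reshape(3, 2)  # new shape, same reading order as A
A_T = A.T            # rows and columns swapped

print(B.tolist())            # [[1, 2], [3, 4], [5, 6]]
print(A_T.tolist())          # [[1, 4], [2, 5], [3, 6]]

# Reading each result row by row:
print(B.ravel().tolist())    # [1, 2, 3, 4, 5, 6]
print(A_T.ravel().tolist())  # [1, 4, 2, 5, 3, 6]
```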
Cheers,
Raymond
Dear @rmwkwok,
Thank you so much for the detailed explanation and assistance. It really helped me understand the process better. I now see that the job is to produce a tensor of shape (n_C, n_H * n_W) from a tensor of shape (1, n_H, n_W, n_C), and I now understand the two-step operation involved.
Your explanation of how reshape and transpose work was especially helpful. The distinction between changing shape without reordering (using reshape) and changing the order of dimensions (using transpose) is much clearer to me now. I will definitely keep this in mind for future reference.
Thanks again for your support!
Best regards,
You are welcome, @metin_erturkler, and thank you for sharing your feedback with me.
Cheers!