I am confused about the use of tf.reshape() and tf.transpose() in the compute_content_cost() and compute_layer_style_cost() functions, and about why the shapes of the unrolled matrices differ between the two.
According to the figure above Exercise 1, compute_content_cost(), the 3D volume is unrolled into a 2D matrix of shape (n_C, n_H * n_W). However, the additional hints for unrolling state that
To unroll the tensor, you want the shape to change from (m, n_H, n_W, n_C) to (m, n_H * n_W, n_C).
Why is that? That's not what the figure shows. What am I missing here?
In fact, both
tf.reshape(a_C, shape=[m, n_H * n_W, n_C])
and
tf.reshape(a_C, shape=[m, n_C, n_H * n_W])
pass the test. No need for tf.transpose(), apparently. More confusion!
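One possible explanation for why both reshapes pass, sketched in NumPy rather than TensorFlow (an assumption made so the snippet runs without TensorFlow; np.reshape uses the same row-major ordering as tf.reshape). The sketch assumes the content cost is the normalized sum of squared element-wise differences. The function name content_cost and the random activations are hypothetical. If that assumption holds, any reshape applied identically to a_C and a_G only rearranges the same pairs of elements, so the sum cannot change:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_H, n_W, n_C = 1, 4, 4, 3

# Hypothetical activations standing in for the content and generated images.
a_C = rng.standard_normal((m, n_H, n_W, n_C))
a_G = rng.standard_normal((m, n_H, n_W, n_C))


def content_cost(a_C_unrolled, a_G_unrolled):
    # Assumed formula: sum of squared differences, scaled by 1 / (4 * n_H * n_W * n_C).
    return np.sum((a_C_unrolled - a_G_unrolled) ** 2) / (4 * n_H * n_W * n_C)


# Shape from the hint: (m, n_H * n_W, n_C)
cost_a = content_cost(a_C.reshape(m, n_H * n_W, n_C), a_G.reshape(m, n_H * n_W, n_C))

# Shape suggested by the figure, reached by reshape alone: (m, n_C, n_H * n_W)
cost_b = content_cost(a_C.reshape(m, n_C, n_H * n_W), a_G.reshape(m, n_C, n_H * n_W))

# No unrolling at all
cost_raw = content_cost(a_C, a_G)

print(np.isclose(cost_a, cost_b), np.isclose(cost_a, cost_raw))
```

Under this assumption the unrolling step does not affect the content cost value at all, which would explain why the test cannot tell the two shapes apart.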
Then, in Exercise 3, compute_layer_style_cost(), it is stated that
the desired unrolled matrix shape is (n_C, n_H * n_W)
… which is indeed what I would expect. Why is the shape different here? And why do we need both tf.reshape() and tf.transpose() here, but not in the content cost function?
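A minimal NumPy sketch of why the style cost may be sensitive to this where the content cost is not. It assumes the style cost is built on a Gram matrix of the form A · Aᵀ, where each row of A holds one channel's activations. The variable names and the toy tensor are hypothetical. Reshape alone produces the right shape but mixes channels within each row, while a transpose (before or after the reshape) keeps each channel in its own row:

```python
import numpy as np

m, n_H, n_W, n_C = 1, 2, 2, 3

# Toy activation tensor with distinct values so element positions are easy to track.
a_S = np.arange(n_H * n_W * n_C, dtype=float).reshape(m, n_H, n_W, n_C)

# Transpose first: move channels to the front, then flatten the spatial dims.
unrolled_ok = np.transpose(a_S[0], (2, 0, 1)).reshape(n_C, n_H * n_W)

# Equivalent route: flatten the spatial dims first, then transpose.
unrolled_alt = a_S.reshape(n_H * n_W, n_C).T

# Reshape only: same shape (n_C, n_H * n_W), but rows no longer correspond to channels.
unrolled_bad = a_S.reshape(n_C, n_H * n_W)

# Check whether row c really contains channel c's activations.
rows_ok = all(np.array_equal(unrolled_ok[c], a_S[0, :, :, c].ravel()) for c in range(n_C))
rows_bad = all(np.array_equal(unrolled_bad[c], a_S[0, :, :, c].ravel()) for c in range(n_C))

# Gram matrices: channel-by-channel correlations, shape (n_C, n_C).
gram_ok = unrolled_ok @ unrolled_ok.T
gram_bad = unrolled_bad @ unrolled_bad.T

print(rows_ok, rows_bad, np.allclose(gram_ok, gram_bad))
```

Unlike an element-wise sum of squares, the matrix product depends on which elements share a row, so the element ordering matters here and not only the final shape.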