It looks like you instrumented your gram_matrix function to print GA, so I added the same print along with some other instrumentation in compute_style_layer_cost, and here's what I see (a sketch of the prints I added follows the output):
type(a_G) <class 'tensorflow.python.framework.ops.EagerTensor'>
m is 1
shape(a_S) [1 4 4 3]
shape(a_G) [1 4 4 3]
after reshape+T: shape(a_S) [ 3 16]
after reshape+T: shape(a_G) [ 3 16]
a_S [[ 2.6123514 6.3462267 -2.0568852 1.0848972 -0.34144378 5.9450154
-0.68249106 6.6380787 4.405425 3.2136106 -0.88850987 8.216282
1.1940846 -3.8442307 4.486284 -3.2315984 ]
[-3.3520837 3.8470404 -3.1489944 -1.2055032 -3.17067 -1.7347562
-3.1652112 -0.90944517 0.31713337 0.23800504 -0.10706711 0.6901974
1.7071393 6.2297974 -2.2124014 -5.5964684 ]
[ 0.74761856 -0.95714605 -4.0077353 -5.972679 5.036553 3.6944358
-1.7189786 9.18924 2.566379 1.4399388 -1.7099016 3.6196625
-0.9568796 2.8206615 -4.4811783 3.4338741 ]]
GA =
[[277.3386 1.9207706 84.06223 ]
[ 1.9207706 139.91864 -1.1867776]
[ 84.06223 -1.1867776 244.99716 ]]
GA =
[[277.3386 1.9207706 84.06223 ]
[ 1.9207706 139.91864 -1.1867776]
[ 84.06223 -1.1867776 244.99716 ]]
J_style_layer 0.0
type(a_G) <class 'tensorflow.python.framework.ops.EagerTensor'>
m is 1
shape(a_S) [1 4 4 3]
shape(a_G) [1 4 4 3]
after reshape+T: shape(a_S) [ 3 16]
after reshape+T: shape(a_G) [ 3 16]
a_S [[-3.404881 -2.5186315 1.3512313 -1.8821764 -0.39341784 5.434381
-0.2787553 3.5750656 -2.616547 1.2345984 0.25895953 -2.933355
-5.2402277 0.19823879 1.3253293 -0.41526246]
[ 7.183007 -3.8986888 0.18695849 -1.5039697 -0.34587932 6.118635
2.493302 9.585233 6.579515 -0.9685255 -0.5711074 2.555351
0.36834985 7.452428 6.3523054 0.583993 ]
[ 2.534576 -2.9244845 -1.2326248 -1.8601038 1.730303 0.91409665
2.0111642 -2.3005989 5.8995004 -2.2799122 -1.6340904 -3.1489797
-0.42677724 3.718691 5.739221 -2.0045857 ]]
GA =
[[112.29061 37.05909 -1.9554844]
[ 37.05909 352.1803 116.86712 ]
[ -1.9554844 116.86712 136.64961 ]]
GA =
[[277.3386 1.9207706 84.06223 ]
[ 1.9207706 139.91864 -1.1867776]
[ 84.06223 -1.1867776 244.99716 ]]
J_style_layer 14.01648998260498
J_style_layer = tf.Tensor(14.01649, shape=(), dtype=float32)
All tests passed
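For reference, here is roughly the instrumentation that produces output like the above. This is just a sketch: the print calls are the additions, the function and variable names follow this thread, and the body assumes the reshape-then-transpose unroll plus the usual per-layer style cost, so your notebook's exact code may differ.

```python
import tensorflow as tf

def gram_matrix(A):
    # A has shape (n_C, n_H * n_W); GA is the (n_C, n_C) Gram matrix.
    GA = tf.matmul(A, tf.transpose(A))
    tf.print("GA =\n", GA)                          # added instrumentation
    return GA

def compute_style_layer_cost(a_S, a_G):
    print("type(a_G)", type(a_G))                   # added instrumentation
    m, n_H, n_W, n_C = a_G.get_shape().as_list()
    print("m is", m)
    tf.print("shape(a_S)", tf.shape(a_S))
    tf.print("shape(a_G)", tf.shape(a_G))

    # Unroll to (n_C, n_H * n_W): reshape with channels kept last, then transpose.
    a_S = tf.transpose(tf.reshape(a_S, [n_H * n_W, n_C]))
    a_G = tf.transpose(tf.reshape(a_G, [n_H * n_W, n_C]))
    tf.print("after reshape+T: shape(a_S)", tf.shape(a_S))
    tf.print("after reshape+T: shape(a_G)", tf.shape(a_G))
    tf.print("a_S", a_S)

    GS = gram_matrix(a_S)
    GG = gram_matrix(a_G)

    # Standard per-layer style cost.
    J_style_layer = tf.reduce_sum(tf.square(GS - GG)) / (4.0 * (n_C ** 2) * ((n_H * n_W) ** 2))
    print("J_style_layer", J_style_layer.numpy())
    return J_style_layer
```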
You can see that the GA values I get are significantly different from yours. One common mistake is to reshape a_S and a_G directly to the final (n_C, n_H * n_W) shape, instead of first reshaping to (n_H * n_W, n_C) and then transposing, which is what preserves the channels dimension.
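To make that concrete, here is a minimal sketch of the two versions, using the (1, 4, 4, 3) shape from the test output above; only the second one is correct:

```python
import tensorflow as tf

m, n_H, n_W, n_C = 1, 4, 4, 3               # shapes from the test output above
a_S = tf.random.normal([m, n_H, n_W, n_C])

# Incorrect: reshaping straight to the final shape interleaves values from
# different channels, so the rows of the result are not the channels.
a_S_wrong = tf.reshape(a_S, [n_C, n_H * n_W])

# Correct: collapse the spatial dims first (channels stay in the last axis),
# then transpose to get shape (n_C, n_H * n_W) with one channel per row.
a_S_right = tf.transpose(tf.reshape(a_S, [n_H * n_W, n_C]))

print(a_S_wrong.shape, a_S_right.shape)     # both (3, 16), but only one is right
```

The reshape works in the second version because the channels dimension is the last (fastest-varying) axis in memory, so collapsing only the spatial dimensions keeps each channel's values together; the transpose then just moves the channels onto the rows.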
Here’s an earlier thread about this issue, and that thread points to this one, which demonstrates why the direct reshape by itself doesn’t work.
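The gist of that demonstration looks something like this (a sketch with a small made-up tensor where each channel is constant, so it is easy to see which unroll keeps the channels intact):

```python
import tensorflow as tf

# Tiny made-up activation: 1 image, 2 x 2 spatial, 3 channels, with channel c
# filled entirely with the value c + 1.
a = tf.broadcast_to(tf.constant([1.0, 2.0, 3.0]), [1, 2, 2, 3])

wrong = tf.reshape(a, [3, 4])                # direct reshape to (n_C, n_H * n_W)
right = tf.transpose(tf.reshape(a, [4, 3]))  # reshape to (n_H * n_W, n_C), then transpose

print(wrong.numpy())
# [[1. 2. 3. 1.]
#  [2. 3. 1. 2.]
#  [3. 1. 2. 3.]]   <- rows mix values from different channels
print(right.numpy())
# [[1. 1. 1. 1.]
#  [2. 2. 2. 2.]
#  [3. 3. 3. 3.]]   <- each row is exactly one channel
```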