I calculated the per-sample loss with tf.keras.losses.categorical_crossentropy and used tf.reduce_sum() to get the total loss, but the test still fails with this error:
Test 1 does not match. Did you get the reduce sum of your loss functions?
I figured it out: tf.keras.losses.categorical_crossentropy expects the inputs in a different layout. By default it treats the last axis as the class axis, so transposing both inputs before passing them to categorical_crossentropy solves the problem.
Note that recent TF versions (mine is 2.15.0) accept an axis argument, so setting axis=0 instead of transposing also works.
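
For anyone hitting the same error, here is a minimal sketch of both fixes. The toy tensors and the (num_classes, num_examples) layout are my own assumptions for illustration, and so is from_logits=True (I am assuming the inputs are raw logits, not softmax outputs); adjust both to match your assignment.

```python
import tensorflow as tf

# Hypothetical toy data laid out as (num_classes, num_examples):
# 3 classes (rows), 2 examples (columns).
logits = tf.constant([[2.0, 0.5],
                      [1.0, 1.5],
                      [0.1, 3.0]])
labels = tf.constant([[1.0, 0.0],
                      [0.0, 0.0],
                      [0.0, 1.0]])

# Without a fix, the last axis (the examples) is treated as the class axis,
# so this yields one value per class row, shape (3,), which is not what we want.
wrong = tf.keras.losses.categorical_crossentropy(labels, logits, from_logits=True)

# Fix 1: transpose both inputs to (num_examples, num_classes).
per_sample_t = tf.keras.losses.categorical_crossentropy(
    tf.transpose(labels), tf.transpose(logits), from_logits=True)
total_transposed = tf.reduce_sum(per_sample_t)

# Fix 2: keep the layout and tell the function that classes are on axis 0.
per_sample_a = tf.keras.losses.categorical_crossentropy(
    labels, logits, from_logits=True, axis=0)
total_axis0 = tf.reduce_sum(per_sample_a)

print(wrong.shape, per_sample_t.shape, per_sample_a.shape)
print(float(total_transposed), float(total_axis0))
```

Both fixes should give one loss value per example and the same total; the unfixed call gives one value per class row instead, which is why the sum does not match the expected result.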