Calculating the gradient of a higher-order function

In the lecture we are finding the gradient on a tensor of shape (2, 2),
and I'm curious why we have to do a reduce sum first:
y = tf.reduce_sum(x)

z is the actual function, which can be written as f(x) = y^2, where x is the tensor input. How do we come to the conclusion to take reduce_sum of x and use that as the input to the function above?
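For reference, here is a minimal sketch of the setup as I understand it (assuming the standard tf.GradientTape pattern; the concrete values are illustrative, not from the lecture):

```python
import tensorflow as tf

x = tf.ones((2, 2))  # the (2, 2) input tensor

with tf.GradientTape() as tape:
    tape.watch(x)            # track x so gradients can flow back to it
    y = tf.reduce_sum(x)     # collapse the tensor to a scalar: y = sum of entries = 4.0
    z = tf.multiply(y, y)    # z = y^2 = 16.0, the scalar we differentiate

# dz/dx_ij = 2y = 8.0 for every entry, so grad has the same shape as x
grad = tape.gradient(z, x)
print(grad)  # tf.Tensor([[8. 8.] [8. 8.]], ...)
```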

I’m trying to understand this from a purely mathematical point of view.

He is just using tf.reduce_sum(x) as the y function; it's his choice! It might as well have been any other function, as long as it collapses the tensor x to a scalar, because the gradient dz/dx is taken of a scalar-valued z with respect to the tensor x.
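Mathematically: with y = sum_ij x_ij and z = y^2, the chain rule gives dz/dx_ij = 2y * dy/dx_ij = 2y for every entry, since dy/dx_ij = 1. Here is a quick sketch swapping in a different scalar reduction (tf.reduce_mean, purely as an illustration) to show the choice really is arbitrary:

```python
import tensorflow as tf

x = tf.ones((2, 2))

with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.reduce_mean(x)   # a different scalar reduction: y = (1/4) * sum(x) = 1.0
    z = y * y               # same outer function, z = y^2

# Chain rule: dz/dx_ij = 2y * (1/4) = 0.5 for every entry
grad = tape.gradient(z, x)
print(grad)  # tf.Tensor([[0.5 0.5] [0.5 0.5]], ...)
```

Either way, the important part is that z ends up a scalar, so tape.gradient(z, x) returns one partial derivative per entry of x.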