I am using a custom conv layer for forward prop and backward prop. Now I want to check whether backward prop is actually using the weights that I supply in my custom layer.

What should I do?

Also, if there are resources available, I would like to understand backward prop in more detail.

@Amit_IITB What exactly do you mean by a custom conv layer? Just a different filter size, stride, or padding? And are you doing this as part of the pure NumPy implementation or in TensorFlow?

@Nevermnd I am using PyTorch autograd and a straight-through estimator (STE) to define the custom layer.

I am using different weights in forward prop and backward prop, so I want to check whether this is being done correctly. If it is working correctly, then I want to understand how.
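To make the setup concrete, a layer like this can be written as a custom `torch.autograd.Function`. This is only a minimal sketch using a linear op instead of a conv for brevity; the names `w_fwd` and `w_bwd` are illustrative, with `w_fwd` used in the forward pass and a separate `w_bwd` carrying the gradient in the backward pass:

```python
import torch

class AsymmetricLinear(torch.autograd.Function):
    """Uses w_fwd in the forward pass but routes the incoming gradient
    through a different matrix w_bwd in the backward pass."""

    @staticmethod
    def forward(ctx, x, w_fwd, w_bwd):
        ctx.save_for_backward(x, w_bwd)
        return x @ w_fwd.t()

    @staticmethod
    def backward(ctx, grad_out):
        x, w_bwd = ctx.saved_tensors
        grad_x = grad_out @ w_bwd       # input gradient uses the *backward* weights
        grad_w_fwd = grad_out.t() @ x   # gradient w.r.t. the forward weights
        return grad_x, grad_w_fwd, None  # no gradient for w_bwd itself

# Quick usage check
x = torch.randn(4, 3, requires_grad=True)
w_fwd = torch.randn(5, 3, requires_grad=True)
w_bwd = torch.randn(5, 3)
y = AsymmetricLinear.apply(x, w_fwd, w_bwd)
y.sum().backward()
```

After `backward()`, `x.grad` should equal `grad_out @ w_bwd` (here `grad_out` is all ones because of the `.sum()`), which is exactly the property you can inspect to confirm the backward pass used the weights you intended.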

@Amit_IITB Ah, okay, so this isn't from this course. Sorry, I know TF but not PyTorch, so someone else might have the answer.

@Nevermnd Okay, thanks for replying. It is related to a small project I am doing. I completed this course two months ago.

Sorry, but what do you mean by that statement? In every layer of every model I’ve ever seen there is one set of weights (parameters): in forward prop you simply use those weights to compute the output; in backward prop you compute gradients with respect to those weights and then update them so that they give a better result on the next iteration of forward prop.

Torch autograd is the same general kind of mechanism as “autodiff” in TF: it records the operations you perform and applies the chain rule using the known analytic derivatives of each primitive op. It does not use finite difference approximations for the actual backward pass; finite differences only appear in utilities like `torch.autograd.gradcheck`, which verify your gradients numerically. So it should “just work” if you’ve defined everything in a consistent way.
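One concrete way to verify a backward implementation is `torch.autograd.gradcheck`, which compares the analytic backward against a finite-difference estimate. One caveat for your case: with a straight-through estimator the check is *expected* to report a mismatch, since the STE backward is deliberately not the true gradient. A minimal sketch on an ordinary differentiable function:

```python
import torch

def f(x):
    # Any differentiable function works; gradcheck perturbs the input
    # numerically and compares against autograd's analytic gradients.
    return (x ** 3).sum()

# gradcheck wants double precision for tight numerical tolerances
x = torch.randn(5, dtype=torch.double, requires_grad=True)
ok = torch.autograd.gradcheck(f, (x,))  # True when the backward is correct
```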

Torch will let you extract and print or otherwise examine the parameters if you like. But making sense of a big tensor full of weights is a challenge all by itself: how would you know whether those are good values just by looking at them? Alternatively, you can judge whether things are working from the results: does your gradient descent actually converge?

For custom layers in TensorFlow, we use `tf.GradientTape` to find derivatives. I am sure there is something similar available in PyTorch, but I am not aware of it.
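For what it’s worth, PyTorch’s counterpart is implicit: operations on tensors with `requires_grad=True` are recorded automatically, and `.backward()` (or `torch.autograd.grad`) plays the role of the tape. A minimal sketch:

```python
import torch

# No explicit tape: the graph is recorded as you compute
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x
y.backward()

# dy/dx = 2x + 3, so at x = 2 the gradient is 7
print(x.grad.item())  # → 7.0
```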

To find the derivative of a custom layer, you first need to know some calculus: work it out on paper. Then you should be familiar with how gradient tape works. The Advanced TensorFlow specialization covers this TF method. You may also need to learn how to write a custom training loop.
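As a convergence-based sanity check of the kind mentioned above, a minimal custom training loop might look like the following. This is sketched in PyTorch, since that is what the original question uses, and fits a toy linear regression where the true parameters (`w = 3`, `b = 1`) are known, so you can tell at a glance whether gradients are flowing correctly:

```python
import torch

model = torch.nn.Linear(1, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(64, 1)
y = 3 * x + 1  # known ground truth: weight 3, bias 1

for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()  # swap in a custom layer here and watch whether loss still falls
    opt.step()
```

If the loss fails to shrink after you substitute your custom layer, that is strong evidence the backward pass is not consistent with the forward pass.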