In the RNN backward pass, I don't understand why dWax = dtanh(a_next) . x. Please help, thanks.
Assume the cost function is J. According to the chain rule and equation (1):
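Equation (1) is not shown in the post, so the following is only a sketch under an assumption: that it is the usual RNN cell, z = Wax . x + Waa . a_prev + ba with a_next = tanh(z). Under that assumption the chain rule gives dJ/dz = da_next * (1 - a_next^2), which is what the question calls dtanh, and because z is linear in Wax, dWax = dtanh . x^T. The function names and the toy cost J = sum(a_next) below are hypothetical, chosen only so the formula can be compared against a finite-difference gradient.

```python
import numpy as np


def rnn_cell_forward(x, a_prev, Wax, Waa, ba):
    # Assumed form of equation (1): a_next = tanh(Wax x + Waa a_prev + ba)
    return np.tanh(Wax @ x + Waa @ a_prev + ba)


def dWax_analytic(x, a_next, da_next):
    # Chain rule through tanh: dJ/dz = da_next * (1 - tanh(z)^2)
    dtanh = (1.0 - a_next ** 2) * da_next
    # z is linear in Wax, so dJ/dWax = dJ/dz . x^T (summed over the batch)
    return dtanh @ x.T


def dWax_numeric(x, a_prev, Wax, Waa, ba, eps=1e-6):
    # Central finite differences on the toy cost J = sum(a_next)
    grad = np.zeros_like(Wax)
    for i in range(Wax.shape[0]):
        for j in range(Wax.shape[1]):
            W_plus = Wax.copy()
            W_plus[i, j] += eps
            W_minus = Wax.copy()
            W_minus[i, j] -= eps
            J_plus = rnn_cell_forward(x, a_prev, W_plus, Waa, ba).sum()
            J_minus = rnn_cell_forward(x, a_prev, W_minus, Waa, ba).sum()
            grad[i, j] = (J_plus - J_minus) / (2.0 * eps)
    return grad


# Hypothetical sizes: n_x inputs, n_a hidden units, m examples in the batch
rng = np.random.default_rng(0)
n_x, n_a, m = 3, 4, 5
x = rng.standard_normal((n_x, m))
a_prev = rng.standard_normal((n_a, m))
Wax = rng.standard_normal((n_a, n_x))
Waa = rng.standard_normal((n_a, n_a))
ba = rng.standard_normal((n_a, 1))

a_next = rnn_cell_forward(x, a_prev, Wax, Waa, ba)
da_next = np.ones_like(a_next)  # dJ/da_next for the toy cost J = sum(a_next)

analytic = dWax_analytic(x, a_next, da_next)
numeric = dWax_numeric(x, a_prev, Wax, Waa, ba)
max_abs_diff = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_abs_diff)
```

If the two gradients agree to within numerical error, that supports reading dWax as the product of dtanh with the transpose of x. The transpose is what makes the shapes work: dtanh is (n_a, m) and x is (n_x, m), so dtanh . x^T has the same (n_a, n_x) shape as Wax.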