RNN Architecture

Hello Team,

I feel extremely happy to start the 5th course of this deep learning specialization.
My question is about the RNN architecture where Tx = Ty.

We are passing a<1> to the 2nd layer in the RNN. I believe that a<1> is not equal to yhat<1>. Correct me if I am wrong.
Why are we passing a<1> instead of yhat<1> to the 2nd layer in the RNN?


Hi ajaykumar3456,

a<1> is the activation value of the RNN at the first time step. It is computed from the input x<1> using the network's trainable (or trained) parameters, and it determines the translation from input x<1> to output yhat<1>. To determine the relation between x<2> and yhat<2>, the information captured by a<1> is passed to the second time step, where it is combined with x<2> to produce a<2>, which in turn determines the value of yhat<2>. This allows information from the previous step to help produce the output at the step that follows.
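
Written out, the recurrence looks roughly like this. This is a sketch in the course's notation, assuming tanh as the hidden activation and a generic output function g (for example softmax or sigmoid); a<0> is typically initialized to zeros.

```latex
\begin{align}
a^{\langle t \rangle} &= \tanh\left(W_{aa}\, a^{\langle t-1 \rangle} + W_{ax}\, x^{\langle t \rangle} + b_a\right) \\
\hat{y}^{\langle t \rangle} &= g\left(W_{ya}\, a^{\langle t \rangle} + b_y\right)
\end{align}
```

Only a<t> appears on the right-hand side of the next step's update; yhat<t> is read off from a<t> and is not fed forward in this basic architecture.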

So a refers to the activation values in the RNN, whereas x is the input and yhat is the output. The output depends on the activation values through a function with its own parameters, which selects the output value based on the activation values. See the video at 10:00. So a is different from yhat, and a is needed to determine the translation from x to y by passing information from one time step to the next.
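
To make the difference between a and yhat concrete, here is a minimal NumPy sketch of a forward pass with Tx = Ty. It is an illustration under assumptions, not the course's assignment code: the sizes (n_x, n_a, n_y, T), the random weights, and the helper names `softmax` and `rnn_step` are all hypothetical, and tanh/softmax are assumed as the activation and output functions.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the rows (axis 0) of a column vector.
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def rnn_step(x_t, a_prev, Waa, Wax, Wya, ba, by):
    # Hidden activation: combines the previous activation with the current input.
    a_t = np.tanh(Waa @ a_prev + Wax @ x_t + ba)
    # Output: a separate function of a_t, with its own parameters Wya and by.
    yhat_t = softmax(Wya @ a_t + by)
    return a_t, yhat_t

# Hypothetical sizes: input dim, hidden dim, output dim, number of time steps.
n_x, n_a, n_y, T = 3, 5, 2, 4
rng = np.random.default_rng(0)
Waa = rng.standard_normal((n_a, n_a)) * 0.1
Wax = rng.standard_normal((n_a, n_x)) * 0.1
Wya = rng.standard_normal((n_y, n_a)) * 0.1
ba = np.zeros((n_a, 1))
by = np.zeros((n_y, 1))

xs = rng.standard_normal((T, n_x, 1))  # made-up inputs x<1>..x<T>
a = np.zeros((n_a, 1))                 # a<0>
outputs = []
for t in range(T):
    # Only a is carried to the next time step; yhat is just collected.
    a, yhat = rnn_step(xs[t], a, Waa, Wax, Wya, ba, by)
    outputs.append(yhat)

print(a.shape, outputs[0].shape)
```

Note that a and yhat do not even need to have the same dimension here (n_a versus n_y), which is one way to see that they are different quantities.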

I hope this clarifies.