I was going through the LSTM backward pass, which says
da_{prev} = W_f^T d\gamma_f^{\langle t \rangle} + W_u^T d\gamma_u^{\langle t \rangle}+ W_c^T dp\widetilde c^{\langle t \rangle} + W_o^T d\gamma_o^{\langle t \rangle} \tag{19}
dx^{\langle t \rangle} = W_f^T d\gamma_f^{\langle t \rangle} + W_u^T d\gamma_u^{\langle t \rangle}+ W_c^T dp\widetilde c^{\langle t \rangle} + W_o^T d\gamma_o^{\langle t \rangle}\tag{21}
When defining the parameters, we set `Wf = np.random.randn(5,8)`, and `dft` has a shape of (5,10). Given this, how are you supposed to end up with a `dxt` of shape (3,10)? For more context, `Wf.T` has shape (8,5); multiplying it by a matrix of shape (5,10) gives an output of shape (8,10), which is what I’m getting. What am I missing here?
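
For reference, here is a minimal sketch of the shape arithmetic, under one assumption that is not stated above: that the 8 columns of `Wf` correspond to the previous hidden state stacked on top of the input, i.e. `n_a = 5` hidden units followed by `n_x = 3` input features, with a batch of `m = 10`. If that holds, the `W_f^T` in the `da_prev` equation would mean only the first `n_a` columns of `Wf`, and the one in the `dx` equation only the remaining `n_x` columns. The names `n_a`, `n_x`, `m`, `da_prev_f` and `dxt_f` are illustrative, and only the forget-gate term is shown.

```python
import numpy as np

# Assumed sizes (not given explicitly in the post): 5 hidden units,
# 3 input features, batch of 10, so that 8 = n_a + n_x.
n_a, n_x, m = 5, 3, 10

rng = np.random.default_rng(0)
Wf = rng.standard_normal((n_a, n_a + n_x))  # (5, 8), same shape as in the post
dft = rng.standard_normal((n_a, m))         # (5, 10), same shape as in the post

# Using the whole matrix reproduces the (8, 10) result from the post.
full = Wf.T @ dft

# Assumed split: the first n_a columns act on a_prev, the rest act on x.
da_prev_f = Wf[:, :n_a].T @ dft  # forget-gate term of da_prev, shape (5, 10)
dxt_f = Wf[:, n_a:].T @ dft      # forget-gate term of dxt, shape (3, 10)

print(full.shape, da_prev_f.shape, dxt_f.shape)
```

Under this assumption the (8,10) product is just the two pieces stacked: its first 5 rows are the `da_prev` term and its last 3 rows are the `dxt` term. The other three gate terms would be sliced the same way before summing.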