DLS Course 5: Week 1 Assignment 1

I was going through the LSTM backward pass, which says

da_{prev} = W_f^T d\gamma_f^{\langle t \rangle} + W_u^T d\gamma_u^{\langle t \rangle}+ W_c^T dp\widetilde c^{\langle t \rangle} + W_o^T d\gamma_o^{\langle t \rangle} \tag{19}

dx^{\langle t \rangle} = W_f^T d\gamma_f^{\langle t \rangle} + W_u^T d\gamma_u^{\langle t \rangle}+ W_c^T dp\widetilde c^{\langle t \rangle} + W_o^T d\gamma_o^{\langle t \rangle}\tag{21}

When defining the parameters, we say

Wf = np.random.randn(5,8)

Here dft has shape (5,10). Given this, how are you supposed to end up with a dxt of shape (3,10)? For more context, Wf.T has shape (8,5) and is multiplied with a matrix of shape (5,10), so the output matrix has shape (8,10), which is what I’m getting. What am I missing here?

See this hint:

Here, to account for concatenation, the weights for equation 19 are the first n_a columns (i.e. W_f = W_f[:,:n_a] etc…)
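To make the shapes concrete, here is a minimal sketch of what the hint implies, using only the forget-gate term. It assumes n_a = 5, n_x = 3 and m = 10 (inferred from the shapes quoted in the question), uses random stand-ins for the real gradients, and the variable names `full`, `da_prev_f` and `dxt_f` are made up for illustration. It also assumes the remaining columns, `Wf[:, n_a:]`, are the ones that belong to x, which follows from the concatenation order [a_prev; x_t].

```python
import numpy as np

# Sizes inferred from the shapes in the question (assumption).
n_a, n_x, m = 5, 3, 10

rng = np.random.default_rng(0)
Wf = rng.standard_normal((n_a, n_a + n_x))  # (5, 8): acts on the stacked [a_prev; x_t]
dft = rng.standard_normal((n_a, m))         # (5, 10): stand-in for the forget-gate gradient

# Using the full matrix gives the gradient w.r.t. the whole concatenated input.
full = Wf.T @ dft                 # (8, 10)

# Slicing the columns splits that into the a_prev part and the x part.
da_prev_f = Wf[:, :n_a].T @ dft   # (5, 10): first n_a columns
dxt_f = Wf[:, n_a:].T @ dft       # (3, 10): remaining n_x columns

print(full.shape, da_prev_f.shape, dxt_f.shape)
```

The (8,10) result is the two pieces stacked on top of each other: its first 5 rows are the contribution to da_prev and its last 3 rows are the contribution to dxt. The same slicing would apply to the update, candidate and output terms before summing.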


Yep, I should read a bit more carefully.