Week 1 Assignment 1 - what are these weights in an RNN?

I just implemented the RNN from scratch in the assignment.
But I am still not able to understand where these weights are located and how they are used. I really want to visualize an RNN like a feed-forward neural network and see these weights.
The assignment does not clearly discuss what weights are inside these matrices:

def rnn_cell_forward(xt, a_prev, parameters):
“”"
Implements a single forward step of the RNN-cell as described in Figure (2)

Arguments:
xt -- your input data at timestep "t", numpy array of shape (n_x, m)

a_prev -- Hidden state at timestep "t-1", numpy array of shape (n_a, m)

parameters -- python dictionary containing:
                    
                    Wax -- Weight matrix multiplying the input, 
                            numpy array of shape (n_a, n_x)
                    
                    Waa -- Weight matrix multiplying the hidden state, 
                            numpy array of shape (n_a, n_a)
                    
                    Wya -- Weight matrix relating the hidden-state to the output, 
                            numpy array of shape (n_y, n_a)
                    
                    ba --  Bias, numpy array of shape (n_a, 1)
                    
                    by -- Bias relating the hidden-state to the output, 
                            numpy array of shape (n_y, 1)
"""

Can anyone please help me visualize where these weights are and how they are used, if we draw the network in a feed-forward fashion like we did in Course 1?

Hi there @deepakjangra

The shapes of the parameters are given, so I will explain what each one does:

Input Weights (W_{ax}) → These weights connect the input x^{(t)} at time step t to the hidden state a^{(t)}.

Recurrent Weights (W_{aa}) → These weights connect the previous hidden state a^{(t-1)} to the current hidden state a^{(t)}.

Output Weights (W_{ya}) → These weights connect the hidden state a^{(t)} to the output \hat{y}^{(t)}.

Bias Term I (b_a) → This bias is added when computing the hidden state.

Bias Term II (b_y) → This bias is added when computing the output.
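
Putting these together, one forward step of the cell (Figure 2 in the notebook) computes:

a^{(t)} = \tanh(W_{ax} x^{(t)} + W_{aa} a^{(t-1)} + b_a)

\hat{y}^{(t)} = \mathrm{softmax}(W_{ya} a^{(t)} + b_y)

So every weight matrix is used exactly once per time step, just like a layer in a feed-forward network; the only difference is that W_{aa} feeds the previous hidden state back in.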

Hope this helps, feel free to ask if you need further assistance!


In addition to Alireza’s explanation, it might help to look at the diagrams that are also included in the “Step by Step” notebook, e.g. this one, which shows a single instance of the RNN cell:

[image: diagram of a single RNN cell from the “Step by Step” notebook]

Please map Alireza’s explanation onto that picture and it should all make sense.
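
If it also helps to see the shapes numerically, here is a minimal NumPy sketch of a single forward step. To be clear, the toy dimensions, the random initialization, and the softmax helper are my own choices for illustration, not the assignment code:

import numpy as np

def softmax(z):
    # Column-wise softmax, shifted for numerical stability
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

# Toy sizes: n_x input features, n_a hidden units, n_y output classes, m examples
n_x, n_a, n_y, m = 3, 5, 2, 10
rng = np.random.default_rng(0)

parameters = {
    "Wax": rng.standard_normal((n_a, n_x)),  # input -> hidden
    "Waa": rng.standard_normal((n_a, n_a)),  # hidden -> hidden (recurrent)
    "Wya": rng.standard_normal((n_y, n_a)),  # hidden -> output
    "ba":  np.zeros((n_a, 1)),               # hidden-state bias
    "by":  np.zeros((n_y, 1)),               # output bias
}

xt = rng.standard_normal((n_x, m))  # input at time step t
a_prev = np.zeros((n_a, m))         # hidden state from time step t-1

# One forward step: each weight matrix is applied exactly once
a_next = np.tanh(parameters["Wax"] @ xt + parameters["Waa"] @ a_prev + parameters["ba"])
yt_pred = softmax(parameters["Wya"] @ a_next + parameters["by"])

print(a_next.shape)   # (5, 10), i.e. (n_a, m)
print(yt_pred.shape)  # (2, 10), i.e. (n_y, m)

The two prints confirm the shapes listed in the docstring, which is a quick way to convince yourself where each weight matrix fits.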
