Understanding an RNN


In a feed-forward neural network, we can have l layers with n neurons in each layer. Suppose an RNN has 2 layers with 10 neurons in each layer, and the longest sentence is 10 words long. How are the 10 time steps fed to each neuron in the hidden layer?


The architecture of an RNN is fundamentally different from a feed-forward net or a ConvNet. There is only one “cell”, and that same cell, with the same weights, is used to process every timestep. The cell has what is called the “hidden state”, which holds everything it needs to keep track of in order to make whatever predictions you need the system to make. You can think of the elements of that hidden state as the “neurons”, by analogy to the other types of networks. At each timestep, the cell gets two inputs: the new word or character for that timestep, and the value of the hidden state that was output by the previous timestep. So the 10 words are not fed to the network all at once: they are fed one after another into the same cell, and the hidden state carries information forward from one word to the next.
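To make that concrete, here is a minimal sketch of a single-layer RNN in plain Python. It is only an illustration, not the course's implementation: the names `rnn_cell`, `rnn_forward` and `rand_matrix`, the tanh activation, the input size of 5, and the random numbers standing in for word vectors are all assumptions of mine. The hidden size of 10 and the 10 timesteps match the numbers in the question.

```python
import math
import random

def rnn_cell(x_t, h_prev, Wx, Wh, b):
    """One timestep: combine the current input with the previous hidden state."""
    h_new = []
    for i in range(len(b)):
        total = b[i]
        total += sum(w * x for w, x in zip(Wx[i], x_t))     # contribution of the input
        total += sum(w * h for w, h in zip(Wh[i], h_prev))  # contribution of the old state
        h_new.append(math.tanh(total))
    return h_new

def rnn_forward(x_seq, Wx, Wh, b):
    """Apply the same cell (same Wx, Wh, b) to every timestep in order."""
    h = [0.0] * len(b)            # initial hidden state (assumed to be zeros)
    states = []
    for x_t in x_seq:             # one word vector per timestep
        h = rnn_cell(x_t, h, Wx, Wh, b)
        states.append(h)
    return states

def rand_matrix(rows, cols):
    """Hypothetical helper: small random values for weights and fake inputs."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
n_steps, n_input, n_hidden = 10, 5, 10    # 10 words, 10 hidden units; input size is assumed
Wx = rand_matrix(n_hidden, n_input)       # input-to-hidden weights
Wh = rand_matrix(n_hidden, n_hidden)      # hidden-to-hidden weights
b = [0.0] * n_hidden
x_seq = rand_matrix(n_steps, n_input)     # stand-in for 10 word vectors

states = rnn_forward(x_seq, Wx, Wh, b)
print(len(states), len(states[0]))        # 10 10
```

Notice that the weights are created once, outside the loop, and the loop just calls the same cell 10 times. The result is one hidden state vector of 10 values for each of the 10 words.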

Prof Ng describes all this in great detail in the lectures. There are also lots of different types of sequential input and output that an RNN can be set up to handle.

You will also learn about ways to make the RNN cell more complex by adding GRU logic or LSTM logic. Prof Ng will describe all that in Week 1 of Course 5.