Deep RNN: addition of DNNs for the outputs


In the first week, in the last video, called “Deep RNNs”:

Instead of producing the output y<t> directly, it is possible to add deep neural network layers that are not connected horizontally and that lead to the prediction of y<t> (see in blue what the professor added). Here t is the time step.
What are these non-horizontal DNNs? MLPs?
[Image: Screenshot from 2023-09-10 00-54-39]

thank you very much

You’ve asked a sequence-related question under the convolutional networks category. Please do the following:

  1. Move this topic to the correct subcategory.
  2. Provide details on which (lecture & timestamp) / (assignment & cell) this question is related to.
  3. Rephrase the question since this is confusing:

instead of the output y < t > it is possible to add non horizontal deep neural networks before that make the prediction of each y< t >

Here’s the community user guide to get started.


Thank you for doing this.

To answer your question, please remember that there are 2 inputs to an RNN:

  1. Hidden state: The initial hidden state is set to 0. After each timestep, the hidden state that the RNN outputs at the current timestep is passed to the next timestep, so you have no control over it.
  2. Input: This is provided by you to the RNN. The input features for each timestep are your decision to make. As long as future unknowns (e.g. y^{<t+1>}) aren’t part of the input features for the current timestep, you are good to go.
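
The two inputs listed above can be sketched with a plain tanh RNN in NumPy. This is only an illustration under assumptions: the names `rnn_forward`, `W_xh`, `W_hh`, `b_h` and all the sizes are made up for the example and do not come from the course code.

```python
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, b_h):
    """Run a plain tanh RNN over a sequence and return every hidden state."""
    h = np.zeros(W_hh.shape[0])  # input 1: hidden state, initialised to 0
    states = []
    for x_t in x_seq:            # input 2: the features you provide at each timestep
        # The hidden state produced here is what the next timestep receives.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
T, n_x, n_h = 5, 3, 4            # hypothetical sequence length and layer sizes
x_seq = rng.normal(size=(T, n_x))
W_xh = rng.normal(size=(n_h, n_x)) * 0.1
W_hh = rng.normal(size=(n_h, n_h)) * 0.1
b_h = np.zeros(n_h)

states = rnn_forward(x_seq, W_xh, W_hh, b_h)
print(states.shape)  # (5, 4)
```

Because the initial hidden state is 0, the first hidden state depends only on the first input; from the second timestep on, each state also depends on the state before it.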

thank you.
Unfortunately, I do not understand how this answers my question. Prof. Ng says: “have a bunch of deep layers that are not connected horizontally but have a deep network here that then finally predicts y<1>. And you can have the same deep network here that predicts y<2>”.
These words are represented in blue in the attached picture.
What are these DNNs added at the end, before making the final predictions y<t>? Are they MLPs?
Sorry for the confusion, if there is any here.

Sorry I misunderstood your question.
The layers after the last RNN are the typical dense layers for making predictions.
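
As a rough sketch of what "dense layers after the last RNN" can look like, the NumPy snippet below applies the same small dense network to the top recurrent activation at every timestep. Everything here is an assumption for illustration: `a_top` is a random stand-in for the output of the last recurrent layer, and `dense_head`, the weight names and the sizes are hypothetical.

```python
import numpy as np

def dense_head(a_t, W1, b1, W2, b2):
    """Non-recurrent layers: only the activation at this one timestep is used."""
    hidden = np.maximum(0.0, W1 @ a_t + b1)  # dense layer with ReLU
    return W2 @ hidden + b2                  # output scores for y at this timestep

rng = np.random.default_rng(0)
T, n_a, n_d, n_y = 4, 5, 6, 2                # hypothetical sizes

# Stand-in for the activations of the last recurrent layer, one row per timestep.
a_top = rng.normal(size=(T, n_a))

# One set of dense weights, shared by all timesteps.
W1, b1 = rng.normal(size=(n_d, n_a)), np.zeros(n_d)
W2, b2 = rng.normal(size=(n_y, n_d)), np.zeros(n_y)

# No information flows between timesteps inside the head.
y_scores = np.stack([dense_head(a_t, W1, b1, W2, b2) for a_t in a_top])
print(y_scores.shape)  # (4, 2)
```

In this sketch the head is an ordinary feed-forward stack, which is what is usually meant by an MLP; the only recurrent part of the model is the stack of layers that produced `a_top`.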

Didn’t you ask for this as well?

Hello Naini,

As you know, in an RNN the input at each timestep passes through the network to produce the output y^{<t>}.

In a deep RNN, several recurrent layers are stacked. Each layer has its own hidden units at every timestep, and the activations pass upward from layer to layer until they produce the output y^{<t>}.

The value of a hidden unit is computed by applying the activation function to two inputs: the activation coming from the left (the same layer at the previous timestep) and the activation coming from below (the previous layer at the same timestep).
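
Written as a formula, this is roughly the following, where the layer index [l], the weights W_a^{[l]}, the bias b_a^{[l]} and the activation function g are notation assumed here rather than taken from this thread:

$$
a^{[l]<t>} = g\left( W_a^{[l]} \left[ a^{[l]<t-1>},\; a^{[l-1]<t>} \right] + b_a^{[l]} \right)
$$

Here a^{[l]<t-1>} is the input from the left, a^{[l-1]<t>} is the input from below, and for the first layer the input from below is the input x^{<t>} itself.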

So for an RNN, having 3 stacked layers is already a lot because of the temporal dimension: each layer is also unrolled across all the timesteps.

That is why Prof. Ng mentions that you usually don’t see recurrent layers stacked up to 100 deep.

What you do see is a few stacked recurrent layers followed by deep layers that are not connected horizontally. That sentence means the following:

At a given timestep, the activation passes through these additional layers, but they are not connected to the same layers at the adjacent timesteps.

So the activation at a particular timestep goes up through these layers, with no link to the previous or next timestep, and leads to the output y^{<t>}.

For example, the output y^{<1>} comes from these layers applied at timestep 1.

Likewise, the output y^{<2>} comes from the same layers applied at timestep 2, and so on for every timestep. None of these layers have horizontal (next-timestep) connections.

Basically, Prof. Ng is explaining different types of RNNs based on the computation.

Hope this helps!

Feel free to ask if you still have any doubts.


fine, thank you very much for your response, all good now :slight_smile:

OK, thank you.