C5W1_About the standard RNNs

In Course 5 Week 1, Andrew introduced standard RNNs before introducing GRUs and LSTMs. However, as I studied online, there are two types of standard RNNs: Elman networks and Jordan networks. I think the standard RNN expressions in this course are the same as those of the Elman network. Is the standard RNN here actually an Elman network (although Andrew didn’t explicitly state it)? Thank you.
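
For reference, here are the two recurrences as I understand them from Wikipedia (notation varies across sources; the key difference is which signal is fed back):

```latex
% Elman network: the previous hidden state h_{t-1} is fed back.
h_t = \sigma_h\big(W_h x_t + U_h h_{t-1} + b_h\big), \qquad
y_t = \sigma_y\big(W_y h_t + b_y\big)

% Jordan network: the previous output y_{t-1} is fed back instead.
h_t = \sigma_h\big(W_h x_t + U_h y_{t-1} + b_h\big), \qquad
y_t = \sigma_y\big(W_y h_t + b_y\big)
```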

Hey @ZikunTan,
I had never heard the terms Elman and Jordan with respect to RNNs, so thanks for that. To help answer this question, I checked out the Wikipedia page on RNNs, which you can find here.

In the “Architectures” section, you will find that fully recurrent networks, LSTMs, and GRUs are listed as architectures separate from Elman networks and Jordan networks. I suppose Prof Andrew discussed only the former in the course and left out the latter; i.e., the standard RNNs taught by Prof are fully recurrent neural networks and not simple recurrent networks (another name for Elman and Jordan networks).

As for the conditions under which a fully recurrent network reduces to a simple recurrent network, you may want to search the web. Let me know if this helps.

Cheers,
Elemento

Hey @ZikunTan,
@anon57530071 pointed out some possible discrepancies in my answer. Please give me some time to review and update it; I will let you know once I am done.

Cheers,
Elemento

Hello Elemento:

I also got the concepts of the Elman network and Jordan network from the Wikipedia page, but my understanding is different: I think both Elman and Jordan networks are types of standard RNN. What Andrew introduced in the lecture is actually the Elman network, though he didn’t state it explicitly. Much of the literature doesn’t mention the terms ‘Elman’ and ‘Jordan’ and just says ‘standard RNN’. I am not fully sure whether my understanding is correct, but it makes sense to me. Anyway, any different understandings are welcome, thanks.

Also, I compared the Elman network expressions on Wikipedia with the standard RNN expressions that Andrew gave in the lecture and found they are exactly the same, which is why I think the standard RNN given in this course is just the Elman network.
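
To make the comparison concrete, here is what I mean (writing the course equations from memory, so the symbols may differ slightly from the slides):

```latex
% Standard RNN in the course notation:
a^{\langle t \rangle} = g\big(W_{aa}\, a^{\langle t-1 \rangle} + W_{ax}\, x^{\langle t \rangle} + b_a\big), \qquad
\hat{y}^{\langle t \rangle} = g\big(W_{ya}\, a^{\langle t \rangle} + b_y\big)

% Elman network in Wikipedia's notation:
h_t = \sigma_h\big(U_h h_{t-1} + W_h x_t + b_h\big), \qquad
y_t = \sigma_y\big(W_y h_t + b_y\big)

% Renaming a <-> h, W_aa <-> U_h, W_ax <-> W_h, W_ya <-> W_y
% shows the two sets of equations are identical.
```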

Hey @ZikunTan,
The majority of resources that I am finding online seem to be about LSTMs and GRUs, and the few that discuss the “RNN” use the terms “simple”, “standard”, and “fully” interchangeably.

Here, by “standard”, I am assuming you are referring to “fully recurrent”. If so, then yes indeed, as is also highlighted on the Wikipedia page:

Fully recurrent neural networks (FRNN) connect the outputs of all neurons to the inputs of all neurons. This is the most general neural network topology because all other topologies can be represented by setting some connection weights to zero to simulate the lack of connections between those neurons.
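
To make the “zeroing weights” idea from that quote concrete, here is a minimal NumPy sketch of my own (a toy example, not code from the course): masking entries of the recurrent weight matrix removes the corresponding neuron-to-neuron connections, so restricted topologies are special cases of the fully recurrent one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4

# Fully recurrent step: every hidden unit may feed every hidden unit.
W_ax = rng.standard_normal((n_hidden, n_in))      # input -> hidden weights
W_aa = rng.standard_normal((n_hidden, n_hidden))  # hidden -> hidden weights (all connections)
b_a = np.zeros(n_hidden)

def rnn_step(x, a_prev, W_rec):
    # One step of the Elman-style recurrence, with W_rec as the recurrent weight matrix.
    return np.tanh(W_ax @ x + W_rec @ a_prev + b_a)

x = rng.standard_normal(n_in)
a_prev = np.zeros(n_hidden)

# Zeroing selected recurrent weights simulates the absence of those connections,
# e.g. a diagonal mask keeps only each unit's self-recurrence:
mask = np.eye(n_hidden)
a_full = rnn_step(x, a_prev, W_aa)
a_restricted = rnn_step(x, a_prev, W_aa * mask)
print(a_full)
print(a_restricted)
```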

Once again, I completely agree with your point as well. Also, if you take a look at Figure 3 of this blog, you will find that it exactly resembles the RNN that Prof Andrew introduces at the beginning of the course. I am ignoring the Wikipedia diagram here, since it looked a bit confusing to me.

So, in my opinion, you are correct that Prof Andrew teaches the Elman network without explicitly stating it.

However, I guess both of us will agree that it’s better Prof Andrew didn’t mention this explicitly, since it would lead to unnecessary confusion for learners who are just starting out in the world of sequence models.

I hope this helps.

Cheers,
Elemento

Hello Elemento, many thanks for your patient answers, which help me a lot.

I completely agree with your answer. Yes, what Andrew taught at the beginning of the lecture is actually the so-called ‘Elman network’, without him stating it. In many materials, ‘fully’, ‘standard’, and ‘traditional’ are used interchangeably, but they in fact point to the same thing. It is unnecessary for this course to provide all the details, since it is an introductory course.

Yours,
Zikun Tan