Week 2 Emojifier-V2 Model Architecture

There is no explanation in the assignment for why we are using this particular architecture for the project. In particular, why do we need two LSTM layers connected by a dropout layer? Why would a single LSTM layer not be enough? I cannot figure this out from the lecture or the assignment. Does anyone here know?

Hello @Looja_Tuladhar,
Welcome to the Discourse community. Thanks a lot for posting your question here. I will do my best to answer it.

There are really two separate reasons at work here. Stacking two LSTM layers gives the model more representational capacity: the first LSTM returns the full sequence of hidden states (one per word), and the second LSTM processes that sequence into a single summary vector, so the network can learn more complex sentence-level patterns than a single layer could capture. The dropout layer placed between them is a regularization method: during training it randomly zeroes a fraction of the activations, which prevents the network from over-relying on any specific neuron and so reduces overfitting and improves generalization.
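To make the regularization part concrete, here is a minimal NumPy sketch of "inverted dropout" (the standard variant used by Keras's Dropout layer): activations are zeroed at random and the survivors are rescaled so the expected value is unchanged. The function name and the toy activation matrix are just for illustration, not part of the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

def inverted_dropout(activations, keep_prob):
    """Zero each activation with probability (1 - keep_prob), then
    rescale the survivors by 1/keep_prob so the expected value of the
    layer's output is unchanged (inverted dropout)."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

a = np.ones((4, 8))                       # toy activations from an LSTM layer
dropped = inverted_dropout(a, keep_prob=0.5)
# Roughly half the entries are now 0.0; the rest were rescaled to 2.0
print(dropped)
```

At test time dropout is simply turned off, so no rescaling is needed then; that is exactly what Keras does for you automatically.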

In the last part of the assignment, it is mentioned that Dropout() is used to regularize the LSTM() model. Regularization is a technique used in neural networks to reduce overfitting and improve generalization, and that is why Dropout() appears between the two LSTM layers.
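For reference, a minimal sketch of this kind of stack in Keras looks like the following. The specific sizes here (sequence length, vocabulary size, embedding dimension, 128 LSTM units, 0.5 dropout rate, 5 output classes) are assumptions for illustration; in the actual assignment the embedding comes from pretrained GloVe vectors and the sizes come from the data.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical sizes, for illustration only
max_len, vocab_size, emb_dim = 10, 400, 50

model = keras.Sequential([
    layers.Embedding(vocab_size, emb_dim),
    layers.LSTM(128, return_sequences=True),  # one hidden state per word
    layers.Dropout(0.5),                      # regularize between the layers
    layers.LSTM(128),                         # a single summary vector
    layers.Dropout(0.5),
    layers.Dense(5, activation="softmax"),    # one probability per emoji class
])

x = np.zeros((2, max_len), dtype="int32")    # two dummy (padded) sentences
print(model(x).shape)                         # (2, 5)
```

Note the `return_sequences=True` on the first LSTM: it must hand the second LSTM a full sequence, while the second LSTM emits only its final hidden state for the classifier.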


Above is a screenshot of the last part of the assignment.
Please do not hesitate to send a follow-up question. I hope I was able to answer your question.
Best,
Can