In the Attention Is All You Need lesson, the use of the feed-forward network in the encoder and decoder modules is not quite clear. It would be helpful if someone could explain it clearly.

If you take the NLP Specialization, it is explained in detail!
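In the meantime, the basic idea: after the attention sublayer mixes information across positions, each position is passed independently through the same small two-layer network, FFN(x) = max(0, xW1 + b1)W2 + b2. Below is a minimal PyTorch sketch of that position-wise FFN; the formula and the default sizes (d_model=512, d_ff=2048) come from the paper, but the module itself is just an illustration, not code from the course:

```python
import torch
import torch.nn as nn

class PositionWiseFFN(nn.Module):
    """FFN(x) = max(0, x W1 + b1) W2 + b2, applied at every position."""
    def __init__(self, d_model=512, d_ff=2048):
        super().__init__()
        self.linear1 = nn.Linear(d_model, d_ff)  # expand to the inner dimension
        self.linear2 = nn.Linear(d_ff, d_model)  # project back to the model dimension
        self.relu = nn.ReLU()

    def forward(self, x):  # x: (batch, seq_len, d_model)
        # The same two layers are applied at each position independently;
        # mixing across positions happens only in the attention sublayer.
        return self.linear2(self.relu(self.linear1(x)))

ffn = PositionWiseFFN()
out = ffn(torch.randn(2, 10, 512))  # output shape: (2, 10, 512)
```

Because nn.Linear acts only on the last dimension, no information flows between positions here; the FFN's job is to transform each token's representation on its own after attention has gathered context.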


I browsed through Course 4, "Natural Language Processing with Attention Models", which seemed the most likely to tackle this question, but based on the video titles I am not sure it is discussed there. Could you provide the name of the course where the role of the FFN in the Transformer is discussed?

It should be the course with Attention Models; I think it's the last course in the specialization!