I’m trying to write a school essay about MLSA, which is used in the DeepSeek R1 model. I have a basic understanding of transformers, but I cannot follow the research papers when I read them.
Is there a course on Coursera that covers MLSA, MSA in general, or the transformer architecture? I think I need to be better prepared.
Thanks for your time!
The Deep Learning Specialization and the Natural Language Processing Specialization from DeepLearning.AI both explain the transformer architecture.
Also, there is a free Short Course that discusses how Transformers work.
Thanks for the suggestions! I’ll look them up.
Thanks for the link!