Want to understand multi-head latent self-attention (MLSA) in transformers, which course to pick?

Lee_AI · March 10, 2025, 10:37am

I’m trying to write a school essay about MLSA, which is used in the DeepSeek-R1 model. I have basic knowledge of transformers, but when I look at the research papers, I can’t understand them.
Is there a course on Coursera that covers MLSA, general MSA, or the transformer architecture? I think I need to be more prepared.
Thanks for your time!

gent.spah · March 10, 2025, 2:44pm

The Deep Learning Specialization and the Natural Language Processing Specialization from DeepLearning.AI both explain the transformer architecture.

TMosh · March 10, 2025, 3:16pm

Also, there is a free Short Course that discusses how Transformers work.

Lee_AI · March 10, 2025, 3:57pm

Thanks for the suggestions! I’ll look them up.

Lee_AI · March 10, 2025, 3:57pm

Thanks for the link!
