DeepLearning.AI
C5W4 Transformer multi-head weight matrices
Course Q&A
Deep Learning Specialization
Sequence Models
coursera-platform
anon57530071
June 30, 2022, 8:33am
Please see this thread. Andrew’s intuition is sometimes incorrect from a mathematical point of view.