Course 5: Week 4 - Transformer DecoderLayer

# BLOCK 1
        # calculate masked self-attention on x (using the look-ahead mask) and return attention scores as attn_weights_block1 (~1 line)
        attn1, attn_weights_block1 = self.mha1(x, x, x, look_ahead_mask, return_attention_scores=True)
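        # Hedged sketch of the rest of BLOCK 1 (dropout1 / layernorm1 and the training
        # flag are assumed names, following the standard decoder-layer recipe): apply
        # dropout to the attention output, then add the residual connection and layer
        # normalization to produce out1, which BLOCK 2 uses as its query.
        attn1 = self.dropout1(attn1, training=training)  # dropout on the self-attention output
        out1 = self.layernorm1(attn1 + x)  # residual connection + layer norm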

# BLOCK 2
        # calculate cross-attention (encoder-decoder attention) using the Q from the output of the
        # first block and K and V from the encoder output; here the arguments are passed as
        # (query=out1, value=enc_output, key=enc_output, attention_mask=padding_mask).
        # Return attention scores as attn_weights_block2 (~1 line)
        attn2, attn_weights_block2 = self.mha2(out1, enc_output, enc_output, padding_mask, return_attention_scores=True)
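        # Hedged sketch of the rest of BLOCK 2 (dropout2 / layernorm2 are assumed names):
        # dropout, then residual connection with out1 and layer normalization, giving out2,
        # which the feed-forward sublayer (BLOCK 3) would consume.
        attn2 = self.dropout2(attn2, training=training)  # dropout on the cross-attention output
        out2 = self.layernorm2(attn2 + out1)  # residual connection + layer norm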