C4W1_Assignment_Exercise 3 - Decoder

Hello! I suppose that “logits” is somehow related to the “softmax layer”, but I need more information.

Hi @Maria_Petrovskaya

What do you mean?

In this exercise you’re asked to use the tf.nn.log_softmax activation for the last layer to get “log probabilities”. In other words, NOT the softmax activation (which results in values in [0.0…1.0] that imitate “probabilities”).

Is this the relationship you’re asking about?
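For intuition, here is a small framework-free NumPy sketch (not the assignment’s TF code) of the relationship: log_softmax is just the log of softmax, usually computed in the stable form x - logsumexp(x):

```python
import numpy as np

def softmax(x):
    # Shift by the max for numerical stability; result sums to 1.
    e = np.exp(x - x.max())
    return e / e.sum()

def log_softmax(x):
    # Equivalent to np.log(softmax(x)), but computed stably.
    x = x - x.max()
    return x - np.log(np.exp(x).sum())

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)          # values in [0, 1] that sum to 1
log_probs = log_softmax(logits)  # non-positive values; exp(log_probs) == probs
```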

Regards


Thank you for the reply!
In “def __init__” … activation=tf.nn.log_softmax(vocab_size - ?) (logits as an argument in the library)
In “call” … # Compute the logits
logits = context - self.output_layer(context - ?)
I don’t understand how to connect this part…

It’s just activation=tf.nn.log_softmax (no need to call it with any parameters).

To get the logits (a more correct variable name would have been log_probs), you just call self.output_layer(x) (no need for the context - the only place it is used in the Decoder is in cross attention).
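In Keras terms, that means something like Dense(units=vocab_size, activation=tf.nn.log_softmax): you pass the function object itself and never call it. As a framework-free NumPy sketch of what such an output layer computes (names like output_layer are illustrative here, not the assignment’s exact code):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden = 5, 8

# A Dense layer is just x @ W + b; the activation is applied to the result.
W = rng.normal(size=(hidden, vocab_size))
b = np.zeros(vocab_size)

def output_layer(x):
    z = x @ W + b                                   # raw scores over the vocabulary
    # log_softmax applied row-wise, stable form: z - logsumexp(z)
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

x = rng.normal(size=(3, hidden))                    # e.g. 3 decoder time steps
log_probs = output_layer(x)                         # shape (3, vocab_size)
```

Exponentiating the rows recovers proper probability distributions, which is why calling the result “logits” is a slight misnomer.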


With these parameters I got an error… Where might the problem be?
AttributeError: Exception encountered when calling layer ‘decoder_15’ (type Decoder).

‘Decoder’ object has no attribute ‘LSTM’

Call arguments received by layer ‘decoder_15’ (type Decoder):
• context=tf.Tensor(shape=(64, 14, 256), dtype=float32)
• target=tf.Tensor(shape=(64, 15), dtype=int64)
• state=None
• return_state=False

I guess somewhere in the code you used self.LSTM(..) or similar? The decoder does not have this attribute. According to the skeleton code you were provided, it should only have these class attributes:

  • embedding,
  • pre_attention_rnn,
  • attention (where you should use the context),
  • post_attention_rnn (this is your LSTM),
  • and output_layer.

And you only need these in the call(..)
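For intuition only, here is a framework-free sketch of the data flow implied by those attributes, with NumPy stubs standing in for the real Keras layers (all names, shapes, and the vocab size are illustrative):

```python
import numpy as np

batch, src_len, tgt_len, units = 2, 4, 3, 6
rng = np.random.default_rng(1)

# Stubs standing in for the real layers - only the data flow matters here.
embedding = lambda ids: rng.normal(size=ids.shape + (units,))
pre_attention_rnn = lambda x: rng.normal(size=x.shape)          # keeps (batch, tgt_len, units)
attention = lambda context, x: x + context.mean(axis=1, keepdims=True)
post_attention_rnn = lambda x: rng.normal(size=x.shape)
output_layer = lambda x: rng.normal(size=x.shape[:-1] + (10,))  # 10 = toy vocab size

context = rng.normal(size=(batch, src_len, units))   # encoder output
target = rng.integers(0, 10, size=(batch, tgt_len))  # token ids

x = embedding(target)                 # embed the *target*, not the context
x = pre_attention_rnn(x)
x = attention(context, x)             # cross attention is where the context is used
x = post_attention_rnn(x)
logits = output_layer(x)              # shape (batch, tgt_len, vocab_size)
```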


Thank you!
And what is expected as the "initial_state" parameter (a vector? a bool?) in “self.pre_attention_rnn(x, initial_state=”?

It defaults to None and you don’t need it in your case. In other words, you only need x for “pre_attention_rnn”.


Thank you!
After all the corrections a ValueError still occurs… Where are ‘decoder_25’ and “lstm_51” - inside the library?

"—> 65 x, hidden_state, cell_state = self.pre_attention_rnn(x, initial_state=None)
66
67 # Perform cross attention between the context and the output of the LSTM (in that order)

ValueError: Exception encountered when calling layer ‘decoder_25’ (type Decoder).

Input 0 of layer “lstm_51” is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (64, 14, 256, 256)

Call arguments received by layer ‘decoder_25’ (type Decoder):
• context=tf.Tensor(shape=(64, 14, 256), dtype=float32)
• target=tf.Tensor(shape=(64, 15), dtype=int64)
• state=None
• return_state=False"

You are probably embedding the context instead of the target (which is what you should embed in the decoder’s case) in the first line of code (“# Get the embedding of the input”).
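The shape arithmetic behind that ndim=4 error: an embedding lookup appends one embedding axis to whatever index shape it is given. A NumPy illustration (tiny illustrative shapes rather than the assignment’s 64/256):

```python
import numpy as np

vocab, emb_dim = 100, 8
emb_table = np.zeros((vocab, emb_dim))          # each token id maps to an 8-d vector

target = np.zeros((4, 5), dtype=np.int64)       # token ids: the right thing to embed
context = np.zeros((4, 3, emb_dim))             # already-embedded encoder output

ok = emb_table[target]                          # (4, 5, 8): ndim=3, what the LSTM expects
bad = emb_table[context.astype(np.int64)]       # (4, 3, 8, 8): ndim=4, hence the error
```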


Great help! Thank you :smiling_face_with_three_hearts:


Hi! Sorry to hop in here after so many months, but if we can leave initial_state=None, then why do the directions say “# - Pass in the state to the LSTM (needed for inference)”? What is that “state” referring to?

Never mind I think I get it!