Re alphas: I am referring to this part of one_step_attention(): alphas = activator(energies).
Why am I assuming there must be many alphas? Because we later use dotor, which computes a sum (a dot product, in fact) over many a’s and alphas, in accordance with that equation.
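For reference, here is a minimal sketch of how those pieces fit together, assuming the usual globally-defined layers from this assignment. The dimensions Tx, n_a, n_s are made-up example values, and I use Keras’s Softmax(axis=1) in place of the assignment’s custom softmax over the time axis (they normalize the same way):

```python
# Minimal sketch of one_step_attention(); Tx, n_a, n_s are assumed example values.
from tensorflow.keras.layers import RepeatVector, Concatenate, Dense, Softmax, Dot

Tx, n_a, n_s = 30, 32, 64   # assumed: input length, encoder units, decoder units

repeator = RepeatVector(Tx)                             # copy s_prev across all Tx steps
concatenator = Concatenate(axis=-1)
densor1 = Dense(10, activation="tanh")
densor2 = Dense(1, activation="relu")                   # one "energy" per time step
activator = Softmax(axis=1, name="attention_weights")   # normalize over the Tx axis
dotor = Dot(axes=1)                                     # contracts the Tx axis

def one_step_attention(a, s_prev):
    # a:      (m, Tx, 2*n_a)  encoder activations
    # s_prev: (m, n_s)        previous decoder hidden state
    s_prev = repeator(s_prev)             # (m, Tx, n_s)
    concat = concatenator([a, s_prev])    # (m, Tx, 2*n_a + n_s)
    e = densor1(concat)                   # (m, Tx, 10)
    energies = densor2(e)                 # (m, Tx, 1)
    alphas = activator(energies)          # (m, Tx, 1)  -- one alpha per t'
    context = dotor([alphas, a])          # (m, 1, 2*n_a) = sum over t' of alpha<t'> * a<t'>
    return context
```

So there is indeed one alpha for every input time step t', and dotor collapses the Tx axis by weighting each a<t'> with its alpha.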
The “energies” are just the input to the softmax layer.
The output (the “alphas”) is a probability for each value of ‘t’.
But I understand your question - softmax is typically only used when there are multiple outputs, and you want to re-scale them so they sum to 1 for each example.
Maybe what they’re doing is, since the previous layer uses ReLU activation, they’re using softmax to re-scale those values across the Tx time steps, so each one ends up between 0 and 1 and they sum to 1.
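To make that re-scaling concrete, here is a tiny made-up example: softmax applied across the Tx time steps turns the non-negative, unbounded ReLU energies into weights between 0 and 1 that sum to 1 for each example:

```python
import numpy as np

# Made-up ReLU outputs ("energies") for one example with Tx = 4 time steps.
energies = np.array([2.0, 0.0, 1.0, 3.0])

# Softmax across the time-step axis.
alphas = np.exp(energies) / np.exp(energies).sum()

print(alphas)        # approximately [0.24 0.03 0.09 0.64]
print(alphas.sum())  # 1.0 -- each alpha is in (0, 1) and together they sum to 1
```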
Info for other learners who may struggle with the topic above: look very carefully at the dimensions of each and every layer in the function one_step_attention(). It is very instructive to see how those dimensions change from layer to layer (see the shape trace below).
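One quick way to do that, continuing the sketch above (this reuses its layer definitions and example dimensions): wrap a single attention step in a Model and let summary() list the output shape of every layer:

```python
from tensorflow.keras import Input, Model

# Symbolic inputs with the assumed example dimensions from the sketch above.
a_in = Input(shape=(Tx, 2 * n_a))    # (None, 30, 64)
s_in = Input(shape=(n_s,))           # (None, 64)

Model(inputs=[a_in, s_in], outputs=one_step_attention(a_in, s_in)).summary()
# The summary shows how the shape changes layer by layer, roughly:
#   repeat_vector       (None, 30, 64)
#   concatenate         (None, 30, 128)
#   dense (tanh)        (None, 30, 10)
#   dense (relu)        (None, 30, 1)
#   attention_weights   (None, 30, 1)   <- the alphas
#   dot                 (None, 1, 64)   <- the context vector
```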