C5W3 - Neural machine translation exercise: the Dot layer

I have a small question regarding the Dot layer in Exercise 1 (one_step_attention): why does the order of the two tensors have to be [alphas, a]?

My understanding of this:
a.shape = (10, 30, 64)
alphas.shape = (10, 30, 1)
context.shape = (10, 1, 64)

So I think when we set context = dotor([alphas, a]), the third axis of alphas is broadcast against the 64 values in the third axis of a, the two are multiplied elementwise, and the result is then summed over the second axis. That is why context comes out with shape (10, 1, 64).
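To check the shapes, here is a minimal sketch with dummy random data, assuming dotor is a Keras Dot(axes=1) layer (the batch size 10, Tx = 30, and feature size 64 are just the example numbers above):

```python
import numpy as np
import tensorflow as tf

# dummy tensors with the shapes discussed above
alphas = tf.constant(np.random.rand(10, 30, 1), dtype=tf.float32)   # attention weights
a      = tf.constant(np.random.rand(10, 30, 64), dtype=tf.float32)  # Bi-LSTM activations

# Dot(axes=1) contracts axis 1 (the Tx axis) of both inputs
dotor = tf.keras.layers.Dot(axes=1)

context = dotor([alphas, a])
print(context.shape)          # (10, 1, 64)

swapped = dotor([a, alphas])
print(swapped.shape)          # (10, 64, 1)
```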

If we instead set context = dotor([a, alphas]), the shape of context becomes (10, 64, 1). What exact operations does the Dot layer perform to produce this shape?

Hi realnoob,

Dot calls batch_dot, where you can find the operations performed.
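Roughly, batch_dot with axes=1 contracts the second (Tx) axis of both inputs and keeps the remaining axes in the order the inputs were given, so swapping the inputs just transposes the last two axes of the result. Here is a rough NumPy sketch (my own illustration of the equivalent computation, not the Keras source):

```python
import numpy as np

alphas = np.random.rand(10, 30, 1)
a      = np.random.rand(10, 30, 64)

# dotor([alphas, a]): per example, alphas^T (1, 30) @ a (30, 64) -> (1, 64)
context = np.einsum('bti,btj->bij', alphas, a)   # shape (10, 1, 64)

# dotor([a, alphas]): per example, a^T (64, 30) @ alphas (30, 1) -> (64, 1)
swapped = np.einsum('bti,btj->bij', a, alphas)   # shape (10, 64, 1)

# same numbers either way, just with the last two axes transposed
assert np.allclose(context, np.transpose(swapped, (0, 2, 1)))
```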