Week 3 assignment 1: salient questions about what we're doing

After completing the lab I still have one question about how all of this works:
one_step_attention:
When I see the layers densor1 = Dense(10, activation = "tanh") and
densor2 = Dense(1, activation = "relu"), I was expecting outputs of dimensions (10, 1) and (1, 1), basically matching the number of hidden units. But instead I see (10, 30) and (1, 30) respectively, and I wonder why that is. I'm guessing it comes from the input dimensions, which are (30, 128) from the concatenator and (10, 30) from densor1, but I'm still slightly mystified and wondering where this behavior comes from.
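
For concreteness, here's a minimal sketch that reproduces the shapes I'm asking about (assuming TF2 Keras; the batch size of 5 and the random input are my own stand-ins, not from the assignment):

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense

densor1 = Dense(10, activation="tanh")
densor2 = Dense(1, activation="relu")

# Stand-in for the concatenator output described above: (m, Tx, features) = (5, 30, 128)
concat = tf.random.normal((5, 30, 128))

e = densor1(concat)
print(e.shape)          # (5, 30, 10)
energies = densor2(e)
print(energies.shape)   # (5, 30, 1)
```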


Do you need any help with this issue?

I'm just trying to understand how it works.

As I understand it, concatenation yields (m, 30, 128) → this propagates through densor1 to yield (m, 30, 10) → which then propagates through densor2 to yield (m, 30, 1), i.e. one scalar value for each of the Tx = 30 timesteps: one attention weight per timestep. The key point is that a Dense layer only transforms the last axis of its input; all leading axes (batch and Tx) pass through unchanged, which is why the 30 appears in every output shape.
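
To make that explicit: a Dense layer with k units stores a kernel of shape (input_last_dim, k) and matrix-multiplies it against the last axis only. A rough NumPy equivalent (just a sketch, with random data standing in for the real activations):

```python
import numpy as np

m, Tx, features = 5, 30, 128            # batch, timesteps, concatenated features
X = np.random.randn(m, Tx, features)    # stand-in for the concatenator output

W1 = np.random.randn(features, 10)      # densor1 kernel: (128, 10)
b1 = np.zeros(10)
e = np.tanh(X @ W1 + b1)                # (m, 30, 128) @ (128, 10) -> (m, 30, 10)

W2 = np.random.randn(10, 1)             # densor2 kernel: (10, 1)
b2 = np.zeros(1)
energies = np.maximum(0.0, e @ W2 + b2) # relu -> (m, 30, 1)

print(e.shape, energies.shape)          # (5, 30, 10) (5, 30, 1)
```

So the 10 and the 1 in Dense(10) and Dense(1) only set the size of the last axis; the 30 in the outputs is simply the Tx axis carried through unchanged.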