After completing the lab I still have one question about how `one_step_attention` works. When I see the layers `densor1 = Dense(10, activation = "tanh")` and `densor2 = Dense(1, activation = "relu")`, I was expecting to see outputs of dimensions (10, 1) and (1, 1), basically matching the number of hidden units. But instead I see (10, 30) and (1, 30) respectively. I wonder why this is the case. I'm guessing it comes from the input dimensions, which are (30, 128) from the concatenator and (10, 30) from densor1, but I'm still slightly mystified and wondering where this behavior comes from.
Do you need any help with this issue?
I'm just trying to understand how it works.
As I understand it, concatenation yields (m, 30, 128) → this propagates through `densor1` to yield (m, 30, 10) → which then propagates through `densor2` to yield (m, 30, 1). That means one scalar value for each Tx (= 30), i.e. one weight for each timestep.
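To make that concrete, here is a minimal sketch (assuming TensorFlow/Keras, with Tx = 30 and a concatenation width of 128 as in this thread; the layer names follow the lab, the input is random dummy data) showing that `Dense` acts only on the last axis of a 3-D input:

```python
import numpy as np
from tensorflow.keras.layers import Dense

# Layer definitions as in the lab
densor1 = Dense(10, activation = "tanh")
densor2 = Dense(1, activation = "relu")

# Dummy concatenator output: (m, Tx, 128) = (1, 30, 128)
concat = np.random.randn(1, 30, 128).astype("float32")

e = densor1(concat)          # Dense is applied independently at each timestep
print(e.shape)               # (1, 30, 10)

energies = densor2(e)
print(energies.shape)        # (1, 30, 1) -> one scalar energy per timestep

# The 10 and the 1 live in the layer weights; the 30 comes from the input
print(densor1.kernel.shape)  # (128, 10)
print(densor2.kernel.shape)  # (10, 1)
```

In other words, the layer sizes (10 and 1) determine the kernel shapes, while the 30 in the output is just the Tx axis passing through: for inputs of rank greater than 2, `Dense` is applied along the last axis, which is exactly what produces one attention energy per timestep ahead of the softmax.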