Week 3 assignment 1: salient questions about what we're doing

After completing the lab I still have one question about how all of this works:
one_step_attention:
When I see the layers densor1 = Dense(10, activation = "tanh") and
densor2 = Dense(1, activation = "relu"), I was expecting outputs of dimensions (10, 1) and (1, 1), basically matching the number of hidden units. But instead I see (10, 30) and (1, 30) respectively, and I wonder why that is. I'm guessing it comes from the input dimensions, which are (30, 128) from the concatenator and (10, 30) from densor1, but I'm still slightly mystified and wondering where this behavior comes from.
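
For concreteness, here's a minimal sketch that reproduces the shapes I'm asking about (assuming TF2 Keras; the batch size of 5 and the random input are my own stand-ins, not from the assignment):

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense

densor1 = Dense(10, activation="tanh")
densor2 = Dense(1, activation="relu")

# Stand-in for the concatenator output described above: (m, Tx, features) = (5, 30, 128)
concat = tf.random.normal((5, 30, 128))

e = densor1(concat)
print(e.shape)          # (5, 30, 10)
energies = densor2(e)
print(energies.shape)   # (5, 30, 1)
```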


Do you need any help with this issue?

I'm just trying to understand how it works.

As I understand it, concatenation yields (m, 30, 128) → this propagates through densor1 to yield (m, 30, 10) → which then propagates through densor2 to yield (m, 30, 1), i.e. one scalar value for each of the Tx = 30 timesteps: one attention weight per timestep. The key point is that a Dense layer only transforms the last axis of its input; all leading axes (batch and Tx) pass through unchanged, which is why the 30 appears in every output shape.
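
To make that explicit: a Dense layer with k units stores a kernel of shape (input_last_dim, k) and matrix-multiplies it against the last axis only. A rough NumPy equivalent (just a sketch, with random data standing in for the real activations):

```python
import numpy as np

m, Tx, features = 5, 30, 128            # batch, timesteps, concatenated features
X = np.random.randn(m, Tx, features)    # stand-in for the concatenator output

W1 = np.random.randn(features, 10)      # densor1 kernel: (128, 10)
b1 = np.zeros(10)
e = np.tanh(X @ W1 + b1)                # (m, 30, 128) @ (128, 10) -> (m, 30, 10)

W2 = np.random.randn(10, 1)             # densor2 kernel: (10, 1)
b2 = np.zeros(1)
energies = np.maximum(0.0, e @ W2 + b2) # relu -> (m, 30, 1)

print(e.shape, energies.shape)          # (5, 30, 10) (5, 30, 1)
```

So the 10 and the 1 in Dense(10) and Dense(1) only set the size of the last axis; the 30 in the outputs is simply the Tx axis carried through unchanged.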