Hi everyone,
My implementation is nearly correct; the only mismatch is the shape of dba. Here is my output:
gradients["dxt"][1][2] = -1.3872130506020925
gradients["dxt"].shape = (3, 10)
gradients["da_prev"][2][3] = -0.15239949377395495
gradients["da_prev"].shape = (5, 10)
gradients["dWax"][3][1] = 0.4107728249354584
gradients["dWax"].shape = (5, 3)
gradients["dWaa"][1][2] = 1.1503450668497135
gradients["dWaa"].shape = (5, 5)
gradients["dba"][4] = 0.2002349138798539
gradients["dba"].shape = (5,)
The expected shape for dba is (5, 1).
My computation is dba = dtanh.sum(axis=1).
What should I change to get the correct dimensions?
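
For reference, here is a minimal sketch that reproduces the shape difference outside the assignment. It assumes dtanh has shape (5, 10), matching da_prev above; the random values are placeholders, not my real gradients. As far as I understand NumPy, summing over an axis drops that axis unless keepdims=True is passed, so that looks like the likely fix:

```python
import numpy as np

# Placeholder for dtanh: 5 hidden units, 10 examples (assumed shape).
rng = np.random.default_rng(0)
dtanh = rng.standard_normal((5, 10))

# Summing over axis 1 removes that axis, leaving a 1-D array.
dba_flat = dtanh.sum(axis=1)

# keepdims=True retains the summed axis with length 1.
dba_col = dtanh.sum(axis=1, keepdims=True)

# Reshaping the 1-D result into a column gives the same values.
dba_reshaped = dba_flat.reshape(-1, 1)

print(dba_flat.shape)      # shape of the 1-D result
print(dba_col.shape)       # shape of the column result
print(dba_reshaped.shape)  # shape after the explicit reshape
```

Both the keepdims version and the reshape version hold the same numbers as the 1-D sum; only the shape differs.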