Hi, I am having trouble working out how many units to pass to tl.Dense().
According to the parameter description I should use d_ff, because it is documented as (d_ff (int): depth of feed-forward layer). However, if I use d_ff in both layers, i.e. tl.Dense(d_ff) twice, 8 test cases pass and 2 test cases fail.
But when I use d_ff in the first layer, tl.Dense(d_ff), and d_model in the second layer, tl.Dense(d_model), all test cases pass.
So my question is this: nothing in the instructions says that d_model should be the number of units for the second tl.Dense(), so why does using it pass all test cases, while using d_ff for both Dense layers results in 8 passed and 2 failed test cases?
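
To make the two variants concrete, here is a minimal shape-only sketch in plain Python. It assumes a dense layer changes only the last axis of its input to the number of units. The helpers dense_shape and feed_forward_shape and all the sizes are hypothetical, made up for illustration, and are not Trax code. My guess, which I have not confirmed, is that the failing tests expect the block's output to have the same shape as its input, for example because the output is added back to the input in a residual connection.

```python
# Shape-only sketch: no Trax needed, and none of these helpers are Trax APIs.

def dense_shape(input_shape, n_units):
    """Shape after a dense layer: only the last axis changes, to n_units."""
    return input_shape[:-1] + (n_units,)


def feed_forward_shape(input_shape, units_first, units_second):
    """Shape after two dense layers applied one after the other."""
    hidden = dense_shape(input_shape, units_first)
    return dense_shape(hidden, units_second)


# Hypothetical sizes, chosen only for illustration.
batch, seq_len, d_model, d_ff = 2, 5, 8, 32
x_shape = (batch, seq_len, d_model)

# Variant 1: d_ff units in both dense layers.
both_d_ff = feed_forward_shape(x_shape, d_ff, d_ff)

# Variant 2: d_ff units in the first layer, d_model in the second.
d_ff_then_d_model = feed_forward_shape(x_shape, d_ff, d_model)

print(both_d_ff)                     # (2, 5, 32)
print(d_ff_then_d_model)             # (2, 5, 8)
print(both_d_ff == x_shape)          # False: output no longer matches the input
print(d_ff_then_d_model == x_shape)  # True: output matches the input shape
```

If that assumption holds, only the second variant returns something with the same shape as the input, which would explain why it is the one that passes every test.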