I am not sure if anyone else has run into this problem, but I am having issues with the definition of the MHA function and the values it expects.
This is the definition:
self.mha = MultiHeadAttention(num_heads=num_heads, key_dim=embedding_dim, dropout=dropout_rate)
When I reviewed the documentation of “MultiHeadAttention” it stated that -
Basically, the first two arguments are expected to be integers. I have tried using the shape of X to call the function in many different ways, and I always get the same error.
In addition, after reading some of the threads on this exercise on Discord, the recurring advice is to just use X for the function call, but that just doesn't work for me. Is there anyone who can shed some light on this function call?
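To make the question concrete, here is a minimal standalone sketch of how I understand the two steps from the documentation, assuming TensorFlow 2.x with `tf.keras.layers.MultiHeadAttention`. The sizes and the random tensor `x` are placeholders I made up for illustration, not the values from the exercise. As far as I can tell, the integers belong to the constructor, and the tensor is only passed when the layer is called:

```python
import tensorflow as tf

# Placeholder sizes chosen only for illustration (not from the exercise).
num_heads = 2
embedding_dim = 16
dropout_rate = 0.1

# Constructor: num_heads and key_dim are integers, dropout is a float.
mha = tf.keras.layers.MultiHeadAttention(
    num_heads=num_heads, key_dim=embedding_dim, dropout=dropout_rate
)

# Stand-in for X, with shape (batch_size, sequence_length, embedding_dim).
x = tf.random.uniform((2, 5, embedding_dim))

# Call: the positional order is (query, value, key).
# For self-attention the same tensor is passed for all three.
attn_output = mha(x, x, x)

# The output keeps the shape of the query tensor.
print(attn_output.shape)
```

If this is right, then the shape of X should never be passed to the call itself, only the tensor.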
Thanks