Hi Mentor,
In the encoder layer of this transformer architecture assignment, why is there no weight matrix multiplication such as Q = WX, K = WX? Can you please help me understand why it is not happening?
Already answered in another reply.
Sir, can you please share the answer?
It was my reply to one of your other posts.
This one: