{moderator edit - solution code removed}
I wrote the code above and got the following results.
It seems that something is wrong with my code: the attention weights differ from the expected result, but I can't tell where the mistake is. Please help.