A perspective for C5W4A1 EX4

Hi,

Having gone through this assignment, I’d like to share my perspective in the hope that it helps other students.

When trying to solve this exercise, work through the following things step by step. Firstly,

[diagram from the assignment]

This diagram has been provided. Please go through it carefully; merely by understanding the flow, 50% of the problem is solved already.
The next section you should look at closely is this:

[screenshot of the relevant section]

Then the next:

[screenshot of the following section]
And finally, carefully read the helper comments just before each line of code you need to write.
I feel that if students follow this flow, it’ll be much easier. Part of the problem lies in:

  1. Lack of understanding of the flow, i.e., which part is supposed to go where.
  2. Confusion about how to call the functions.
    One more hint: for the first line, the call goes like this (a standalone sketch follows right after this list):
    self.mha(parameter_1=<input variable>, parameter_2=<input variable>, parameter_3=<input variable>, mask_parameter=<mask variable>)
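To make that shape concrete without spoiling the assignment’s exact variable names, here is a minimal, self-contained sketch using only the public tf.keras.layers.MultiHeadAttention API (which is what self.mha is an instance of in this exercise). The layer sizes, tensor shapes, and the mask below are toy values I made up, not the ones from the notebook:

```python
import tensorflow as tf

# Toy self-attention call: query, value, and key are all the same tensor,
# and the attention mask is passed as a keyword argument.
mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=64)

x = tf.random.uniform((1, 5, 64))   # (batch, seq_len, embedding_dim) -- made-up shape
mask = tf.ones((1, 5, 5))           # toy mask; 1 means "this position may be attended to"

attn_output = mha(query=x, value=x, key=x, attention_mask=mask)
print(attn_output.shape)            # (1, 5, 64)
```

In the Keras docs the positional order is query first, then value, then key, with the mask as a keyword argument; knowing which keyword each argument corresponds to is most of what the hint above is pointing at.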

Further, before you start coding, take a screenshot of the diagram and keep it off to one side. Then, as you write code, keep referring to it so that the correct flow stays in mind. When I was solving this assignment, this last step was what stood between me and success.
Hopefully this is of help to someone.

Reference: C5_W4_A1 DON'T PANIC! Transformer help is on the way

I’d suggest going through the reference first and then my perspective. That’s just my suggestion; you could do whatever you like first :smile:


Thanks for your recommendations.


Thanks for sharing.


On my first pass through the assignment, I somehow skipped over the second paragraph you highlighted here (the one that explained the arguments that need to be passed to the self.mha() function). Your post reminded me to go back and re-read it carefully, and that got me unstuck. Thanks!

(Personally, I also found it helpful to read this documentation, which explains what the call() function does. After reading that, I was able to understand the rest of the Keras layer documentation, including MultiHeadAttention.)
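If the docs still feel abstract, here is a tiny, made-up example of the pattern they describe: a custom Keras layer defines its forward pass in call(), and invoking the layer instance like a function is what triggers call(). The layer name and numbers below are purely illustrative:

```python
import tensorflow as tf

class ScaleLayer(tf.keras.layers.Layer):
    """Toy layer (made up for illustration): multiplies its input by a constant."""

    def __init__(self, factor=2.0):
        super().__init__()
        self.factor = factor

    def call(self, inputs):
        # The forward pass lives here; you normally don't invoke call() directly.
        return inputs * self.factor

layer = ScaleLayer()
out = layer(tf.constant([1.0, 2.0, 3.0]))  # calling the instance runs call()
print(out.numpy())                          # [2. 4. 6.]
```

MultiHeadAttention follows the same pattern: its forward pass is a call(query, value, key=None, attention_mask=None, ...) method, which is why self.mha(...) in the assignment takes those arguments.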
