Hi All,
I agree too. This last assignment is so confusing and I got really lost. The “self” lines are hard to understand. There should be better explanations.
Regards,
Carlos.
Thank you FengF, I hit a brick wall on this assignment and the hint you found got me through the first line. From there I was able to finish the exercise.
“What is __init__ and what is call?”
Yes, this is exactly what I spent the last hour working out!
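In case it saves someone else an hour: in a Keras subclassed layer, __init__ runs once when the layer object is constructed (that's where state and sublayers get set up), while call defines the forward pass and runs every time the layer is applied to inputs. A toy sketch I wrote to convince myself (not assignment code):

```python
import tensorflow as tf

class Scale(tf.keras.layers.Layer):
    def __init__(self, factor):
        super().__init__()
        self.factor = factor        # runs once, when the layer object is constructed

    def call(self, x):
        return self.factor * x      # the forward pass; runs every time the layer is applied

layer = Scale(2.0)                  # __init__ executes here
y = layer(tf.constant([1.0, 2.0])) # Keras routes this through call()
print(y)                            # tf.Tensor([2. 4.], shape=(2,), dtype=float32)
```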
Totally agree. Both the lectures and homework were well below the general standard set by the rest of this specialization.
The main takeaway I got from C5, W4 is that Transformers and self-attention are essential concepts to learn and understand, but that to do so, I will need to do my own research outside of this course.
Talk about stumbling across the finish line!
Hi all,
After doing some research I found the following link helpful in my understanding of week 4 especially self-attention. Hope this helps: Intuition Behind Self-Attention Mechanism in Transformer Networks - YouTube
Thank you
The assignment presumes you’re already fluent in creating Python classes.
Felt completely lost in the Week 4 assignment. Unlike the previous assignments, where proper guidance, hints, and sometimes examples were given to help students move towards the solution by themselves, this was clearly a push into the deep end, and many of us sank without a trace!!
Speaking for myself (but I could see similar comments from other students as well), I had to rely entirely on the transformer tutorial provided on tensorflow.org to be able to do the assignment and hence complete the course.
Understood that transformers are not an easy topic and there are plenty of pieces and details to them. But truth be told, for the first time I felt that the lectures were inadequate and glossed over most of the details. And for those of us who were hoping to find solace in the guidance usually provided in the assignments, we were in for a bigger shock.
Having enrolled in and completed other specializations from the deeplearning.ai team, Course 5 Week 4 has clearly not lived up to the high standards set by the team and should definitely be set right at the earliest - a humble request from an ardent follower and long-standing student and patron of the deeplearning.ai courses.
After this course, I am still not familiar with TensorFlow, so I guess that’s why PyTorch is becoming more popular now.
I love the course so far. However, I’d suggest moving Week 4 entirely to the NLP specialization, or giving us a basic implementation of the programming assignment to work with. Having completed all the preceding courses in the specialization, I personally think this programming assignment is really hard.
Nothing else could better explain my frustration with this C5 W4 assignment than marcus-waldman’s comments. We have a perfect example of imbalance: hardest topic, shortest videos, longest exercise, most insufficient instructions.
This was excellent constructive feedback, @marcus-waldman. I am totally confused by this last piece of the entire specialization - I don’t think I managed to learn anything at all from the lectures or the exercises alone.
Very baffling end to an otherwise excellent series of courses in the specialization.
Just offering another perspective.
The transformer architecture seems to be an exciting area where lots of new ideas are coming out. As a result, I would be thrilled to be offered a course on it at the same level of comprehensiveness as the courses about more mature topics.
Imagine yourself being a machine learning engineer: one of your job assignments might be to implement and train a model as complex as a transformer, and it’s likely that all you would have is some abstract high-level ideas. In this sense, I think the challenge this week poses in understanding the content and completing the assignment is realistic.
Thank you for the additional labs. But the main assignment is still confusing (it is the 21st of April, 2022 when I’m writing this).
Here are the problems I have and some possible solutions.
The assignment description starts off very well, but right before the first exercise the notion of k comes out of nowhere. It doesn’t hurt much, because there’s a hint on how to deal with k, but it would still be good to have a brief explanation of what k is. Also, the exercise uses 10000 in the denominators of the angles, whereas the lectures use 1000. I used 1000 from the lectures and spent a lot of time figuring that out. I understand that was solely my problem, but could you still emphasize that in the exercise’s description, please?
In exercise 2, the function positional_encoding has 2 arguments, yet in its body we call the function get_angles, which has 3 arguments. It would be good to have an explanation of that before the exercise. Also, the hint with np.newaxis is insufficient; it would be good to have an explanation of how np.newaxis comes into play in exercise 2.
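While I’m at it, here is the picture that finally made np.newaxis click for me: get_angles is fed a column vector of positions and a row vector of encoding indices, and NumPy broadcasting expands them into the full (position, dimension) grid. A toy illustration (made-up formula, not the assignment code):

```python
import numpy as np

pos = np.arange(4)[:, np.newaxis]   # shape (4, 1): a column, one row per position
k = np.arange(6)[np.newaxis, :]     # shape (1, 6): a row, one column per encoding index

# any elementwise formula combining pos and k now broadcasts to a (4, 6) grid,
# one value per (position, encoding index) pair
angles = pos / (10.0 ** k)
print(angles.shape)                 # (4, 6)
```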
Thank you.
Ivan
P.S. I see that people are complaining about the lectures as well. As of April 21, 2022, I’m satisfied with the lectures and think they are pretty good.
Hi Ivan,
Were you able to move forward from ex 2 in this assignment?
I am still stuck and cannot find solutions in these forums; it’s all too confusing.
In ex 1, using 2i seems to work, but then ex 2 doesn’t. And according to the forum you should use 2(i//2), but that makes ex 1 wrong…
I don’t know what to do, really.
No, I haven’t yet, I only did ex.2 yesterday and plan to do ex.3 today or tomorrow.
Pertaining to your ex.1 question, I don’t know what k is either, but we are given a hint that we must set i = k//2. You don’t need to multiply i by 2 as you are suggesting in your post. The forum post you’re referring to (the one that tells you to multiply by 2) is an old one; the assignment has been updated since then and k was introduced. Nowadays, you don’t need to multiply by 2.
Also, pay attention that you need to use 10000 in ex.1. I used 1000, as was used in the lectures, and spent several hours figuring out that we are supposed to use 10000 in the exercise.
Apart from that, ex.1 is straightforward. You don’t even need numpy.power or other NumPy algebraic functions; regular Python arithmetic works. I saw somewhere in the forum that we are supposed to cast d to the np.float32 data type. I have no idea why we would need to cast d; I tried it both ways, with casting and without, and it works either way.
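Putting those hints together, the angle computation has roughly this shape (my own sketch; the argument names are just my guesses at what the assignment uses):

```python
import numpy as np

def get_angles(pos, k, d):
    i = k // 2                      # each sin/cos pair shares one frequency (the k//2 hint)
    # the base is 10000 in the exercise, not the 1000 used in the lectures
    return pos / (10000 ** (2 * i / np.float32(d)))

# broadcasting a column of positions against a row of indices gives the full grid
print(get_angles(np.arange(4)[:, np.newaxis], np.arange(6)[np.newaxis, :], 6).shape)  # (4, 6)
```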
I just went through the grueling programming assignment on transformers and wanted to re-emphasize the need for a teaching revamp of that topic. I could not cross that important threshold of content understanding and absorption necessary to extrapolate the transformers knowledge into the other fields/applications that I am dealing with.
Until now, every single topic in this specialization (from Course 1 to Course 5, Week 3) was awesome, and I was able to transfer-learn all that content into my day-to-day activities. Transformers was the outlier here.
Hello, Rafael.
Welcome to the community.
Thank you and we all welcome your suggestions.
Transformers are an extensive topic, and the course has tried to provide the essential pieces. We will direct your suggestions to the staff so they can work on making the material more effective and useful.
Hi, I’m stuck at the encoder. AssertionError: Wrong values case 1
I can’t seem to find any mistake in the call method. Thanks
Hello, Mark.
Welcome to the community.
Please share the error log that you are receiving from the cell.
Thanks.
Thanks for your response. I was able to fix the problem.
Just wanted to echo @marcus-waldman’s great and very constructive critique.
I am writing this on the morning of Jan 3, 2023 (so happy new year to all fellow learners and course staff). I just completed and passed this final assignment of the specialization, but unlike the rest of the course, which was outstanding, this last assignment was not a good learning experience. I got through it by gathering hints on the forum and kinda hacking my way through, but I really don’t feel I learned the implementation of the material well at all.
For what it’s worth, here are my observations and suggestions:
I thought the lectures were pretty good at explaining the very complicated subject of transformers.
The coding assignment was really unhelpful. You introduce the use of Python object-oriented programming and pretty complex TensorFlow functionality with virtually no explanations or clarifications. I would rate myself as an advanced beginner in Python and a beginning beginner in tf (I have completed a few modules of the deeplearning.ai TensorFlow specialization). I could follow the Python OOP constructions, but had a hard time with tf. You create Python methods (some class level, some instance level) and attributes using tf objects, but give no explanation of how the Python methods/attributes interact with the tf objects: what parameters need to be passed to the Python methods, and how these interact with the tf objects. I managed to grope my way through by reading the forum and looking at error messages, but I didn’t really learn very much about why these constructs exist, how they interact (Python methods with tf objects), and exactly where they fit in the model diagrams and process flow.
Suggestions: a) I think you should warn learners that before diving into this last transformer module, they need to be “fluent” (as @TMosh said in one of his comments on this thread) in Python OOP and have at least a basic familiarity with TensorFlow. For me, combining tf and Python OOP was really challenging to follow. b) You should provide a much richer and more detailed explanation (with references for further learning) of the Python methods and attributes you construct, the tf objects they create, whether they are class or instance methods/attributes (and why and how this affects implementation), what parameters are being passed by the Python methods to tf (how and why), and very detailed diagrams of how each of these fits into the overall model and process.
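To make suggestion b) concrete, here is roughly the pattern the assignment leans on: sublayers get created in __init__ and wired together in call. A simplified stand-in I wrote, not the graded code:

```python
import tensorflow as tf

class TinyEncoderLayer(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads):
        super().__init__()
        # sublayers are instance attributes, created once at construction time
        self.mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)
        self.norm = tf.keras.layers.LayerNormalization()

    def call(self, x, training=False):
        # forward pass: self-attention plus a residual connection and normalization
        attn_out = self.mha(query=x, value=x, key=x, training=training)
        return self.norm(x + attn_out)

layer = TinyEncoderLayer(d_model=8, num_heads=2)
out = layer(tf.random.uniform((1, 5, 8)))   # (batch, seq_len, d_model)
print(out.shape)                            # (1, 5, 8)
```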
My plan is to finish the DLI TensorFlow course (or perhaps another tf course) and then come back to this last module and do it again.
I want to emphasize again that I thought the rest of the course was really outstanding. Thank you.