Hello, although they would be equivalent in training, I would like to point out that Gate-U is used differently in the GRU formulas in the video and in the slides. The formula makes more sense, since the update coefficient is multiplied with the candidate h<t+1>.
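To illustrate why the two versions should be equivalent in training, here is a minimal scalar sketch. The function names and the labels "convention A" and "convention B" are my own, not taken from the course material, and the numeric values are arbitrary. It relies on the identity 1 - sigmoid(z) = sigmoid(-z), so a network trained under one convention can represent the other by learning gate weights with the opposite sign.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_update_a(u, h_prev, h_cand):
    # Convention A (hypothetical label): the gate multiplies the candidate state.
    return u * h_cand + (1.0 - u) * h_prev

def gru_update_b(u, h_prev, h_cand):
    # Convention B (hypothetical label): the gate multiplies the previous state.
    return (1.0 - u) * h_cand + u * h_prev

# Arbitrary illustrative values, not from the course.
z = 0.7                     # gate pre-activation
h_prev, h_cand = 0.2, -0.5  # previous state and candidate state

a = gru_update_a(sigmoid(z), h_prev, h_cand)
b = gru_update_b(sigmoid(-z), h_prev, h_cand)  # negated pre-activation

# Both conventions give the same new state once the gate's sign is flipped.
assert math.isclose(a, b)
```

So the difference only changes how the learned gate is interpreted (a gate near 1 means "take the candidate" in A and "keep the old state" in B), not what the model can learn.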
Hi onertan,
Thanks for catching this! In the current version of the video and slides this seems to have been resolved in line with your suggestion.
But in the PDF reading, the h^<t_0> and h’^<t_1> are still transposed. The formula and the diagram do not agree.
Hi karencfisher,
Thanks for reporting this! I had not checked the reading item. I will report this to the people working on the backend.