In the Gated Recurrent Units Reading material, the h^{<t_1>}
equation is incorrect. It should be:
h^{<t_1>} = (1 - \Gamma_u) * h^{<t_0>} + \Gamma_u * \tilde{h}^{<t_1>}
The equation in the video before this reading is correct.
I would argue that it depends… The documentation is inconsistent; for example, deep learning frameworks use the formula from the reading material:
h_t = \Gamma_u * h_{t-1} + (1 - \Gamma_u) * \tilde{h}_t
The book I would recommend reading also uses this form.
On the other hand, the Wikipedia article uses the formula from the video:
h_t = (1 - \Gamma_u) * h_{t-1} + \Gamma_u * \tilde{h}_t
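To see that the two conventions agree, here is a minimal NumPy sketch (gru_step and its gate_on_candidate flag are hypothetical names of mine, not from the course or any framework); relabeling the gate activation as 1 - \Gamma_u turns one formula into the other:

import numpy as np

def gru_step(h_prev, h_tilde, gamma_u, gate_on_candidate=True):
    # One GRU hidden-state update under either convention.
    # gamma_u is the update gate activation in [0, 1]; it is a
    # precomputed input here, whereas a real GRU derives it from a
    # sigmoid over x_t and h_prev.
    if gate_on_candidate:
        # Video / Wikipedia convention: Gamma_u weights the candidate.
        return (1 - gamma_u) * h_prev + gamma_u * h_tilde
    # Reading-material / framework convention: Gamma_u weights h_prev.
    return gamma_u * h_prev + (1 - gamma_u) * h_tilde

h_prev  = np.array([0.2, -0.5])
h_tilde = np.array([0.9,  0.1])
gamma_u = np.array([0.3,  0.7])

a = gru_step(h_prev, h_tilde, gamma_u, gate_on_candidate=True)
b = gru_step(h_prev, h_tilde, 1 - gamma_u, gate_on_candidate=False)
assert np.allclose(a, b)  # identical once the gate is relabeled

So neither source is wrong; the two formulas are the same convex combination with \Gamma_u attached to opposite terms.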
Okay. I was reading about the GRU in this document (under Variants on Long Short Term Memory), and it is consistent with the Wikipedia version. Thanks.