Can someone explain each choice in this question? Thank you!
I’m not sure how best to explain this, but let me share some hints.
- Try writing the formula down on paper, replace Γr and Γu with the constants 0 and 1 from each choice, and see what the final formula looks like.
- Is the vanishing gradient problem likely to occur with the final formula? Look up the definition of this problem again. Hint: pay particular attention to the c(t), c(t-1) part.
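For reference, the formula to write down is presumably the full GRU update in its usual form. The notation here is an assumption: c^{<t>} stands for c(t), \tilde{c}^{<t>} for c(~t), σ is the sigmoid, and * is element-wise multiplication.

```latex
\begin{aligned}
\tilde{c}^{\langle t \rangle} &= \tanh\left(W_c\,[\Gamma_r * c^{\langle t-1 \rangle},\; x^{\langle t \rangle}] + b_c\right) \\
\Gamma_u &= \sigma\left(W_u\,[c^{\langle t-1 \rangle},\; x^{\langle t \rangle}] + b_u\right) \\
\Gamma_r &= \sigma\left(W_r\,[c^{\langle t-1 \rangle},\; x^{\langle t \rangle}] + b_r\right) \\
c^{\langle t \rangle} &= \Gamma_u * \tilde{c}^{\langle t \rangle} + (1 - \Gamma_u) * c^{\langle t-1 \rangle}
\end{aligned}
```

Substituting a fixed 0 or 1 for Γu or Γr in the first and last lines gives the simplified formula for each choice.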
For gate_u, I think that if gate_u is 1, we do not remember the value from the previous time step, and if gate_u is 0, we do remember the value from the previous time step.
I don’t quite understand the purpose of gate_r. Could you explain it a little bit further? Thank you.
- gate_u/update gate/Γu is the gate that tells how much of the information in c(~t) we want to use.
- gate_r/reset gate/Γr is the gate that controls how much of the information in c(t-1) we ignore when computing c(~t): the closer Γr is to 0, the more of c(t-1) is ignored.
- Γ ≈ 0 and Γ ≈ 1 in the choices mean that the value of Γ is approximately 0 or approximately 1. Removing a gate is equivalent to always setting it to 1.
- If gate_u is 1, it means that we want to use all of the information in the candidate memory cell c(~t); if gate_u is 0, it means that there is no information in c(~t) that we want to use. See Γu * c(~t) in the c(t) equation: if Γu ≈ 1, c(t) will use almost all of the information from c(~t).
- If gate_r is 0, it means that we want to ignore all information from the previous memory cell c(t-1); if it is 1, it means that there is no information that we want to ignore. See Γr * c(t-1) in the c(~t) equation: if Γr ≈ 0, only a very tiny part of the value of c(t-1) is used.
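The points above can be checked numerically with a small sketch. It assumes the usual GRU form, c(~t) = tanh(W_c [Γr * c(t-1), x(t)] + b_c) and c(t) = Γu * c(~t) + (1 - Γu) * c(t-1). The function name `gru_step`, the sizes, and the random weights are made up for illustration, and the gate values are fixed by hand instead of being computed by a sigmoid.

```python
import numpy as np

def gru_step(c_prev, x, gamma_u, gamma_r, W_c, b_c):
    """One GRU memory update with hand-set gate values (illustrative only)."""
    # Candidate c(~t): the reset gate scales how much of c(t-1) goes in.
    concat = np.concatenate([gamma_r * c_prev, x])
    c_tilde = np.tanh(W_c @ concat + b_c)
    # New memory c(t): the update gate mixes the candidate with c(t-1).
    c_new = gamma_u * c_tilde + (1.0 - gamma_u) * c_prev
    return c_new, c_tilde

# Hypothetical sizes and random values, chosen only for the demonstration.
rng = np.random.default_rng(0)
n_c, n_x = 3, 2
W_c = rng.normal(size=(n_c, n_c + n_x))
b_c = np.zeros(n_c)
c_prev = rng.normal(size=n_c)
x = rng.normal(size=n_x)

# gate_u = 0: nothing from c(~t) is used, so c(t) is a copy of c(t-1).
c_keep, _ = gru_step(c_prev, x, 0.0, 1.0, W_c, b_c)

# gate_u = 1: c(t) is exactly the candidate c(~t).
c_over, c_tilde = gru_step(c_prev, x, 1.0, 1.0, W_c, b_c)

# gate_r = 0: c(~t) no longer depends on c(t-1) at all, so changing
# the previous memory leaves the candidate unchanged.
_, c_tilde_reset = gru_step(c_prev, x, 1.0, 0.0, W_c, b_c)
_, c_tilde_other = gru_step(c_prev + 5.0, x, 1.0, 0.0, W_c, b_c)

print(np.allclose(c_keep, c_prev))
print(np.allclose(c_over, c_tilde))
print(np.allclose(c_tilde_reset, c_tilde_other))
```

With Γu ≈ 0 the memory is passed along nearly unchanged from one step to the next, which is the c(t), c(t-1) relationship worth keeping in mind when thinking about vanishing gradients.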