Backpropagation in RNN weight sharing

I have a few doubts which I will put forth below.

1.) What do we mean when we say an RNN shares weights? Is it that all three weight matrices (input, hidden, and output) are the same across all time steps? But when we backpropagate, these get changed, right? Won't the weight matrix at each time step get updated based on its own gradient? If so, how is it weight sharing?

Well, each layer contains a number of RNN or LSTM units, like 16, 32, or more… so across a network we do have plenty of different weights.
And yes, each RNN or LSTM cell shares its weights across the time steps. The answer to this is detailed in the most upvoted answer here…
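As a quick concrete check (in PyTorch, assuming that is the framework in use; the sizes are arbitrary), an LSTM layer with 32 hidden units has one fixed set of weight matrices whose shapes depend only on the input and hidden sizes, never on how many time steps you feed it:

```python
import torch.nn as nn

# Hypothetical sizes for illustration: 10 input features, 32 hidden units
lstm = nn.LSTM(input_size=10, hidden_size=32)

# Parameter shapes depend only on input_size and hidden_size, not on the
# sequence length: the same matrices are reused at every time step.
for name, p in lstm.named_parameters():
    print(name, tuple(p.shape))
# weight_ih_l0 (128, 10)   -> 4 gates x 32 hidden units, input weights
# weight_hh_l0 (128, 32)   -> 4 gates x 32 hidden units, recurrent weights
# bias_ih_l0   (128,)
# bias_hh_l0   (128,)
```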

I have not read the StackExchange article yet, but I think there’s a fairly clear way to state the answer:

Yes, there is one set of weights, and it is used at every time step. During backpropagation there is a different gradient contribution from each time step, but those contributions are summed and the total is applied to the same shared set of weights. So you still end up with a single shared set of weights after applying backpropagation.

Of course, the exact number and structure of the weights depend on what features you implement in your particular RNN setup (LSTM or not, and the various other choices). But for a given architecture, the description above applies.
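To make the "summed contributions" point concrete, here is a minimal NumPy sketch of backpropagation through time for a vanilla RNN (the sizes, the toy loss, and the variable names are all just illustrative assumptions). The backward loop adds each time step's gradient into the same `dWxh` and `dWhh`, and only then is a single update applied to the one shared `Wxh` and `Whh`:

```python
import numpy as np

rng = np.random.default_rng(0)

T, input_size, hidden_size = 5, 3, 4                  # sequence length and layer sizes
Wxh = rng.normal(0, 0.1, (hidden_size, input_size))   # ONE input-to-hidden matrix
Whh = rng.normal(0, 0.1, (hidden_size, hidden_size))  # ONE hidden-to-hidden matrix
bh  = np.zeros(hidden_size)

xs = rng.normal(size=(T, input_size))
hs = [np.zeros(hidden_size)]                          # h_0

# Forward pass: the SAME Wxh/Whh are reused at every time step
for t in range(T):
    hs.append(np.tanh(Wxh @ xs[t] + Whh @ hs[-1] + bh))

loss = 0.5 * np.sum(hs[-1] ** 2)                      # toy loss on the final hidden state

# Backward pass (BPTT): each step produces its own gradient contribution,
# but all of them are accumulated into the single shared dWxh/dWhh
dWxh, dWhh, dbh = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(bh)
dh = hs[-1]                                           # dL/dh_T for the toy loss
for t in reversed(range(T)):
    dz = dh * (1.0 - hs[t + 1] ** 2)                  # backprop through tanh
    dWxh += np.outer(dz, xs[t])                       # contribution from time step t
    dWhh += np.outer(dz, hs[t])
    dbh  += dz
    dh = Whh.T @ dz                                   # pass gradient to the previous step

# One update applied to the one shared set of weights
lr = 0.1
Wxh -= lr * dWxh
Whh -= lr * dWhh
bh  -= lr * dbh
```

At no point does a second copy of `Wxh` or `Whh` appear; the per-time-step information lives only in the accumulated gradients.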

Hello sir,

I lost you at the point where you said "applied to the same shared set of weights". My doubt is: once we update the weights based on their gradients, don't we end up with a different weight matrix for each time step? So how are the weights shared when they are different after one step of backpropagation?

The point is that the gradients are applied to the same set of weights. Each time step contributes a different gradient, but those contributions are summed into a single gradient, and that one gradient updates the one shared weight matrix. There is never a separate copy of the weights per time step, so there is nothing that could end up different for each time step.
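Here is a small sketch of that in PyTorch (the sizes, seed, and toy loss are arbitrary assumptions for illustration): one backward pass over a ten-step sequence produces exactly one gradient tensor per weight matrix, already summed over the time steps, and one optimizer step leaves you with the same single set of weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=3, hidden_size=4)      # one weight_ih, one weight_hh
opt = torch.optim.SGD(rnn.parameters(), lr=0.1)

x = torch.randn(10, 1, 3)                      # 10 time steps, batch of 1
out, h = rnn(x)
loss = out.pow(2).mean()                       # toy loss over all time steps
loss.backward()

# Exactly ONE gradient tensor per weight matrix; the contributions from
# all 10 time steps have already been summed into it.
print(rnn.weight_hh_l0.shape, rnn.weight_hh_l0.grad.shape)  # both (4, 4)

opt.step()                                     # one update to the one shared matrix
print([n for n, _ in rnn.named_parameters()])
# ['weight_ih_l0', 'weight_hh_l0', 'bias_ih_l0', 'bias_hh_l0'] -- still a single set
```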