[C5W1A1] wrong results of lstm_backward

wziz · May 23, 2021, 7:46am

I'm getting wrong values from the function lstm_backward, although the shapes of the results are correct and all the preceding functions pass their tests.

Below are my code and results. I’ve been stuck here for 2 days, please help me.

def lstm_backward(da, caches):
    (caches, x) = caches
    (a1, c1, a0, c0, f1, i1, cc1, o1, x1, parameters) = caches[0]

    n_a, m, T_x = da.shape
    n_x, m = x1.shape

    dx = np.zeros((n_x, m, T_x))
    da0 = np.zeros((n_a, m))
    da_prevt = np.zeros((n_a, m))
    dc_prevt = np.zeros((n_a, m))
    dWf = np.zeros((n_a, n_a + n_x))
    dWi = np.zeros((n_a, n_a + n_x))
    dWc = np.zeros((n_a, n_a + n_x))
    dWo = np.zeros((n_a, n_a + n_x))
    dbf = np.zeros((n_a, 1))
    dbi = np.zeros((n_a, 1))
    dbc = np.zeros((n_a, 1))
    dbo = np.zeros((n_a, 1))

    for t in reversed(range(T_x)):
        gradients = lstm_cell_backward(da[:,:,t] + da_prevt, dc_prevt, caches[t])
        da_prevt = gradients['da_prev']
        dc_prevt = gradients['dc_prev']
        dx[:,:,t] = gradients['dxt']
        dWf += gradients['dWf']
        dWi += gradients['dWi']
        dWc += gradients['dWc']
        dWo += gradients['dWo']
        dbf += gradients['dbf']
        dbi += gradients['dbi']
        dbc += gradients['dbc']
        dbo += gradients['dbo']

    da0 = da_prevt

    gradients = {"dx": dx, "da0": da0, "dWf": dWf,"dbf": dbf, "dWi": dWi,"dbi": dbi,
                "dWc": dWc,"dbc": dbc, "dWo": dWo,"dbo": dbo}

    return gradients

Results:

gradients["dx"][1][2] = [ 0.01034214  1.03473735 -0.2398793  -0.43281115]
gradients["dx"].shape = (3, 10, 4)
gradients["da0"][2][3] = 0.5883931290038376
gradients["da0"].shape = (5, 10)
gradients["dWf"][3][1] = -0.02269017674887574
gradients["dWf"].shape = (5, 8)
gradients["dWi"][1][2] = 0.6099853844261891
gradients["dWi"].shape = (5, 8)
gradients["dWc"][3][1] = -0.013857139274558946
gradients["dWc"].shape = (5, 8)
gradients["dWo"][1][2] = 0.04772920545685257
gradients["dWo"].shape = (5, 8)
gradients["dbf"][4] = [-0.199665]
gradients["dbf"].shape = (5, 1)
gradients["dbi"][4] = [-0.7340795]
gradients["dbi"].shape = (5, 1)
gradients["dbc"][4] = [-0.56981661]
gradients["dbc"].shape = (5, 1)
gradients["dbo"][4] = [-0.24499124]
gradients["dbo"].shape = (5, 1)

TMosh · June 27, 2021, 6:19am

Do you still need help with this issue?

arosacastillo · July 25, 2021, 2:22pm

Hi wziz,

What I see is that you have dx with 3 dimensions instead of dxt with two dimensions.

Best,

Rosa

muly_yahav · December 26, 2021, 11:54am

Hey,
I have the exact same result.
Can you help pinpoint the issue?

Also, it seems to be assumed that the derivative dc_next at the last time step is zero (it is the first value passed into lstm_cell_backward as the dc_next parameter). Why is that?

Santiago_Duran · January 5, 2022, 4:29pm

I have the same code but I get the correct results… I would say that the problem begins in the previous function lstm_cell_backward (which was very painful to code, actually)

Rashmi · June 23, 2022, 7:57am

Hi Santiago,

Welcome to the community.

Yes, you also need to keep in mind the framework that you use for back propagation. You need to start with sigmoid followed by tanh later.

Witenberg · July 26, 2022, 7:32pm

My problem was in the initialization of a_next; it should be a_next = a0:

def lstm_forward(x, a0, parameters):
    …

    # Initialize a_next and c_next (≈2 lines)
    a_next = a0  # <------ CHANGE HERE
    c_next = np.zeros((n_a, m))

    # loop over all time-steps
    for t in range(T_x):
        # Get the 2D slice 'xt' from the 3D input 'x' at time step 't'
        xt = x[:,:,t]
        # Update next hidden state, next memory state, compute the prediction, get the cache (≈1 line)
        a_next, c_next, yt, cache = lstm_cell_forward(xt, a_next, c_next, parameters)

    …

sonnh1902 · December 1, 2022, 5:56pm

My code is the same as yours and I got the expected output. Are you sure you are running the cells in order, which is from the top to the bottom of the notebook?

Thomas_A_W · May 14, 2023, 8:55am

I was running into a similar problem, and what I found was that one of the calculations in my lstm_cell_backward() function was incorrect. I carefully went back, compared the lstm_cell_backward() output with the expected output, and found that one of the values was wrong. After I fixed that value, lstm_backward() started working correctly.
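A minimal sketch of the kind of comparison described above, assuming the gradients come back as a dictionary of NumPy arrays. The helper name report_mismatches and the numbers are hypothetical and made up for illustration; they are not the notebook's expected values.

```python
import numpy as np

def report_mismatches(actual, expected, rtol=1e-5, atol=1e-8):
    """Return the keys whose arrays differ in shape or in value."""
    bad = []
    for key, exp in expected.items():
        got = np.asarray(actual[key])
        exp = np.asarray(exp)
        # Check the shape first so np.allclose never broadcasts silently.
        if got.shape != exp.shape or not np.allclose(got, exp, rtol=rtol, atol=atol):
            bad.append(key)
    return bad

# Made-up numbers for illustration only (NOT the notebook's expected output).
expected = {"dxt": np.array([[1.0, 2.0]]), "dc_prev": np.array([[0.5, 0.5]])}
actual = {"dxt": np.array([[1.0, 2.0]]), "dc_prev": np.array([[0.5, 0.7]])}

print(report_mismatches(actual, expected))  # ['dc_prev']
```

Checking every key of lstm_cell_backward this way, rather than eyeballing the printout, makes it easier to spot a single wrong value before moving on to lstm_backward.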

Kevin_Salvador_Aguil · July 19, 2023, 3:32pm

I had the same problem, but it was in the previous code. You should review lstm_cell_backward again; I had made a mistake in just one value, and that was the problem.
I had dc_prev = {moderator edit}
instead of: dc_prev = {moderator edit}

I'm sharing all my code with you.

{Moderator Edit: Solution Code Removed}

saifkhanengr · July 19, 2023, 3:36pm

I have deleted all your code. Sharing your code, even to help other learners, is not acceptable and can lead to your account being suspended. Please do not share your code, as this is against the community Honor Code.

Kevin_Salvador_Aguil · July 19, 2023, 4:19pm

I apologize, I won’t do that again.

Jiaquan_He · July 29, 2023, 1:11pm

My case is similar to @Witenberg's. My lstm_forward was wrong even though it passed the tests.

I accidentally initialized c_next from a slice of c instead of a fresh array of zeros, like this:

a_next = a0
c_next = c[:, :, T_x] # <== Wrong

But as mentioned in the notebook, setting one variable equal to another is a “copy by reference”. So as the loop iterates, c_next and c end up sharing data and the values get corrupted.
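The aliasing can be seen in isolation with a small sketch. It assumes only that c is a NumPy array of shape (n_a, m, T_x); the sizes and the names c_alias and c_fresh are hypothetical and are not part of the assignment.

```python
import numpy as np

# Toy sizes chosen for illustration; not the notebook's values.
n_a, m, T_x = 2, 3, 4
c = np.zeros((n_a, m, T_x))

# Basic slicing returns a view: c_alias shares memory with c.
c_alias = c[:, :, 0]
# An independent array of the same shape.
c_fresh = np.zeros((n_a, m))

# Simulate the forward loop storing the cell state for step 0 into c.
c[:, :, 0] = np.ones((n_a, m))

print(np.shares_memory(c_alias, c))  # True
print(c_alias.sum())                 # 6.0: the "initial" state was overwritten
print(c_fresh.sum())                 # 0.0: unaffected
```

Any cache that still holds the view will therefore see the overwritten values later, which is one plausible way a forward pass can pass its own tests while the backward pass produces wrong numbers.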

Ciel_Sun · June 22, 2024, 3:50am

This is it. I also set c_next as a reference to the c array, i.e. c_next = c[:,:,0], in lstm_forward. After creating a newly initialized variable instead, the answer in the last section, lstm_backward, is correct.

2017mooc · August 2, 2024, 5:39pm

Good tip! The problem was that in my lstm_cell_backward() the result was equal to the given solution; but later, in the lstm_backward() I had a mismatch with dx.shape… Thanks!
