Help Understanding Rnn Backprop Exersize Week 1 Assignment 1

sjc · August 11, 2023, 5:09pm

In the coding exercise attatched below, I dont understand why we load the parameters Wya and by but we never calculate their gradients.

def rnn_cell_backward(da_next, cache):
“”"
Implements the backward pass for the RNN-cell (single time-step).

Arguments:
da_next -- Gradient of loss with respect to next hidden state
cache -- python dictionary containing useful values (output of rnn_cell_forward())

Returns:
gradients -- python dictionary containing:
                    dx -- Gradients of input data, of shape (n_x, m)
                    da_prev -- Gradients of previous hidden state, of shape (n_a, m)
                    dWax -- Gradients of input-to-hidden weights, of shape (n_a, n_x)
                    dWaa -- Gradients of hidden-to-hidden weights, of shape (n_a, n_a)
                    dba -- Gradients of bias vector, of shape (n_a, 1)
"""

# Retrieve values from cache
(a_next, a_prev, xt, parameters) = cache

# Retrieve values from parameters
Wax = parameters["Wax"]
Waa = parameters["Waa"]
Wya = parameters["Wya"]
ba = parameters["ba"]
by = parameters["by"]

### START CODE HERE ###
# compute the gradient of dtanh term using a_next and da_next (≈1 line)
dtanh = None

# compute the gradient of the loss with respect to Wax (≈2 lines)
dxt = None
dWax = None

# compute the gradient with respect to Waa (≈2 lines)
da_prev = None
dWaa = None

# compute the gradient with respect to b (≈1 line)
dba = None

### END CODE HERE ###

# Store the gradients in a python dictionary
gradients = {"dxt": dxt, "da_prev": da_prev, "dWax": dWax, "dWaa": dWaa, "dba": dba}

return gradients

TMosh · August 11, 2023, 5:23pm

In the thread title, please identify the week number and assignment number.
For example “C? W? A?”.

You can add this to the thread title using the “pencil” icon.

sjc · August 11, 2023, 5:34pm

sorry about that, I added it

TMosh · August 12, 2023, 2:13am

I think that’s just a copy-and-paste error, when they copied too much code from rnn_cell_forward() and never noticed.

paulinpaloalto · August 12, 2023, 3:12pm

Right, it looks like a mistake. You don’t need those values to compute anything. At least I didn’t when I followed the instructions. Note that they make a couple of comments in the instructions about how they are basically leaving out the y path in the computations as well:

Note: rnn_cell_backward does not include the calculation of loss from 𝑦⟨𝑡⟩�⟨�⟩. This is incorporated into the incoming da_next. This is a slight mismatch with rnn_cell_forward, which includes a dense layer and softmax.

And in the next section they say:

Note that this notebook does not implement the backward path from the Loss ‘J’ backwards to ‘a’.

** This would have included the dense layer and softmax which are a part of the forward path.*
** This is assumed to be calculated elsewhere and the result passed to rnn_backward in ‘da’.*

Sorry for the crummy formatting there.

Topic		Replies	Views
Building_a_Recurrent_Neural_Network_Step_by_Step Sequence Models week-1	4	337	January 4, 2024
Question on backpropagation; week 1, prog. assignment 1 Sequence Models week-1	1	297	January 26, 2024
C5, W1A1 optional RNN back propagation Sequence Models	10	947	January 2, 2024
Need help with DL specialization Course 5 Lab1 rnn_backward function Sequence Models	4	419	August 6, 2023
C5W1A1Exercise3.1 Sequence Models	5	751	March 5, 2024

Help Understanding Rnn Backprop Exersize Week 1 Assignment 1

Related topics