Week 2 assignment, 6.1 - Mini-Batch Gradient Descent

Hi, in section 6.1 - Mini-Batch Gradient Descent, when I ran the code block below, I got this error:

IndexError                                Traceback (most recent call last)
<ipython-input-85-6c6daf13604b> in <module>
      1 # train 3-layer model
      2 layers_dims = [train_X.shape[0], 5, 2, 1]
----> 3 parameters = model(train_X, train_Y, layers_dims, optimizer = "gd")
      5 # Predict

<ipython-input-84-620ef2b96663> in model(X, Y, layers_dims, optimizer, learning_rate, mini_batch_size, beta, beta1, beta2, epsilon, num_epochs, print_cost)
     59             # Backward propagation
---> 60             grads = backward_propagation(minibatch_X, minibatch_Y, caches)
     62             # Update parameters

~/work/release/W2A1/opt_utils_v1a.py in backward_propagation(X, Y, cache)
    153     gradients -- A dictionary with the gradients with respect to each parameter, activation and pre-activation variables
    154     """
--> 155     m = X.shape[1]
    156     (z1, a1, W1, b1, z2, a2, W2, b2, z3, a3, W3, b3) = cache

IndexError: tuple index out of range

All the previous code blocks run fine, except for the block
Exercise 2 - random_mini_batches, where I got this error:

AssertionError                            Traceback (most recent call last)
<ipython-input-35-2caeb7a27c84> in <module>
     11 assert n_batches == math.ceil(m / mini_batch_size), f"Wrong number of mini batches. {n_batches} != {math.ceil(m / mini_batch_size)}"
     12 for k in range(n_batches - 1):
---> 13     assert mini_batches[k][0].shape == (nx, mini_batch_size), f"Wrong shape in {k} mini batch for X"
     14     assert mini_batches[k][1].shape == (1, mini_batch_size), f"Wrong shape in {k} mini batch for Y"
     15     assert np.sum(np.sum(mini_batches[k][0] - mini_batches[k][0][0], axis=0)) == ((nx * (nx - 1) / 2 ) * mini_batch_size), "Wrong values. It happens if the order of X rows(features) changes"

AssertionError: Wrong shape in 0 mini batch for X

Here’s my implementation of that part:

        s = m - mini_batch_size*num_complete_minibatches
        mini_batch_X = shuffled_X[:, s]
        mini_batch_Y = shuffled_Y[:, s]        

What could be the issue?

That indexing operation will give you a single sample for X and Y. You need a range for the second index, right?

The first error (IndexError: tuple index out of range) says that the X you passed is not a 2D array, which is explained by the indexing mistake.

When in doubt, print the shape of your mini_batch_X after the above logic. What do you expect it to be? What is it actually?
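As a rough illustration of that check, here is a small sketch on a toy array (X below is a made-up stand-in for shuffled_X, not the assignment data). It shows how the shape differs between a single integer index and a range, and why asking a 1D result for shape[1] fails:

```python
import numpy as np

# Hypothetical stand-in for shuffled_X: 3 features (rows) by 10 samples (columns).
X = np.arange(30).reshape(3, 10)

single = X[:, 4]      # integer index: the column axis is dropped
window = X[:, 4:7]    # range (slice): the column axis is kept

print(single.shape)   # (3,)
print(window.shape)   # (3, 3)

# A 1D array has no second axis, so shape[1] does not exist.
try:
    single.shape[1]
except IndexError as err:
    message = str(err)
    print(message)    # tuple index out of range
```

The last message is the same one that appears at m = X.shape[1] in the first traceback.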

With mini_batch_X = shuffled_X[:, :mini_batch_size]
I got mini_batch_X.shape --> (12288, 64)

With mini_batch_X = shuffled_X[:, s] as mentioned above,
I got mini_batch_X.shape --> (12288,)

With mini_batch_X = shuffled_X[:, m]
I couldn’t get mini_batch_X.shape
and I got this error:

IndexError                                Traceback (most recent call last)
<ipython-input-22-2caeb7a27c84> in <module>
      6 Y = np.random.randn(1, m) < 0.5
----> 8 mini_batches = random_mini_batches(X, Y, mini_batch_size)
      9 n_batches = len(mini_batches)

<ipython-input-21-17408424963e> in random_mini_batches(X, Y, mini_batch_size, seed)
     46         # mini_batch_Y =
     47         # YOUR CODE STARTS HERE
---> 48         mini_batch_X = shuffled_X[:, m]
     49         mini_batch_Y = shuffled_Y[:, m]
     50         print(mini_batch_X.shape)

IndexError: index 148 is out of bounds for axis 1 with size 148

I still don’t understand what the shape of mini_batch_X should be …

But the point is that you computed s, which is the first index in the last (partial) minibatch. So what does the range need to be for that last minibatch? It’s s to the end, right? So how do you express that? It’s a range starting with s and ending with …?

It looks like maybe you’re pretty new to indexing in Python. Let’s analyze what is wrong with each of those three choices:

  1. The range there is :mini_batch_size. Ranges in Python are written start:end:step. You left out the start and gave mini_batch_size as the end, so the slice starts at the beginning and gives you the first minibatch. That is clearly not what you want, right?
  2. There is no range there, just the single index s, so you get one sample at index s. As I predicted in my earlier reply, the result is 1D (it has only one dimension), which is why shape[1] is out of range.
  3. Again there is just the single index m, so it would be one sample, as with s. But indexing in Python is 0-based, right? If an array has m elements, the last one is at index m - 1. So m is off the end, which is why you get an error instead of a shape.
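To make the start:end:step notation and the 0-based point concrete, here is a minimal sketch on a toy array (the names X, first, strided and last are invented for illustration and are not from the notebook):

```python
import numpy as np

m = 10                               # number of samples (columns) in this toy example
X = np.arange(3 * m).reshape(3, m)   # hypothetical stand-in for shuffled_X

first = X[:, :4]        # start omitted: columns 0, 1, 2, 3
strided = X[:, 0:m:2]   # start:end:step: every second column
last = X[:, m - 1]      # the last valid column index is m - 1

print(first.shape)      # (3, 4)
print(strided.shape)    # (3, 5)
print(last.shape)       # (3,)

# Index m itself is one past the end, so NumPy raises instead of returning data.
try:
    X[:, m]
except IndexError as err:
    oob_message = str(err)
    print(oob_message)
```

Note that a single integer index that is out of bounds raises an error, which matches the third traceback above.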


        s = m - mini_batch_size*num_complete_minibatches
        mini_batch_X = shuffled_X[:, s:inc]
        mini_batch_Y = shuffled_Y[:, s:inc]

where inc = mini_batch_size
I got mini_batch_X.shape --> (12288, 44), which is smaller than mini_batch_size=64, but why didn’t it work?

Why does inc make sense as the end of the range? Please read my most recent response carefully. The second element of a range is not the length: it’s the index at which the range stops. That end index is exclusive, so the last element included is the one at end - 1.
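A small sketch may help show where the 44 comes from. It assumes m = 148 (the size reported in the earlier IndexError) and a batch size of 64, so the formula for s evaluates to 20. The array is a toy with 2 rows instead of 12288, and the variable names are made up for illustration:

```python
import numpy as np

m, mini_batch_size = 148, 64
X = np.zeros((2, m))            # hypothetical stand-in for shuffled_X

# A slice start:end holds end - start columns (the end index is excluded).
a = X[:, 20:64]
print(a.shape)                  # (2, 44), because 64 - 20 = 44

# Selecting "from some start index to the end" can be written three ways.
start = 128                     # 2 complete batches of 64 come before this column
b = X[:, start:m]               # explicit end
c = X[:, start:]                # end omitted: runs to the last column
d = X[:, start:500]             # an end past the array is clipped, not an error
print(b.shape, c.shape, d.shape)
```

So the second number in a range only tells Python where to stop. The number of columns you get back is the difference between the two.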

Finally, got it!
Thank you very much!