Dropout regularization exercise

Hi. I have a problem with W1 homework 3. As mentioned in the homework, the dimensions of D1 and D2 should be the same as those of A1 and A2, respectively. I don’t know what their sizes are, so I wrote D1 = np.random.rand(A1.shape), but I get an error. @paulinpaloalto, can you please help me with this part?
Also, how can we see the correct answers for the homework?

Thank you so much.

For this portion,

D1 = np.random.rand(A1.shape) but I get an error.

Unlike other random number generators, np.random.rand and np.random.randn are convenience functions for users porting code from Matlab (another well-known programming environment), and they do not accept a tuple like (a,b). Each dimension has to be passed as a separate argument.

A1.shape returns a tuple like (2,5). To pass this shape to np.random.rand, there are two ways.

  1. break down the tuple into the first and second parameter like A1.shape[0] and A1.shape[1].
  2. use * in front of a tuple like *A1.shape

Other random number generators, like np.random.random_sample, accept a tuple.
So np.random.random_sample(A1.shape) should work as well.
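
To make the options above concrete, here is a minimal sketch. The array A1 and its (2, 5) shape are only stand-ins for illustration, not the values from the assignment.

```python
import numpy as np

# A1 is a stand-in activation matrix; the (2, 5) shape is only illustrative.
A1 = np.zeros((2, 5))

# Way 1: pass each dimension as its own positional argument.
D1_a = np.random.rand(A1.shape[0], A1.shape[1])

# Way 2: unpack the shape tuple with * so the call becomes rand(2, 5).
D1_b = np.random.rand(*A1.shape)

# Alternative: random_sample takes the whole shape as a single tuple.
D1_c = np.random.random_sample(A1.shape)

# Passing the tuple directly to rand is what triggers the error.
try:
    np.random.rand(A1.shape)
    tuple_accepted = True
except TypeError:
    tuple_accepted = False

print(D1_a.shape, D1_b.shape, D1_c.shape, tuple_accepted)
```

All three arrays end up with the same shape as A1, which is the only requirement stated for D1.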

That was a great explanation, and it solved the issue. I used *A1.shape. What does the * do here? I know that np.random.randn(()) gets a tuple as the input, but np.random.rand() gets its inputs as individual values. Does using * get rid of the tuple?

An asterisk “unpacks” a list, tuple, or other iterable into separate arguments. But we need to be careful when using it, since there are some restrictions.
An example is:
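
A minimal sketch of the kind of behavior meant here. The tuple values and the helper area are hypothetical and chosen only for illustration.

```python
shape = (2, 5)

def area(rows, cols):
    # Hypothetical helper that expects two separate arguments.
    return rows * cols

# Inside a call, * unpacks the tuple: area(*shape) is the same as area(2, 5).
unpacked = area(*shape)

# Between two operands, the same character means multiplication
# (for a tuple, repetition), not unpacking.
repeated = shape * 2

# A bare starred expression on the right-hand side is a syntax error.
try:
    compile("x = *shape", "<example>", "exec")
    bare_star_ok = True
except SyntaxError:
    bare_star_ok = False

# Inside a tuple or list display, unpacking is allowed again.
rebuilt = (*shape,)

print(unpacked, repeated, bare_star_ok, rebuilt)
```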

Because multiplication uses the same character, the asterisk is not very flexible for unpacking.
In this sense, breaking a tuple down into separate parameters is more straightforward.

Very nice. I got it.

Thank you so much for the quick and helpful response.

Regarding regularization in deep learning, I have seen that TensorFlow layers have a kernel_regularizer parameter. I was wondering how setting this parameter on a layer affects the loss function, compared with the regularized cost function that Andrew Ng described.

If any “regularizer” is specified, a penalty is calculated for that layer instance, like Dense(), based on the specified regularization type. These penalties are added to the loss function that you specify separately. This is pretty much aligned with what Andrew talked about.
By the way, there are three types of regularization for a layer:

  • kernel_regularizer
  • bias_regularizer
  • activity_regularizer

“kernel_regularizer” regularizes the weights (the layer's “kernel”), and “bias_regularizer” regularizes the bias. Andrew also introduced the bias term, but said that he usually omits it. “activity_regularizer” applies to the layer output.
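
To show how the three penalties relate to the loss, here is a rough NumPy sketch, not Keras code. The layer sizes, the factor l2, and the stand-in data loss are all assumptions for illustration, and the sketch ignores any extra scaling a library may apply (for example by batch size). In Andrew's notation the L2 term is (lambda / 2m) times the sum of squared weights, so l2 here plays the role of lambda / 2m.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dense layer: batch of 5 examples, 3 inputs, 4 units.
X = rng.normal(size=(5, 3))
W = rng.normal(size=(3, 4))        # the layer's "kernel" (weights)
b = np.zeros(4)                    # the layer's bias
A = np.maximum(0.0, X @ W + b)     # layer output after ReLU

l2 = 0.01  # hypothetical regularization factor

# One L2-style penalty per regularizer type.
kernel_penalty = l2 * np.sum(W ** 2)      # what kernel_regularizer targets
bias_penalty = l2 * np.sum(b ** 2)        # what bias_regularizer targets
activity_penalty = l2 * np.sum(A ** 2)    # what activity_regularizer targets

# Stand-in for the loss you specify separately (not a real task loss).
data_loss = np.mean((A.sum(axis=1) - 1.0) ** 2)

# The penalties are simply added to the separately specified loss.
total_loss = data_loss + kernel_penalty + bias_penalty + activity_penalty

print(data_loss, kernel_penalty, bias_penalty, activity_penalty, total_loss)
```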

I think this is well-designed. In some cases, we may want to write our own loss function. To add the regularization term, we just read “layer.losses” to get those penalties. We do not need to recalculate them in our custom loss function.