Implementing L2 regularization in an L-layer NN

Hello Community,
I am trying to build an NN from scratch to classify mushrooms as edible or poisonous, using a mushroom dataset from Kaggle (just for practice :sweat_smile:).

  • The dataset: 8124 rows × 117 columns
  • The neural network: 4 layers, with sizes [8, 4, 3, 1]
  • All layers use ReLU as the activation function, except the last node, which uses sigmoid
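
To be concrete, the activations look like this (rough sketch in course-style NumPy; `n_x` just stands for the number of input features after encoding, and `forward_one_layer` is my own naming, not the assignment's):

```python
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def forward_one_layer(A_prev, W, b, activation):
    """One linear -> activation step (course convention: columns are examples)."""
    Z = np.dot(W, A_prev) + b
    return relu(Z) if activation == "relu" else sigmoid(Z)

# layer_dims = [n_x, 8, 4, 3, 1]: ReLU for the three hidden layers,
# sigmoid for the single output unit (n_x = feature count after encoding).
```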

After running the NN (without regularization), it gave:

  • train set error: 99.8%
  • dev set error: 99.6%

I suspected overfitting, so I decided to implement L2 regularization.

  • I edited the cost function the same way as in the Regularization Assignment (rough sketch after the code below).
  • I edited the update_parameters function as follows:

```python
weight_decay = 1 - ((alpha * learning_rate) / m)
for l in range(L):
    # shrink the weights (weight-decay form), then take the usual gradient step
    parameters["W" + str(l + 1)] = weight_decay * parameters["W" + str(l + 1)] - learning_rate * grads["dW" + str(l + 1)]
```

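For the cost part, my edit follows the assignment pattern; roughly this (just a sketch, with `alpha` as the L2 strength and `AL` as the output of the final sigmoid layer):

```python
import numpy as np

def compute_cost_with_L2(AL, Y, parameters, alpha, L):
    """Cross-entropy cost plus an L2 penalty over all weight matrices."""
    m = Y.shape[1]
    cross_entropy = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    # L2 penalty: (alpha / (2*m)) * sum of squared entries of every W
    l2_penalty = (alpha / (2 * m)) * sum(
        np.sum(np.square(parameters["W" + str(l + 1)])) for l in range(L)
    )
    return cross_entropy + l2_penalty
```
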
My question is: Is my implementation correct?? Should I edit dWs??

I assume you mean those are your “accuracy” numbers, as opposed to “error” numbers. If I’ve got that right, then I think you’re in good shape and don’t have to worry about overfitting.

If you do decide you need L2 regularization, that should not affect the “update parameters” logic. Including L2 regularization will add another term to the gradients, but the way you apply the gradients stays the same.
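
To make that concrete, here is a minimal sketch (names like `linear_backward_with_L2`, and `alpha` for the regularization strength, are just illustrative, not the assignment's exact code):

```python
import numpy as np

def linear_backward_with_L2(dZ, A_prev, W, alpha):
    """Gradients for one layer; L2 only adds the (alpha/m) * W term to dW."""
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T) / m + (alpha / m) * W   # extra term from L2
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db

def update_parameters(parameters, grads, learning_rate, L):
    """Plain gradient descent step -- unchanged whether or not L2 is used."""
    for l in range(L):
        parameters["W" + str(l + 1)] -= learning_rate * grads["dW" + str(l + 1)]
        parameters["b" + str(l + 1)] -= learning_rate * grads["db" + str(l + 1)]
    return parameters
```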

  • For the overfitting issue: Forgive me, you are right, I meant accuracy.
  • For the L2 regularization: So, if I want to implement L2, I should edit my cost function and my gradients in the backpropagation step:
    `dW = np.dot(dZ, A_prev.T)/m + (alpha/m) * W`
    and not in the update_parameters step.

Thank you for the answer :white_heart: