I just removed the ReLU from the output layer and used leaky_relu instead, but I got the same result. I guess it is because I did not define leaky_relu_backward and instead used relu_backward.
I copied relu_backward’s code below, but what should leaky_relu_backward be? The derivative of A = np.maximum(0.01*Z, Z)?
def relu_backward(dA, cache):
    Z = cache
    dZ = np.array(dA, copy=True)  # just converting dz to a correct object.
    dZ[Z <= 0] = 0                # gradient of ReLU is 0 wherever Z <= 0
    assert (dZ.shape == Z.shape)
    return dZ
Do you mean remove the ReLU and just use a linear function, like below? A^l = Z^l = np.dot(W^l, A^{l-1}) + b^l
I sent it to you, Raymond.
Lastly, I would like to thank all of you for your unlimited time and guidance. I am highly indebted to you.
Saif.
linear_forward A (20, 1)
linear_forward W (7, 20)
linear_forward b (7, 1)
linear_forward Z (7, 1)
The thing is, the shape of your W should be (number of features, number of neurons), which means (1, 7), although my preference is (number of neurons, number of features).
I will hold off on reading your notebook, because it seems you have something to change and check. Checking shapes can be an interesting exercise.
Btw, (number of neurons, number of features) is meant for the first dense layer. The second dense layer should have (number of neurons in this layer, number of neurons in the previous layer).
The shape of your W in the first dense layer is linear_forward W (7, 20). 20 is the number of samples. Did I misunderstand anything? Or would you like to check all the shapes in your notebook?
I just checked all the shapes in my file. They are:
print(f"shape of X is {X.shape}“)
print(f"shape of Y is {Y.shape}”)
print(f"number of neurons in hidden layer is {n_h}“)
print(f"shape of W1 is {W1.shape}”)
print(f"shape of b1 is {b1.shape}“)
print(f"shape of A1 is {A1.shape}”)
print(f"shape of W2 is {W2.shape}“)
print(f"shape of b2 is {b2.shape}”)
print(f"shape of A2 is {A2.shape}")
shape of X is (20, 1)
shape of Y is (20, 1)
number of neurons in hidden layer is 7
shape of W1 is (7, 20)
shape of b1 is (7, 1)
shape of A1 is (7, 1)
shape of W2 is (20, 7)
shape of b2 is (20, 1)
shape of A2 is (20, 1)
Oh. The number of features is 1 (the one column of X). So I need to change that. Currently I am using (number of neurons, number of samples) = (7, 20) for W1, but it should be (number of neurons, number of features) = (7, 1). Right?
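If so, my initialization would become something like this. Just a sketch of what I plan to try, assuming I also transpose X and Y to the (features, samples) layout, i.e. (1, 20), and using n_x / n_y only as names for the number of input and output features here:

import numpy as np

n_x, n_h, n_y = 1, 7, 1                  # features, hidden neurons, output neurons
W1 = np.random.randn(n_h, n_x) * 0.01    # (7, 1)
b1 = np.zeros((n_h, 1))                  # (7, 1)
W2 = np.random.randn(n_y, n_h) * 0.01    # (1, 7)
b2 = np.zeros((n_y, 1))                  # (1, 1)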
Saif, I have been following this conversation and have seen a lot of your effort. I can guess the error you are seeing, but I hope you will try googling the error message first and debug it yourself. Some say writing the code takes only 30% of the whole coding time while debugging takes 70%. I don’t want to do that major work for you.
Now you know the right thing to do, so it’s time to do it right. I know you will try it first, right?
Yes, that was what I was recommending. You can try LeakyReLU as well, but I think it’s worth trying just omitting the output activation function altogether.
If you implement LeakyReLU, here’s one way to code its derivative:
def leakyreluprime(Z, slope = 0.05):
    G = np.where(Z > 0, 1, slope)   # derivative is 1 where Z > 0, otherwise the slope
    return G
Of course that is implemented as a separate function call. If you build it “in situ” by analogy to the way relu_backward works, you’re doing two things at once:
dZ = dA * g'(Z)
But the same idea can be adapted …
One point to emphasize here is that if you just duplicate the code in relu_backward to make leaky_relu_backward, be sure to understand the importance of the way they implemented this line:
dZ = np.array(dA, copy=True)
If you “short-circuit” that by eliminating the “copy” there:
dZ = dA
that is a disaster, because you’re about to overwrite some of the values in dZ. Because of the way that parameter passing and object assignment work in Python, doing it without the copy modifies the global value of dA. See this post and this later reply on that thread.
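Putting those two points together, a leaky_relu_backward might look something like this. This is just one possible sketch, using the same 0.05 default slope as in leakyreluprime above:

def leaky_relu_backward(dA, cache, slope = 0.05):
    Z = cache
    dZ = np.array(dA, copy=True)   # the copy matters: otherwise we would overwrite dA in place
    dZ[Z <= 0] *= slope            # dZ = dA * g'(Z): scale the gradient by the slope where Z <= 0
    assert (dZ.shape == Z.shape)
    return dZ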
Hello @paulinpaloalto, @rmwkwok, @Rashmi! I hope you are doing well. I keep asking silly questions and you consistently guide me. Thank you for that.
I am omitting the ReLU activation function in the last layer. I copied many functions (from DLS Course 1, Week 3 and 4 assignments), so for simplicity I defined a new function named no_relu and then just changed the name in the other functions.
This is how I defined that:
def no_relu(Z):
    A = Z
    assert(A.shape == Z.shape)
    cache = Z
    return A, cache
Now I need to define its derivative too.
I did it like this:
def no_relu_backward(dA, cache):
    Z = cache
    dZ = np.array(dA, copy=True)  # just converting dz to a correct object.
    dZ = 1
    assert (dZ.shape == Z.shape)
    return dZ
I am just making it 1. Is that correct or not? I am getting an error at that point.
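Or, since the derivative of the identity is 1 everywhere, should dZ simply stay equal to dA? Something like this (just my guess):

def no_relu_backward(dA, cache):
    Z = cache
    dZ = np.array(dA, copy=True)  # dZ = dA * 1, i.e. dA unchanged, since the identity's derivative is 1
    assert (dZ.shape == Z.shape)
    return dZ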
It’s good to see you experimenting in different ways! Initially you tried ReLU and then LeakyReLU, where you received almost the same results. Now you have omitted the activation function in the last (output) layer. But what was your purpose in running these experiments? If it is to check binary classification, then how can you just produce 1 as an output? The output always lies between 0 and 1.
If you are doing so, then what about 0? And that makes me wonder how dZ = 1 will give you actual results.
You can expand on this further to check more of it. Thanks!