First, I am not sure whether I am supposed to post a question here if I am having trouble with one of the assignments. [I am aware of the honor code; so I am not sure whether it is appropriate to ask for help here. If not, where do I ask for help?]
I will try to rephrase the problem here so as to not make it the same as in the assignment problem.
I am trying to program this definition:
def prop(w, b, X, Y):
    """
    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)
    Return:
    cost -- negative log-likelihood cost for logistic regression
    """
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
I am trying to code the following cost function in math notation:
$$\text{cost} = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(a^{(i)}\right) + \left(1-y^{(i)}\right)\log\left(1-a^{(i)}\right)\right]$$
So here is my thinking:
"m" is a scalar. Y is a row vector. A is a row vector as well, I think, since I just took the sigmoid of the dot product of w-transpose and X and added the scalar b. Y and A are of the same size.
So, for the first half of the above, I am doing np.dot(Y,np.log(A)) to get y(i).log(a(i)).
Then for the second half I am doing np.dot((1-Y),np.log(1-A)) to get (1-y(i)).log(1-a(i)), and then I'm adding the two halves up and multiplying by -(1/m). So this is my final code for "cost":
cost = -(1/m)*(np.dot(Y,np.log(A))+np.dot((1-Y),np.log(1-A)))
But I am getting an error message. Where am I going wrong? [I have been having problems understanding the numpy codes for matrix additions, multiplications, etc.]
Welcome to Discourse @ARG, and good luck with DLS!
Can you share the error message you received? Since you have already posted your question this way, I will try to answer generally, without revealing much about the solution itself. Next time, it would be more appropriate to avoid posting your solution here. In this particular case, I think you could have written your question in a way that did not reveal a possible answer to the assignment question.
Anyway, as a general suggestion, I would check the dimensions of the tensors going into the dot product: both their shapes and the order in which they are passed.
I’m just guessing here, but are you getting an error about an array dimension mismatch in the cost calculation? If yes, then look at your cost formula again.
You probably remember that the dot product requires the number of columns in the first matrix to match the number of rows in the second; the result has as many rows as the first matrix and as many columns as the second: R(m,n) dot S(n,p) = Q(m,p).
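To make the shape rule concrete, here is a small sketch with made-up arrays. The names R, S, Q and the sizes are only for illustration and are not taken from the assignment:

```python
import numpy as np

R = np.ones((2, 3))   # 2 rows, 3 columns
S = np.ones((3, 4))   # 3 rows, 4 columns

Q = np.dot(R, S)      # inner dimensions (3 and 3) match
print(Q.shape)        # (2, 4)

# Swapping the order breaks the rule: S has 4 columns but R has 2 rows.
try:
    np.dot(S, R)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)
```

Printing `.shape` on each operand before calling `np.dot` is usually the quickest way to see which side of the rule is being violated.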
Since A and Y have the same dimensions, you need to change the shape of one of them to satisfy the dot product requirement. Hint: the result should be a single number, not a matrix.
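As a generic illustration of that hint (not the assignment code), the sketch below uses two hypothetical row vectors u and v of identical shape. It assumes nothing beyond plain NumPy:

```python
import numpy as np

u = np.array([[1.0, 2.0, 3.0]])   # shape (1, 3)
v = np.array([[4.0, 5.0, 6.0]])   # shape (1, 3)

# (1, 3) dot (1, 3): 3 columns on the left vs 1 row on the right.
try:
    np.dot(u, v)
    same_shape_failed = False
except ValueError:
    same_shape_failed = True

# Transposing one operand makes the inner dimensions agree.
inner = np.dot(u, v.T)            # (1, 3) dot (3, 1) -> shape (1, 1)
value = float(np.squeeze(inner))  # 1*4 + 2*5 + 3*6 = 32.0
print(inner.shape, value)
```

Note that the result is still a (1, 1) array rather than a bare number, which is why the sketch squeezes it before converting to a float.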
Thanks @vjmalkoti Vijay! This was very helpful.
I have now reached the end of the “Logistic Regression with a Neural Network Mindset” assignment, but I have a weird problem that other people have also mentioned, and I am not sure anybody got to the bottom of it.
The information cell below the “Logistic_Regression_Model” says “Training accuracy is close to 100%. This is a good sanity check: your model is working and has high enough capacity to fit the training data. Test accuracy is 70%. It is actually not bad for this simple model, given the small dataset we used and that logistic regression is a linear classifier. But no worries, you’ll build an even better classifier next week!” But I am getting a Training Accuracy of 68.42%, and test accuracy of 34%: exactly what some other people are getting on Discourse. [See for example, post by @soumdtt ] So, either we’re all making the same mistake, or there’s some issue.
Also, the “Plot Learning Curve” is not working as set up. I don’t believe I was supposed to do anything here, I was just supposed to run it… I think. The graph template is showing, but there’s no data on the graph.
I am sure many of my issues are due to my inexperience with Python. I have used some lower-level languages in the past, and I am still struggling with this. I just ordered the O’Reilly book “Python for Data Analysis” by Wes McKinney for a crash course.
As you saw in my earlier post, I managed to figure that out.
I do, however, still have a problem understanding what/how I can ask a question when I am stuck and need help. Since my earlier “solution” was obviously incorrect, I thought it would be okay to walk through my thinking: I did not think I was giving the solution away because I obviously did not know the solution. But I see your point that I may be providing clues to others. But how do I seek help in that case? Is there a way to do a “private message” or something like that, so I reach the mentors but not publicly in some cases?
That indeed is strange. I was able to get train accuracy=99.04306220095694% and test accuracy=70.0 %. You need to debug to pinpoint the cause of the issue in your code. Do the test outputs in all other intermediate steps match with expected results?
When you need help, you can walk people through your approach by 1) describing your steps instead of writing the actual code, or 2) writing sample lines of code when a description would be too verbose, as you see in the assignment notebooks. I think you can also add debugging print statements to your code and post the output to give us a glimpse of the model state.
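For example, a debugging print along these lines shows the state of an array without exposing any solution code. The helper name `describe` and the stand-in data are hypothetical, just one possible way to do it:

```python
import numpy as np

def describe(name, arr):
    """Hypothetical helper: summarize an array without printing its contents."""
    arr = np.asarray(arr)
    summary = (f"{name}: shape={arr.shape}, dtype={arr.dtype}, "
               f"min={arr.min():.4f}, max={arr.max():.4f}")
    print(summary)
    return summary

# Stand-in data; in a notebook you would pass your own intermediate arrays.
A = np.array([[0.1, 0.8, 0.5]])
line = describe("A", A)
```

Posting a few lines of output like this for each intermediate variable is usually enough for someone to spot a shape or range problem.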