I have a question concerning the computation of the accuracy of a NN. How exactly is it computed and implemented in python? I believe that it has never been mentioned in the course.
You can find examples of accuracy computations in various notebooks. Every time a model is trained, you get accuracy scores on both the training and test data. For example, check the Logistic Regression assignment in Course 1 Week 2. The definition of accuracy is simple: the fraction of predictions that agree with the labels, although in some cases they choose to express it as a percentage. You have the labels Y and you get the output of the model A. Then you have a function, e.g. predict in the Week 2 assignment, that converts the sigmoid values of A into 0 or 1 predictions. So in python, you can do something as simple as:
acc = np.mean(Y == predict(A))
That doesn’t actually use the same function signature as the real predict function, but suffices to make the point here.
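To make that one-liner concrete, here is a minimal self-contained sketch. The `predict` below is a hypothetical stand-in that thresholds at 0.5, not the assignment's real function (which takes different arguments), and the values of `Y` and `A` are made up for illustration.

```python
import numpy as np

def predict(A, threshold=0.5):
    # Hypothetical stand-in for the assignment's predict function:
    # converts sigmoid outputs into 0/1 predictions.
    return (A > threshold).astype(int)

# Made-up example data, both of shape (1, m) with m = 5
Y = np.array([[1, 0, 1, 1, 0]])            # labels
A = np.array([[0.9, 0.2, 0.4, 0.7, 0.6]])  # sigmoid outputs of the model

acc = np.mean(Y == predict(A))  # fraction of predictions that match the labels
acc_percent = 100 * acc         # the same value expressed as a percentage

print(acc, acc_percent)
```

Here `Y == predict(A)` is a Boolean array, and `np.mean` treats True as 1 and False as 0, so the mean is exactly the fraction of correct predictions.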
Thank you for the clarification! Your explanation is very straightforward. However, in the programming assignments I cannot always follow the computational steps used to derive the accuracy, especially in the Week 3 assignment. There, the steps to compute the accuracy look similar to parts of the cost function. Why is that?
The fundamental idea is always the same as I described above, but there are lots of ways you can express that in python code. In the Week 3 case, they use dot products. Well, think about it for a second:
If Y is the labels (each either 0 or 1) as a 1 x m vector, and predict(A) is the predictions (each also either 0 or 1) as another 1 x m vector, then what happens if I compute this dot product:
Y \cdot predict(A)^T
1 x m dotted with m x 1 gives you a 1 x 1 or scalar output, which is this sum in math terms:
\displaystyle \sum_{i = 1}^{m} y_i * predict(a_i)
If either of the terms is 0, the product is zero. So that sum gives the number of cases in which the label is 1 and the prediction is also 1. So it is the number of correct predictions on cases in which the label is 1, right?
Now apply the same reasoning to the (1 - Y) case. It just requires reading the code, understanding what it is doing, and then thinking about what that means.
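If it helps to check the reasoning numerically, here is a rough sketch of the dot-product form next to the `np.mean` form. The arrays `Y` and `P` are made-up example data, with `P` standing in for the 0/1 output of predict(A); this is an illustration of the idea, not the notebook's exact code.

```python
import numpy as np

# Made-up example data, both of shape (1, m) with m = 5
Y = np.array([[1, 0, 1, 1, 0]])  # labels
P = np.array([[1, 0, 0, 1, 1]])  # 0/1 predictions, standing in for predict(A)
m = Y.shape[1]

# Counts the cases where the label is 1 and the prediction is 1
correct_ones = np.dot(Y, P.T)

# Counts the cases where the label is 0 and the prediction is 0
correct_zeros = np.dot(1 - Y, (1 - P).T)

# Each dot product is a 1 x 1 array, so .item() extracts the scalar
acc_dot = ((correct_ones + correct_zeros) / m).item()

# The direct comparison gives the same fraction
acc_mean = np.mean(Y == P)

print(acc_dot, acc_mean)
```

The two dot products together count every correct prediction, which is why dividing their sum by m agrees with the simpler comparison-based formula.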
Thank you a lot. Your answers are always quite helpful!