In the Course 1, Week 3 assignment, section 5.2 calculates the accuracy with what looks like a cross-entropy-style formula: accuracy = (np.dot(Y, predictions.T) + np.dot(1 - Y, 1 - predictions.T)) / float(Y.size) * 100
Where did the “np.dot(Y, predictions.T) + np.dot(1 - Y, 1 - predictions.T)” part come from? It looks just like the cross-entropy loss formula, which has the form “y*log(y_hat)”.
I thought that accuracy for binary classification comes from TP, TN, etc.: (TP + TN) / (TP + TN + FP + FN), not from a cross-entropy derivation. Does anybody know why the course uses this formula?
Okay, then how is the accuracy derived from that formula with np.dot(Y, predict) and np.dot(1 - Y, 1 - predict)? Where did this come from? Does it appear anywhere in the theory/notes? I cannot find anything related to it.
It’s simple: accuracy is the percentage of correct predictions on a given batch of inputs. So that formula has nothing to do with cross entropy loss, even though it may superficially resemble it (notice there are no logarithms there).
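To make that concrete, here is a tiny sketch with made-up toy arrays (the Y and predictions values below are purely illustrative, not from the assignment):

```python
import numpy as np

Y = np.array([[1, 0, 1, 1, 0]])            # toy labels, row vector as in the assignment
predictions = np.array([[1, 0, 0, 1, 1]])  # toy 0/1 predictions

# 3 of the 5 predictions match the labels, so the accuracy is 60%
accuracy = np.mean(predictions == Y) * 100
print(accuracy)  # 60.0
```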
If Y is a vector of labels (0 or 1) and \hat{Y} are the predictions based on the output of the model (also 0 or 1), then think about what this dot product will give you:
Y \cdot \hat{Y}^T
If Y_i = 1 and \hat{Y}_i = 1, then that product will be one; otherwise it will be zero, right? So the dot product adds those up, and you get the number of cases in which the label is 1 and the prediction was also correct (1), right?
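On the same toy arrays from the sketch above, that first term counts the correctly predicted 1s:

```python
import numpy as np

Y = np.array([[1, 0, 1, 1, 0]])            # toy labels
predictions = np.array([[1, 0, 0, 1, 1]])  # toy 0/1 predictions

# Each product Y_i * predictions_i is 1 only when both are 1; the dot product sums them up
true_positives = np.dot(Y, predictions.T)
print(true_positives)  # [[2]] -- positions 0 and 3 are correctly predicted 1s
```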
Now apply that same reasoning to the other term, (1 - Y) \cdot (1 - \hat{Y})^T, and it should all make sense.
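Putting both terms together reproduces the assignment's formula; on the toy arrays above it gives the same 60% as the direct count:

```python
import numpy as np

Y = np.array([[1, 0, 1, 1, 0]])            # toy labels
predictions = np.array([[1, 0, 0, 1, 1]])  # toy 0/1 predictions

# correctly predicted 1s plus correctly predicted 0s, as a (1, 1) array
correct = np.dot(Y, predictions.T) + np.dot(1 - Y, 1 - predictions.T)

# divide by the number of examples and convert to a percentage
accuracy = correct.item() / Y.size * 100   # .item() plays the role of float(...) in the assignment
print(accuracy)  # 60.0
```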
The quantities you show (TP, TN, FP, and FN) are also combined in other ways to assess a model's results: they give you the “precision”, the “recall”, and the “F score”. Here is the Wikipedia page about that. Accuracy is a much simpler and more straightforward metric, and in fact your (TP + TN) / (TP + TN + FP + FN) formula computes exactly the same thing: the two dot products count TP and TN, and Y.size is the total.
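For completeness, here is a hedged sketch (same made-up toy arrays as above) of how those four counts relate to accuracy, precision, recall, and F1:

```python
import numpy as np

Y = np.array([[1, 0, 1, 1, 0]])            # toy labels
predictions = np.array([[1, 0, 0, 1, 1]])  # toy 0/1 predictions

TP = int(np.sum((Y == 1) & (predictions == 1)))  # correctly predicted 1s
TN = int(np.sum((Y == 0) & (predictions == 0)))  # correctly predicted 0s
FP = int(np.sum((Y == 0) & (predictions == 1)))  # 0s predicted as 1
FN = int(np.sum((Y == 1) & (predictions == 0)))  # 1s predicted as 0

accuracy  = (TP + TN) / (TP + TN + FP + FN)       # same number as the dot-product formula
precision = TP / (TP + FP)                        # fraction of predicted 1s that were right
recall    = TP / (TP + FN)                        # fraction of actual 1s that were found
f1        = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```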