What should the denominator be when computing accuracy?

I understand that we use the mask to get rid of pad tokens and keep only the actual predictions. So the number of correct predictions should be np.sum(outputs * mask == labels * mask). But what should the total number of predictions be? I tried np.sum(mask), but got an accuracy larger than 100%, which is clearly wrong.
Printing the counts shows that my number of predictions is smaller than my number of correct predictions, so np.sum(mask) seemed to be the wrong denominator. But why? What is the correct one then?

My code, FYI:

    mask = (labels != pad)                               # 1 where the token is real, 0 where it is padding
    n_correct = np.sum(outputs * mask == labels * mask)  # intended: number of correct predictions
    n_prediction = np.sum(mask)                          # number of non-pad positions
    print("no. of correct predictions:", n_correct)
    print("total actual predictions:", n_prediction)
    accuracy = n_correct / n_prediction

The problem is not with the denominator: it is with the numerator. Notice that, the way you wrote it, you get a True value at every position where mask is 0, because both outputs * mask and labels * mask are 0 there, and 0 == 0 is True, right? So every pad position gets counted as a correct prediction, which is why the numerator can end up larger than np.sum(mask).
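To see it concretely, here is a toy example (the values are purely illustrative, not taken from your code):

    import numpy as np

    pad = 0
    labels  = np.array([3, 5, 0, 0])   # last two positions are padding
    outputs = np.array([3, 7, 2, 9])   # only the first prediction actually matches

    mask = (labels != pad)             # [True, True, False, False]

    # Buggy numerator: the two pad positions compare 0 == 0 and count as "correct"
    n_correct = np.sum(outputs * mask == labels * mask)
    print(n_correct)                   # 3  (1 real match + 2 pad positions)

    n_prediction = np.sum(mask)
    print(n_prediction)                # 2, so the "accuracy" comes out as 3/2 = 150%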

Oh, I got it. Thanks for the explanation. So I guess I need something like sum(outputs == labels), but only at the positions where mask == 1.
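For example, something along these lines (same variable names as in my code above) should do it:

    # Count a prediction as correct only where the label is not padding
    n_correct = np.sum((outputs == labels) * mask)
    n_prediction = np.sum(mask)
    accuracy = n_correct / n_prediction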

Yeah, I got it right this time, thanks @paulinpaloalto.