Clarification about retrieval tradeoff


I am having trouble understanding the second bullet point.
What exactly does $$p(y^{(i,j)}) = 1$$ mean? The instructor says: "estimated probability that $$y^{(i,j)}$$ is equal to one according to your neural network model"

Isn’t the notation a bit incorrect?
The estimated probability (the dot product of V_u and V_m) shouldn’t be represented as y^(i,j), because y^(i,j) denotes the true labels, not the predicted ones (as is apparent from the cost function shown in the previous video). So something like f(x_u, x_m) or f(V_u, V_m) should be used, right? (I’ll call it "prediction" for now.)

Also, it should be p(prediction = 1), not p(prediction) = 1, right? Because the instructor says, “estimated probability that y^(i,j) is equal to one according to your neural network model.”

Or does it mean something else? I am having difficulty understanding the second bullet point, especially the part in the brackets.

Thank you

In general, y-hat is the prediction, and y is the label.
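
As a sketch of how that convention would apply here: assuming the binary-label model from the lecture passes the dot product through a sigmoid $$g$$ (this is my assumption, and the symbols $$g$$, $$v_u^{(j)}$$ and $$v_m^{(i)}$$ are not taken from the bullet point itself), the quantity the instructor describes would more conventionally be written as

$$\hat{y}^{(i,j)} = g\left(v_u^{(j)} \cdot v_m^{(i)}\right) \approx P\left(y^{(i,j)} = 1\right)$$

Read this way, the "= 1" belongs inside the parentheses: it is the estimated probability that the label equals one, not a probability that is itself equal to one.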