mean_reduce(max(logits, 0) - logits * labels + log(1 + exp(-abs(logits))), axis=-1)
Is this the same as L = -y*log(a_L) + (1-y)*log(1-a_L)?
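One way to check this is numerically. The sketch below assumes a_L = sigmoid(logits) and y = labels, which the post does not state explicitly, and it compares against the standard binary cross-entropy -(y*log(a_L) + (1-y)*log(1-a_L)), where the minus sign applies to both terms. The function names are made up for illustration, and plain Python lists stand in for tensors.

```python
import math

def stable_bce_from_logits(logits, labels):
    """Mean of max(x, 0) - x*y + log(1 + exp(-|x|)) over a 1-D list."""
    terms = [
        max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
        for x, y in zip(logits, labels)
    ]
    return sum(terms) / len(terms)

def naive_bce(logits, labels):
    """Mean of -(y*log(a) + (1-y)*log(1-a)), assuming a = sigmoid(x)."""
    terms = []
    for x, y in zip(logits, labels):
        a = 1.0 / (1.0 + math.exp(-x))  # assumed activation: sigmoid
        terms.append(-(y * math.log(a) + (1.0 - y) * math.log(1.0 - a)))
    return sum(terms) / len(terms)

# Hypothetical sample values, chosen only for the comparison.
logits = [-3.0, -0.5, 0.0, 2.0]
labels = [0.0, 1.0, 1.0, 0.0]

stable = stable_bce_from_logits(logits, labels)
naive = naive_bce(logits, labels)
print(abs(stable - naive) < 1e-9)
```

Under those assumptions the two agree up to floating-point error. The first form is an algebraic rearrangement that never takes log of a value that has rounded to 0, so it stays finite for large |logits|, where the naive form can fail.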