Week 1: Cost function for LM & sequence generation

The cost function for logistic regression is -y*log(y_) - (1-y)*log(1-y_).
But in the Language Model & Sequence Generation lecture (at 10:26), the cost was said to be just -y
Isn't the second term needed?

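The two forms are equivalent. The logistic regression cost is the softmax cost written out for two classes: the single output y_ is the probability of class 1, so the probability of class 0, (1-y_), needs its own term. With softmax, the label y is a one-hot vector and the cost sums -y_i*log(y_i_) over all classes i, so every class is already covered and only the true class contributes a nonzero term.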
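
To check the equivalence numerically, here is a minimal Python sketch. The function names and the example probabilities are made up for illustration and are not from the lecture. It treats the binary label as a one-hot vector [1-y, y] and compares the two-term logistic cost with the summed cross-entropy form.

```python
import math

def binary_cross_entropy(y, y_hat):
    # Two-term logistic regression cost for a single example.
    return -y * math.log(y_hat) - (1 - y) * math.log(1 - y_hat)

def categorical_cross_entropy(y_onehot, y_hat):
    # Softmax-style cost: sum of -y_i * log(y_hat_i) over all classes.
    # Classes with y_i == 0 contribute nothing, so they are skipped.
    return -sum(t * math.log(p) for t, p in zip(y_onehot, y_hat) if t > 0)

# Two-class case: the label y becomes the one-hot vector [1 - y, y]
# and the prediction y_hat becomes [1 - y_hat, y_hat].
y_hat = 0.8  # illustrative predicted probability of class 1
for y in (0, 1):
    two_term = binary_cross_entropy(y, y_hat)
    summed = categorical_cross_entropy([1 - y, y], [1 - y_hat, y_hat])
    print(y, two_term, summed)

# Vocabulary-sized case (4 hypothetical words): only the true word's
# term survives, so the cost reduces to -log of its predicted probability.
vocab_label = [0, 0, 1, 0]
vocab_pred = [0.1, 0.2, 0.6, 0.1]
vocab_cost = categorical_cross_entropy(vocab_label, vocab_pred)
print(vocab_cost, -math.log(0.6))
```

For both values of y the two printed costs agree, which is why the "second term" does not appear separately in the softmax form: it is one of the terms of the sum.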