Correct me if I’m wrong.

Your answer is:

```
−[0⋅log(0.7)+1⋅log(0.3)] = −log(0.3) ≈ 1.204
−[1⋅log(0.6)+0⋅log(0.4)] = −log(0.6) ≈ 0.511
Total Loss:
1.204+0.511 ≈ 1.715
```

However, my explanation is:

- Using the sigmoid / binary cross-entropy formulation:

```
True label 0 -> -(0*log(positive prob) + (1-0)*log(1-positive prob)) = -(0*log(0.3) + 1*log(0.7)) = -log(0.7)
True label 1 -> -(1*log(positive prob) + (1-1)*log(1-positive prob)) = -(1*log(0.6) + 0*log(0.4)) = -log(0.6)
```
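The binary cross-entropy arithmetic above can be sketched in plain Python (a minimal check, not a framework implementation; the labels and probabilities are taken from the example):

```python
import math

def binary_cross_entropy(y_true, p_pos):
    # BCE for one sample: -(y*log(p) + (1-y)*log(1-p))
    return -(y_true * math.log(p_pos) + (1 - y_true) * math.log(1 - p_pos))

# True label 0 with positive prob 0.3 -> -log(0.7)
print(round(binary_cross_entropy(0, 0.3), 3))  # 0.357
# True label 1 with positive prob 0.6 -> -log(0.6)
print(round(binary_cross_entropy(1, 0.6), 3))  # 0.511
```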

- Using the softmax / multi-class cross-entropy formulation, with either one-hot targets or `nll_loss`-style class indices:

```
True label 0 -> nll_loss with class index 0, or one-hot vector [1,0] -> pick class-0 prob: -log(0.7)
True label 1 -> nll_loss with class index 1, or one-hot vector [0,1] -> pick class-1 prob: -log(0.6)
```

I think both cross-entropy approaches should yield the same answer, -log(0.7) - log(0.6) ≈ 0.868, which is exactly what I expected.
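The two formulations can be checked side by side. A small sketch in plain Python (probabilities taken from the example; helper names are mine, not a library API):

```python
import math

def binary_ce(y_true, p_pos):
    # Sigmoid / binary formulation: -(y*log(p) + (1-y)*log(1-p))
    return -(y_true * math.log(p_pos) + (1 - y_true) * math.log(1 - p_pos))

def multiclass_ce(class_idx, probs):
    # Softmax / NLL formulation: pick the true-class probability, take -log
    return -math.log(probs[class_idx])

# Sample 1: true class 0, predicted distribution [0.7, 0.3]
# Sample 2: true class 1, predicted distribution [0.4, 0.6]
total_binary = binary_ce(0, 0.3) + binary_ce(1, 0.6)
total_multi = multiclass_ce(0, [0.7, 0.3]) + multiclass_ce(1, [0.4, 0.6])

print(round(total_binary, 3), round(total_multi, 3))  # 0.868 0.868
```

Both totals come out to -log(0.7) - log(0.6), not -log(0.3) - log(0.6) as in the quoted answer.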