Why do we use torch.zeros_like and torch.ones_like when calculating the loss?

Those are two separate issues. The advantage of ones_like and zeros_like over torch.ones and torch.zeros is that the “like” functions allocate the new tensor on the same device (and with the same dtype) as the base tensor, without any work on your part to figure out what the device assignment actually is.
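A minimal sketch of the difference (the GAN-style variable names are just illustrative):

```python
import torch

# Base tensor; in real training this might live on a GPU,
# e.g. torch.randn(4, device="cuda").
logits = torch.randn(4, dtype=torch.float64)

# ones_like/zeros_like inherit both the device and the dtype of the
# base tensor, so the targets are always compatible with `logits`
# in the loss computation.
targets_real = torch.ones_like(logits)
targets_fake = torch.zeros_like(logits)

print(targets_real.device == logits.device)  # True
print(targets_real.dtype == logits.dtype)    # True (float64 here)

# With plain torch.ones you would have to thread the device and
# dtype through by hand:
targets_manual = torch.ones(4, device=logits.device, dtype=logits.dtype)
```

If the model is later moved to a different device, the "_like" calls keep working unchanged, whereas the manual version relies on you passing the right `device` argument everywhere.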

The detach question is a completely separate issue. Please see this thread for an explanation.