Hello,
I have difficulty understanding the concept of a multi-output network. How can the network be optimized if there are two outputs, and thus two loss functions to optimize together? How is the gradient computed under the hood?
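
To make the question concrete, here is a minimal sketch of what I imagine happens. It assumes the two losses are simply added into one scalar, so the gradient on a shared weight is the sum of the per-loss gradients. The toy model, the plain sum, and all names (`w`, `a`, `b`, `grad_shared`, etc.) are my own assumptions for illustration, not taken from any particular library.

```python
# Toy two-output model with one shared weight (all names are hypothetical).
# Shared feature: h = w * x. Outputs: y1 = a * h and y2 = b * h.

def losses(w, a, b, x, t1, t2):
    """Return the two squared-error losses, one per output."""
    h = w * x
    loss1 = (a * h - t1) ** 2
    loss2 = (b * h - t2) ** 2
    return loss1, loss2

def total_loss(w, a, b, x, t1, t2):
    """Assumption: the two losses are combined by a plain sum."""
    loss1, loss2 = losses(w, a, b, x, t1, t2)
    return loss1 + loss2

def grad_shared(w, a, b, x, t1, t2):
    """Chain-rule gradients of each loss with respect to the shared weight w."""
    h = w * x
    g1 = 2 * (a * h - t1) * a * x
    g2 = 2 * (b * h - t2) * b * x
    return g1, g2

def numeric_grad(f, w, eps=1e-6):
    """Central finite difference, used only to check the analytic result."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)

w, a, b, x, t1, t2 = 0.5, 2.0, -1.0, 3.0, 1.0, 0.0
g1, g2 = grad_shared(w, a, b, x, t1, t2)
g_total = numeric_grad(lambda v: total_loss(v, a, b, x, t1, t2), w)

# The gradient of the summed loss should match the sum of the two gradients.
print(g1, g2, g1 + g2, g_total)
```

If this picture is right, a single backward pass on the summed loss would update the shared weight with contributions from both outputs, while each head's own weight (`a` or `b` here) would only receive a gradient from its own loss.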
Thanks in advance for the help