Cost Function for Multi-Task Learning Needs Explanation

Hello,

In Course 3, Week 2, Chapter: Learning from Multiple Tasks, Video: Multi-task Learning, at the 2:40 mark:
Why is the cost function averaged by dividing by m rather than by 4*m, given that there are 4 outputs for each example?
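For reference, the cost function at that point in the video looks (as far as I understand it) like this, with the inner sum running over the 4 outputs:

$$
J = \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{4}\mathcal{L}\!\left(\hat{y}_j^{(i)},\, y_j^{(i)}\right)
$$

where $\mathcal{L}$ is the usual logistic loss applied to each output separately.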

Thank you.

Hi @medbenchohra and welcome to Discourse. The cost function is averaged over the number of samples (the batch size), which is m. This is the convention, and it doesn't matter how many outputs the network has. You want to average the total error of the network (in this case, the inner sum over the 4 outputs) over all samples, so you divide by m.
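As a minimal sketch (the function name, shapes, and the epsilon term are just for illustration, not from the course code), the computation would look something like this in numpy: the inner sum over the 4 outputs is taken first, and only the outer average divides by m:

```python
import numpy as np

def multitask_cost(Y_hat, Y):
    """Y_hat, Y: arrays of shape (4, m) -- 4 binary labels per example.
    Returns the cost averaged over the m examples only."""
    m = Y.shape[1]
    eps = 1e-8  # avoid log(0)
    # Per-label logistic loss, shape (4, m)
    losses = -(Y * np.log(Y_hat + eps) + (1 - Y) * np.log(1 - Y_hat + eps))
    # Sum over the 4 outputs (inner sum), then average over the m examples
    return np.sum(losses, axis=0).mean()
```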


Hi @yanivh, thanks for your answer.

It’s much clearer now: the cost should be averaged over the number of samples.
What confused me is the inner sum over the outputs: why shouldn’t that be averaged by the number of outputs? Is it wrong to take the network’s error as the average of the output losses rather than their sum, or as some other aggregate such as the quadratic mean? And would that have any substantial effect on the performance achieved?

Thank you.