There is something I don’t understand. In exercise 8 we calculate loss 1 with one formula, and in exercise 9 we calculate loss 2 with a different formula. Hence the results of my calculations are different: in my code L1 = 1.1 and L2 = 0.43.
But the vectors given are exactly the same. Shouldn’t the result of the loss function also be exactly the same, or does the name “loss function” refer to multiple ways of calculating it, each with its own result and interpretation?
I thought THE loss function was always just one and the same function?
Or is something wrong with my calculations, and in exercises 8 and 9 it should be L1 = L2?
There are many: Euclidean distance, Mahalanobis distance, the cosine between two vectors, complex custom algebra that scales and combines diverse error components (YOLO object detection combines location, shape, and classification into a single loss value), and more.
The loss you compute and minimize depends on the objectives of the learning task.
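As a quick illustration, here is a minimal numpy sketch (the vectors are made up, not taken from any assignment) showing that the same pair of vectors produces different numbers under different “distance” style measures:

```python
import numpy as np

# Made-up vectors, purely for illustration
y = np.array([1.0, 0.0, 0.0, 1.0, 1.0])
yhat = np.array([0.9, 0.2, 0.1, 0.4, 0.9])

euclidean = np.linalg.norm(y - yhat)                                    # Euclidean distance
cosine = np.dot(y, yhat) / (np.linalg.norm(y) * np.linalg.norm(yhat))   # cosine similarity

print(euclidean, cosine)  # two different numbers for the same pair of vectors
```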
Since I am not currently enrolled in the course I can’t see the code or confirm those values. Perhaps one of the official mentors like @paulinpaloalto can do so?
Yes, those are the same results I get. (Just on general principles, you can also use the grader to check your answers.) If you look at the two formulas, they are different, which is why the answers are different. L1 is the sum of the absolute values of the differences between the elements of y and \hat{y}: L_1(\hat{y}, y) = \sum_i |y_i - \hat{y}_i|. L2 is the sum of the squares of those differences: L_2(\hat{y}, y) = \sum_i (y_i - \hat{y}_i)^2. Those will give you different answers for most inputs. As ai_curious points out, there are many different loss functions which are used in different situations, depending on the nature of the problem you’re trying to solve and what the outputs of your “model” represent.
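As a rough sketch of the difference (the input vectors below are my own example values, chosen only because they happen to reproduce the 1.1 and 0.43 you reported; they may not match the assignment exactly):

```python
import numpy as np

# Example vectors chosen to reproduce L1 = 1.1 and L2 = 0.43
y = np.array([1, 0, 0, 1, 1])
yhat = np.array([0.9, 0.2, 0.1, 0.4, 0.9])

L1 = np.sum(np.abs(y - yhat))   # sum of absolute differences -> 1.1
L2 = np.sum((y - yhat) ** 2)    # sum of squared differences  -> 0.43

print(L1, L2)
```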
This is just an exercise to introduce you to some numpy constructs. We will not be using either the L1 or L2 loss functions as presented here in any of the following material in the courses. They are mainly used for “regression” problems, where the output of the model is a continuous number of some sort (the price of a stock, the temperature, the humidity, …). Everything we will be doing here is a “classification” problem, where the answer is either “yes/no” (the picture contains a cat or does not contain a cat) or one of a number of classes we are trying to recognize (cat, dog, horse, kangaroo, elephant, wombat, car, truck, ship, airplane, …). For classification problems, “distance” based cost functions like L1 and L2 are not useful. Prof Ng will introduce you to the so-called “cross entropy” loss function (also called “log loss”), which is specifically designed for classification problems. Also note that most cost functions are the average of the loss values across the samples, but they kept it simple here in the very first assignment.
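To give a flavor of what’s coming (this is just a sketch with made-up labels and predictions, not code from the course), the binary cross entropy loss looks like this, with the cost taken as the average over the samples:

```python
import numpy as np

# Made-up labels and predicted probabilities, only to show the shape of the formula
y = np.array([1, 0, 1, 1, 0])               # true labels (cat = 1, not cat = 0)
yhat = np.array([0.9, 0.2, 0.7, 0.6, 0.1])  # predicted probabilities

loss = -(y * np.log(yhat) + (1 - y) * np.log(1 - yhat))  # per-sample log loss
cost = np.mean(loss)                                      # average over the samples

print(cost)
```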
@Lostfinger: You don’t have to use the dot product for that computation. It is just a suggestion. There are two fundamental ways to do it:
1. As a two-step process: first use np.multiply or * and then use np.sum to add up the products.
2. Use np.dot to do both the multiply and the sum in one step.
Either method is perfectly correct, but the dot product is more efficient. Why do two operations when one can do the same thing? It’s less code to write and it runs faster, so what is not to like about that?
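For concreteness, here is a small sketch with made-up vectors (not the ones from the assignment) showing that the two approaches give the same number:

```python
import numpy as np

# Made-up vectors, just for illustration
x1 = np.array([9.0, 2.0, 5.0, 0.0, 7.0])
x2 = np.array([9.0, 2.0, 5.0, 0.0, 7.0])

two_step = np.sum(np.multiply(x1, x2))  # elementwise multiply, then sum
one_step = np.dot(x1, x2)               # multiply and sum in a single call

print(two_step, one_step)  # same result either way
```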