Squared distance is also sometimes called squared difference because it compares two vectors by squaring the differences of their vector components.
In this exercise, each movie is represented by a vector. Intuitively, two movies are said to be identical if every component in their vectors is the same, whereas they are similar if the components do not deviate from each other by much.
In other words, to find out two movies’ similarity, we compute their vector components’ differences. Since we only care about their difference, the sign does not matter - for example, in a vector component, a difference of -10 is as bad as a difference of +10. To get rid of that sign, we square the differences, for example, to convert both -10 and +10 into 100. Lastly, to come up with one simple metric value that summarizes all those squared differences, we sum them up!
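In short, we do three steps: (i) take the differences; (ii) square the differences; (iii) sum the squared differences. The maths formula for the above steps is:

$$d = \sum_{i} (a_i - b_i)^2$$

where $a_i$ and $b_i$ are the i-th components of the two vectors and $d$ is the squared distance.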
These steps will be illustrated through an example.
To begin with, let’s say we have two movie vectors with many components. For simplicity, we will only show the first three components:
Subtraction is the simplest maths operation for comparing two numbers, because the closer the result is to zero, the smaller the difference.
In the above figure, we do the subtraction element-wise (or component-wise). In the exercise, we do that with the help of a for loop.
As mentioned above, to focus on just the magnitude of the differences, we want to get rid of the sign before summing them up. There is one more reason to do that: if we did not get rid of the sign first, then when we sum them up, positive values would get canceled by negative values! We want to accumulate the differences, not let them cancel each other out!
In the above figure, the first three component differences are +0.1, -1.7, and +0.5. We don’t want to just sum them up and get -1.1, whose magnitude is smaller than that of the -1.7 component alone. Instead, we want to accumulate them!
So, we square them, because then, we will have (+0.1)^2 + (-1.7)^2 + (+0.5)^2 + ... = 0.01 + 2.89 + 0.25 + .... In this way, any non-zero difference in the components is only going to make the summation result larger, as we wish!
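As a quick check of the arithmetic above, here is a minimal sketch that compares the plain sum with the sum of squares for the three example differences. The variable names are made up for illustration, and it uses Python's built-in `sum` rather than the for loop the exercise asks for.

```python
# The three example differences quoted in the text
differences = [0.1, -1.7, 0.5]

# Plain sum: positive and negative values cancel each other out
plain_sum = sum(differences)

# Sum of squares: every non-zero difference adds a positive amount
squared_sum = sum(diff ** 2 for diff in differences)

print(round(plain_sum, 2))    # -1.1
print(round(squared_sum, 2))  # 3.15
```

The results are rounded only to hide tiny floating-point noise.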
Alright, now, into the code.
This time, we are given two vectors in variables `a` and `b`. They are arrays (equivalent to vectors), and we are expected to finish this using a for loop. Remember, a for loop lets us go over the components in the vectors, where the i-th component of array `a` is represented by `a[i]`.
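To make the indexing concrete, here is a small sketch of a for loop that visits the matching components of two arrays. The values are hypothetical (the exercise supplies its own `a` and `b`), and `pairs` is just a helper list for illustration.

```python
# Hypothetical sample values; the exercise provides its own a and b
a = [1.0, 0.5, 3.0]
b = [0.9, 2.2, 2.5]

pairs = []
for i in range(len(a)):
    # a[i] and b[i] are the i-th components of each vector
    pairs.append((a[i], b[i]))
    print(i, a[i], b[i])
```

Inside the loop body, `a[i]` and `b[i]` are ordinary numbers, so any arithmetic can be done on them one component at a time.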
Step 1: component-wise subtraction
subtracted = <your code>
Step 2: component-wise squaring of `subtracted`. To square a variable `x`, we code it as `x ** 2`
squared = <your code that gets `subtracted` involved>
Step 3: sum all squared differences up. We have our accumulator variable `d`, which is assigned a value of `0` before the loop begins. All we need to do is to add each component-wise squared difference to the accumulator variable.
d = <your code that gets `squared` and `d` involved>
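If the accumulator idea is new, here is a generic sketch of the pattern on unrelated numbers. The names `values` and `total` are hypothetical and are not part of the exercise.

```python
# Hypothetical values, unrelated to the exercise data
values = [4, 9, 16]

total = 0  # the accumulator starts at 0 before the loop
for value in values:
    total = total + value  # add the current value to the running total

print(total)  # 29
```

The same shape applies in the exercise: start at zero, then add one new amount per pass through the loop.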
And it is done!
Cheers,
Raymond