At @7:30 in this video, it talks about ranking movie similarities using the following formula:
|| V(k)_m - V(i)_m ||^2
where
V(k)_m - V(i)_m results in another vector, let's say V_d, whose L2 norm (length) represents how far apart the two vectors we started with are, both in terms of magnitude and angle.
|| V_d || I assume is just notation for taking the L2 norm of V_d, which in expanded form is sqrt(V_d1^2 + V_d2^2 + ... + V_dn^2).
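For concreteness, here is a minimal numpy sketch of that expanded form; the movie feature vectors are made up for illustration:

```python
import numpy as np

# Hypothetical feature vectors for two movies (values invented for illustration)
v_k = np.array([0.9, 0.1, 0.4])  # movie k
v_i = np.array([0.8, 0.2, 0.5])  # movie i

v_d = v_k - v_i                        # the difference vector V_d
norm_vd = np.sqrt(np.sum(v_d ** 2))    # expanded form: sqrt(V_d1^2 + ... + V_dn^2)
assert np.isclose(norm_vd, np.linalg.norm(v_d))  # matches the built-in L2 norm

squared_distance = np.sum(v_d ** 2)    # || V_d ||^2, the quantity the formula ranks by
print(squared_distance)
```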
Question:
Q1: Why do we square || V_d ||? Is that because sqrt can be both positive and negative, so we eliminate the negative option by squaring?
Normally you would use the L2 norm because it's more punitive, i.e., more sensitive, than the L1 norm: even a modest difference (>1) between the two vectors produces a large penalty when squared, while if they are close to each other, with differences in the range [0…1], the penalty is actually reduced by the power of 2!
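As a quick illustration (my own toy numbers, not from the course), squaring amplifies component differences above 1 and damps those below 1 relative to the L1 penalty:

```python
import numpy as np

# Compare the L1 penalty to the squared L2 penalty for small vs. large differences.
for diff in [np.array([0.1, 0.2]), np.array([2.0, 3.0])]:
    l1 = np.sum(np.abs(diff))      # L1 penalty: sum of absolute differences
    l2_sq = np.sum(diff ** 2)      # squared L2 penalty: sum of squared differences
    print(f"diff={diff}, L1={l1:.2f}, squared L2={l2_sq:.2f}")

# diff=[0.1 0.2]: L1=0.30, squared L2=0.05  -> small differences are damped
# diff=[2. 3.]:   L1=5.00, squared L2=13.00 -> large differences are amplified
```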
@alexshagiev I will defer to @gent.spah because I have not taken this exact class here. But as I recall, SVD will basically implode your sparse matrix… and I don't recall us having to deal with any norm there.
If I recall correctly, this exercise uses the norm squared in order to avoid computing the square root. It’s a notational trick that is intended to lead you to just computing the sum of the squares of the differences.
Since the square root is monotonically increasing, it will not change the relative comparison between different movies. So computing the square root would consume some calculation resources but add no value.
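A minimal numpy sketch (with made-up feature vectors) of that point: sorting by the squared distance gives exactly the same ranking as sorting by the true distance, so the sqrt can be skipped.

```python
import numpy as np

rng = np.random.default_rng(0)
query = rng.random(4)          # feature vector of the query movie (assumed data)
movies = rng.random((5, 4))    # feature vectors of 5 candidate movies (assumed data)

sq_dists = np.sum((movies - query) ** 2, axis=1)  # sum of squared differences, no sqrt
dists = np.sqrt(sq_dists)                         # true L2 distances

# Both orderings are identical, so the sqrt adds cost without changing the ranking.
assert np.array_equal(np.argsort(sq_dists), np.argsort(dists))
```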