Implementing the cost with np.dot worked for me. But I was wondering: as far as I understand, the result of log a(i) is a scalar, so we are multiplying the one-hot vector by a scalar. Why not use regular multiplication, then? Moreover, doesn’t the symbol * represent regular multiplication in this course?
Hi @Doron_Modan,
Please have a look at the reference manual for np.dot(), specifically the detailed explanation of the parameters passed to the function.
Yes, * means elementwise or “regular” multiplication. In that particular mathematical formula, it’s just scalar multiplication. But remember that the dot product is two operations in one: you first take the elementwise product of the two vectors and then you add up the results. So if you have the inputs in vector form, using the dot product is just a more computationally efficient way to get the same result. It’s one vectorized operation instead of two (* or np.multiply followed by np.sum or the TF equivalent thereof).
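To make the "two operations in one" point concrete, here is a minimal sketch for a single training example. The values of `y` and `a` are made up for illustration, and it assumes the loss in question has the form of a sum over classes of y times log a, which may not match the assignment exactly.

```python
import numpy as np

# Hypothetical single example with 3 classes (values are illustrative only).
y = np.array([0.0, 1.0, 0.0])   # one-hot label
a = np.array([0.2, 0.7, 0.1])   # predicted probabilities, e.g. a softmax output

# Two steps: elementwise product, then add up the results.
loss_two_step = -np.sum(y * np.log(a))

# One step: np.dot on two 1-D arrays multiplies and sums in a single call.
loss_dot = -np.dot(y, np.log(a))

# Both give the same number, up to floating point rounding.
assert np.isclose(loss_two_step, loss_dot)
```

Because `y` is one-hot, only the term for the true class survives, so both versions reduce to `-np.log(0.7)` here.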
It’s the same pattern as ever here: we start with a mathematical expression. Then the question is how to express that in linear algebra operations and how to translate those into Python, NumPy, or TF code. There are frequently multiple correct ways to do that translation, and even among the correct ways there can be different performance implications.
An equation that has the “summation of a product of two vectors” is ready-made to implement using a dot product.
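As a sketch of "multiple correct translations", here is the same idea over a small batch. The layout (classes in rows, examples in columns), the names `Y` and `A`, and the averaging over `m` examples are assumptions for illustration, not necessarily the conventions of the assignment.

```python
import numpy as np

# Hypothetical batch: 3 classes (rows), m = 2 examples (columns).
Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])          # one-hot labels, one column per example
A = np.array([[0.7, 0.2],
              [0.2, 0.5],
              [0.1, 0.3]])          # predicted probabilities, columns sum to 1
m = Y.shape[1]
log_A = np.log(A)

# Translation 1: elementwise product followed by a sum over all entries.
cost_sum = -np.sum(Y * log_A) / m

# Translation 2: flatten both matrices and take a single dot product.
cost_dot = -np.dot(Y.ravel(), log_A.ravel()) / m

# Translation 3: einsum, multiplying matching entries and summing over both axes.
cost_einsum = -np.einsum("ij,ij->", Y, log_A) / m

# All three are the same "summation of a product", written differently.
assert np.isclose(cost_sum, cost_dot)
assert np.isclose(cost_sum, cost_einsum)
```

Which version is fastest depends on array sizes and the library build, so it is worth timing them on realistic shapes rather than assuming.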