This is the cost function implemented in C3_W2_Collaborative_RecSys_Assignment:
def cofi_cost_func_v(X, W, b, Y, R, lambda_):
    j = (tf.linalg.matmul(X, tf.transpose(W)) + b - Y)*R
    J = 0.5 * tf.reduce_sum(j**2) + (lambda_/2) * (tf.reduce_sum(X**2) + tf.reduce_sum(W**2))
    return J
- Why are we taking the transpose?
- If we’re taking the transpose only to match the matrix dimensions, aren’t we multiplying the wrong values (because of the transpose) just to make the dimensions fit?
Hi @Naren_babu_R,
Let’s check out the definitions for X and W.
def cofi_cost_func_v(X, W, b, Y, R, lambda_):
    """
    X (ndarray (num_movies,num_features)): matrix of item features
    W (ndarray (num_users,num_features)) : matrix of user parameters
    """
To take the dot product of a movie vector with a user vector through matrix multiplication, we need that transpose. Are you familiar with matrix multiplication?
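To make the shapes concrete, here is a minimal sketch in NumPy (not the assignment's TensorFlow code). The sizes and the all-ones values are made up purely for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
num_movies, num_users, num_features = 4, 3, 2

X = np.ones((num_movies, num_features))  # one movie vector per row
W = np.ones((num_users, num_features))   # one user vector per row

# X @ W is not defined here: the inner dimensions (2 and 3) do not match.
try:
    X @ W
    product_defined = True
except ValueError:
    product_defined = False

# W.T has shape (num_features, num_users), so the inner dimensions agree
# and the result holds one entry per (movie, user) pair.
predictions = X @ W.T

print(product_defined)    # False
print(predictions.shape)  # (4, 3)
```

The transpose does not change any values in W. It only changes whether each user vector is laid out as a row or as a column.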
Raymond
I still don’t understand. Can you please explain the use of the transpose?
In the dot product implementation we didn’t take a transpose, so why are we taking one in the vectorized implementation?
Hi @Naren_babu_R,
Follow my flow.
First, consider this matrix multiplication A = BC.
Tell me, do you agree that the element A_{ij} is the dot product of the i-th row of B and the j-th column of C?
If you are not familiar with matrix multiplication, please google it.
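If it helps, the statement about A_{ij} can be checked by hand. The snippet below is a small plain-Python sketch; the matrices B and C and the helper names `dot` and `column` are made up for illustration.

```python
# B is 2x3 and C is 3x2; the values are arbitrary illustration data.
B = [[1, 2, 3],
     [4, 5, 6]]
C = [[7, 8],
     [9, 10],
     [11, 12]]

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def column(M, j):
    """Return the j-th column of matrix M as a list."""
    return [row[j] for row in M]

# A[i][j] is the dot product of row i of B and column j of C.
A = [[dot(B[i], column(C, j)) for j in range(len(C[0]))]
     for i in range(len(B))]

print(A)  # [[58, 64], [139, 154]]
```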
Raymond
I agree, the rows of B get multiplied with the columns of C.
That is not just any multiplication. It is an operation called the dot product. I assume you agree with that too.
Alright, then, given the definitions
def cofi_cost_func_v(X, W, b, Y, R, lambda_):
    """
    X (ndarray (num_movies,num_features)): matrix of item features
    W (ndarray (num_users,num_features)) : matrix of user parameters
    """
I say X arranges movie vectors in rows, and W arranges user vectors in rows too. They are both in rows. Agree?
If so, then there is a problem. Matrix multiplication dots a row vector of the first matrix with a column vector of the second, but the vectors are arranged in rows in both X and W. What do we do?
We transpose W to rearrange its user vectors in columns. If you are not sure what transpose actually does as a mathematical operation, please google for more examples.
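Now X arranges movie vectors in rows, and W^T arranges user vectors in columns. Since the matrix multiplication XW^T dots the row vectors of X with the column vectors of W^T, element (i, j) of the result is the dot product of the i-th movie vector with the j-th user vector. No values get mixed up: we dot each movie vector with each user vector, exactly as the loop version does one pair at a time.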
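To see that the two approaches agree, here is a sketch in NumPy rather than TensorFlow. Everything in it is an assumption for illustration: the random data, the sizes, the shape of b (one bias per user), and the function names `cost_loop` and `cost_vectorized`. The regularization term is left out because it does not involve the transpose.

```python
import numpy as np

rng = np.random.default_rng(0)
num_movies, num_users, num_features = 4, 3, 2  # hypothetical sizes

X = rng.normal(size=(num_movies, num_features))  # movie vectors in rows
W = rng.normal(size=(num_users, num_features))   # user vectors in rows
b = rng.normal(size=(1, num_users))              # assumed: one bias per user
Y = rng.normal(size=(num_movies, num_users))     # made-up ratings
R = rng.integers(0, 2, size=(num_movies, num_users))  # 1 where a rating exists

def cost_loop(X, W, b, Y, R):
    """Squared-error term computed one (movie, user) pair at a time."""
    total = 0.0
    for i in range(X.shape[0]):      # movies
        for j in range(W.shape[0]):  # users
            # Plain dot product of two 1-D vectors: no transpose needed.
            pred = np.dot(W[j], X[i]) + b[0, j]
            total += R[i, j] * (pred - Y[i, j]) ** 2
    return 0.5 * total

def cost_vectorized(X, W, b, Y, R):
    """Same term, using one matrix product for all pairs at once."""
    err = (X @ W.T + b - Y) * R
    return 0.5 * np.sum(err ** 2)

# Entry (i, j) of X @ W.T is the dot product of X[i] and W[j].
assert np.isclose((X @ W.T)[1, 2], np.dot(X[1], W[2]))
# Both versions give the same cost.
assert np.isclose(cost_loop(X, W, b, Y, R), cost_vectorized(X, W, b, Y, R))
```

The loop version needs no transpose because it picks out one movie vector and one user vector at a time, and two 1-D vectors can be dotted directly. The transpose only appears once all the pairs are handled in a single matrix product.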
That’s it!
Raymond