C1_W4_Assignment - UNQ_C17 hash_value_of_vector

Thorsten_Tepper · October 8, 2023, 9:06pm

Hi,

when trying to implement the hash_value_of_vector function and running it with the example, I get a hash value that is too high:

The hash value for this vector, and the set of planes at index 0, is 1023

Expected Output

The hash value for this vector, and the set of planes at index 0, is 768

I use np.dot() to get the dot product of v and planes, then np.sign() for the sign of the dot product.
When determining “h”, I use np.all() on sign_of_dot_product. Could this be why the calculated hash value ends up being too high?
My n_planes variable and the hash increment look correct to me.

Best wishes and hope someone can help,
Thorsten

paulinpaloalto · October 8, 2023, 11:09pm

Maybe this is just a failure of imagination on my part, but I’m not sure why np.all would be helpful there. I just did a Boolean compare of the vector with 0 and got the correct answer for that part. But I guess np.where would be a more complex way to achieve the same result. Is that maybe what you meant?

Regards,
Paul

paulinpaloalto · October 8, 2023, 11:16pm

I added some print statements to the code to show the dot product:

[[ -7.72316876 -11.20125051  -1.99766313  -9.49587175  -9.03942449
   -0.43126041  -2.05516654  -5.64313064   0.89265172  16.34941659]]
<class 'numpy.ndarray'>
(10,)
shape (10,)
values [False False False False False False False False  True  True]
 The hash value for this vector, and the set of planes at index 0, is 768 and the "sign of dot product" vectors

So in my calculation only the last two entries generate non-zero values and you get:

2^8 + 2^9 = 256 + 512 = 768
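
The arithmetic above can be reproduced in a few lines. This is only a sketch of the bookkeeping, not the assignment's reference code: the name `is_positive` is made up for illustration and its values are copied from the printout above.

```python
import numpy as np

# Boolean values copied from the "values" line of the printout above
is_positive = np.array([False, False, False, False, False,
                        False, False, False, True, True])

hash_value = 0
for i, bit in enumerate(is_positive):
    # add 2^i only when entry i is True
    hash_value += (2 ** i) * int(bit)

print(hash_value)  # 768
```

Only positions 8 and 9 contribute, which gives 256 + 512.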

In fact, if I think just \epsilon harder, the only way you could get 1023 as the answer there is if all 10 of your elements are true. That’s because:

\displaystyle \sum_{i = 0}^n 2^i = 2^{n + 1} - 1
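
A quick numerical check of that identity, as a sketch in plain Python. With ten planes the index runs from 0 to 9, so n = 9 in the formula.

```python
# Check sum_{i=0}^{n} 2^i == 2^(n+1) - 1 for a few small n
for n in range(12):
    assert sum(2 ** i for i in range(n + 1)) == 2 ** (n + 1) - 1

# Ten planes, all treated as True: i runs from 0 to 9
all_true_hash = sum(2 ** i for i in range(10))
print(all_true_hash)  # 1023
```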

So how did that happen?

Thorsten_Tepper · October 9, 2023, 10:37am

Hi there,

Thanks for taking the time to respond! Your explanation makes a lot of sense.

However, it still doesn’t quite work for me. When adding print statements, I can see that we have the same dot product:

[[ -7.72316876 -11.20125051 -1.99766313 -9.49587175 -9.03942449
-0.43126041 -2.05516654 -5.64313064 0.89265172 16.34941659]]

So far, the array has the shape (1,10), which is also explained in the code comments.

When printing sign_of_dot_product, I get:

[[-1. -1. -1. -1. -1. -1. -1. -1. 1. 1.]]

So we can see that only the final two are considered positive.

However, something seems to go wrong when defining “h”. I use the procedure from the second week 4 lab, i.e. the one with the if-else logic on the same line. If I do that, it tells me:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

This is why I tried np.all() and while it does remove the error, the final hash value will be too high.
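
The error and the effect of np.all() can be reproduced in isolation. This is a minimal sketch using the sign values printed above; the inline if-else here is an assumed stand-in for the lab's pattern, not the actual notebook code.

```python
import numpy as np

# Sign values copied from the printout above, shape (1, 10)
sign_of_dot_product = np.array([[-1., -1., -1., -1., -1., -1., -1., -1., 1., 1.]])

# An inline if-else needs one truth value, but this comparison yields ten
try:
    h = 1 if sign_of_dot_product >= 0 else 0
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# np.all collapses the whole array to one Boolean; nonzero entries count as True
collapsed = np.all(sign_of_dot_product)
print(collapsed)  # True
```

Since -1 and 1 are both nonzero, the collapsed value is True no matter which signs are negative.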

Would you have any further suggestions?

Best,
Thorsten

Thorsten_Tepper · October 9, 2023, 1:34pm

Hi again,

I managed to solve it thanks to some of your suggestions. Much appreciated!

The instructions in the code comments suggest that the dimensions of the dot product should be (1, 10) and this is also what you should feed into np.sign().

### START CODE HERE ###
# for the set of planes,
# calculate the dot product between the vector and the matrix containing the planes
# remember that planes has shape (300, 10)
# The dot product will have the shape (1,10) 

# get the sign of the dot product (1,10) shaped vector

However, if I transpose and squeeze the dot product before getting the signs, resulting in (10,) instead, the final hash value will be correct.
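
The shape change can be seen on its own with np.squeeze. This is a sketch using the dot product values printed earlier in the thread; the name `flat` is made up for illustration.

```python
import numpy as np

# Dot product values copied from the printout earlier in the thread, shape (1, 10)
dot_product = np.array([[-7.72316876, -11.20125051, -1.99766313, -9.49587175,
                         -9.03942449, -0.43126041, -2.05516654, -5.64313064,
                         0.89265172, 16.34941659]])

flat = np.squeeze(dot_product)  # drops the length-1 axis

print(dot_product.shape)  # (1, 10)
print(flat.shape)         # (10,)

# With the 1-D array, flat[i] is a single number that can be tested directly
print(flat[9] > 0)        # True
```

With the (1, 10) array, indexing by `i` selects along the first axis, which has only one row, so `dot_product[i]` is not the i-th entry.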

As it turns out, I also forgot to index h with i in the loop.

Thanks for your help and have a great day!

Best,
Thorsten

paulinpaloalto · October 9, 2023, 2:29pm

Glad to hear that you found the solution. The point is that you don’t need the sign values to be -1 or 1. You need them to be 0 or 1. Then things work in the loop over h[i]. The point there is that you include the corresponding 2^i value if and only if the corresponding sign is positive.

Also you are misinterpreting the purpose of np.all. Think a bit harder there. The point is you can’t say:

if booleanVector:

Because it has multiple elements and what does that mean? But np.all and np.any return a single Boolean value depending on whether the input Boolean vector has all true elements or any true elements. How is that helpful in this particular case? And note that with your vector containing -1 and 1, all values will be considered True from the standpoint of np.all. If the input values are not actually Boolean True and False values, then it does the type coercion by saying in effect:

booleanVector = (inputVector != 0)

Meaning that only values exactly equal to 0 will convert to Boolean False.
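
A small sketch of that coercion, with made-up values in `signs` chosen only for illustration:

```python
import numpy as np

signs = np.array([-1., -1., 1., 1.])  # made-up values for illustration

print(np.all(signs))       # True: every entry is nonzero
print(np.all(signs != 0))  # True: the same test written explicitly
print(np.all(signs > 0))   # False: not every entry is positive

# An elementwise comparison keeps one 0/1 value per entry instead of a single Boolean
bits = (signs > 0).astype(int)
print(bits)                # [0 0 1 1]
```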

Regards,
Paul
