WK 1 - Lab 2 - Word Freq Dict - Squeeze()

In the lab’s function,

def build_freqs(tweets, ys):

Although there are comments, the squeeze call is still confusing; it wouldn’t have crossed my mind to use it.

yslist = np.squeeze(ys).tolist()

ys is the same as ‘labels’, so it seems it could be passed directly to .tolist().

I tested this outside of the function,

test = labels.tolist()

and the result seems the same.

Hello @Glenn_DiCostanzo!

In the notebook you mention, the vector of labels is defined in such a way that it has only 1 dimension. np.squeeze() only removes axes of length 1, so in this case it has no effect and is not really needed.
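
A quick way to check this is sketched below. The `labels` array here is a made-up stand-in for the notebook's 1-D label vector, not its actual data.

```python
import numpy as np

# Hypothetical stand-in for the notebook's 1-D label vector (values are made up).
labels = np.array([1.0, 1.0, 0.0, 0.0])

squeezed = np.squeeze(labels)

# squeeze only removes axes of length 1; a 1-D array of length 4 has none to remove.
same_shape = squeezed.shape == labels.shape
same_list = squeezed.tolist() == labels.tolist()

print(labels.shape, squeezed.shape, same_list)  # (4,) (4,) True
```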

However, it is important to mention that this function (build_freqs) is also used in other notebooks.

For example, in the notebook “Visualizing tweets and the Logistic Regression model”, the vector of labels is defined differently, as a column vector with shape (m, 1), so np.squeeze is needed. In the example below, np.squeeze is necessary to get a flat list:

>>> import numpy as np
>>> a = np.append(np.ones((3,1)), np.zeros((3,1)), axis = 0)
>>> a
array([[1.],
       [1.],
       [1.],
       [0.],
       [0.],
       [0.]])
>>> a.tolist()
[[1.0], [1.0], [1.0], [0.0], [0.0], [0.0]]
>>> np.squeeze(a).tolist()
[1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
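
As for why the nested list would be a problem inside the function, here is a rough sketch. It assumes build_freqs uses (word, label) pairs as dictionary keys, which this thread does not show, so `count_pairs` and the word list below are hypothetical stand-ins rather than the lab's code.

```python
import numpy as np

# Column vector of labels with shape (6, 1), as in the example above.
ys = np.append(np.ones((3, 1)), np.zeros((3, 1)), axis=0)


def count_pairs(words, labels_list):
    """Hypothetical stand-in for build_freqs: counts (word, label) pairs."""
    freqs = {}
    for word, y in zip(words, labels_list):
        pair = (word, y)  # the pair must be hashable to serve as a dict key
        freqs[pair] = freqs.get(pair, 0) + 1
    return freqs


words = ["happy", "happy", "great", "sad", "sad", "bad"]  # made-up tokens

# With squeeze, each label is a plain float, so each pair is hashable.
freqs = count_pairs(words, np.squeeze(ys).tolist())

# Without squeeze, each label is a one-element list such as [1.0].
# A tuple containing a list is unhashable, so the dict lookup raises TypeError.
try:
    count_pairs(words, ys.tolist())
    failed = False
except TypeError:
    failed = True
```

Under that assumption, squeezing first keeps the function working for both 1-D label vectors and (m, 1) column vectors.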

Best,
Wesley P.

Thank you. That makes sense.