C1_W1 Natural Language Processing with Classification and Vector Spaces Exercise 5

Yolande_Tra · January 17, 2025, 7:52pm

When using the process_tweet function, I got an error, even though all tests for this function passed earlier in the assignment.
print([type(tweet) for tweet in test_x[:5]])
TypeError Traceback (most recent call last)
in
----> 1 tmp_accuracy = test_logistic_regression(test_x, test_y, freqs, theta)
2 print(f"Logistic regression model's accuracy = {tmp_accuracy:.4f}")

in test_logistic_regression(test_x, test_y, freqs, theta, predict_tweet)
18 for tweet in test_x:
19 # get the label prediction for the tweet
---> 20 y_pred = predict_tweet(test_y, freqs, theta)
21
22 if y_pred > 0.5:

in predict_tweet(tweet, freqs, theta)
12
13 # extract the features of the tweet and store it into x
---> 14 x = extract_features(tweet, freqs)
15
16 # make the prediction using x and theta

in extract_features(tweet, freqs, process_tweet)
9 '''
10 # process_tweet tokenizes, stems, and removes stopwords
---> 11 word_l = process_tweet(tweet)
12
13 # 3 elements for [bias, positive, negative] counts

~/work/utils.py in process_tweet(tweet)
19 stopwords_english = stopwords.words('english')
20 # remove stock market tickers like $GE
---> 21 tweet = re.sub(r'\$\w*', '', tweet)
22 # remove old style retweet text "RT"
23 tweet = re.sub(r'^RT[\s]+', '', tweet)

/opt/conda/lib/python3.7/re.py in sub(pattern, repl, string, count, flags)
190 a callable, it's passed the Match object and must return
191 a replacement string to be used."""
--> 192 return _compile(pattern, flags).sub(repl, string, count)
193
194 def subn(pattern, repl, string, count=0, flags=0):

TypeError: cannot use a string pattern on a bytes-like object

paulinpaloalto · January 17, 2025, 8:42pm

When you are invoking predict_tweet there, you are supposed to be passing one tweet as the first argument, but you are passing an array of the labels for the tweets, which is why you get that error message.

A perfectly correct function can still throw errors if you call it incorrectly.
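
To see why the message talks about a "bytes-like object", here is a minimal sketch that reproduces the same kind of failure outside the notebook. It is not the assignment code: `strip_tickers` is a hypothetical helper that only copies the first `re.sub` step of `process_tweet`, and an `array.array` of floats stands in for the label array `test_y`, on the assumption that `test_y` (like `array.array`) exposes a raw memory buffer that `re` treats as bytes.

```python
import re
from array import array


def strip_tickers(tweet):
    """Hypothetical helper: the same first step as process_tweet,
    removing stock tickers such as $GE from a tweet."""
    return re.sub(r'\$\w*', '', tweet)


# Called with one tweet (a str), the function works as intended.
cleaned = strip_tickers("buy $GE now")
print(repr(cleaned))

# Called with an array of labels (a stand-in for test_y), it fails,
# because re sees a buffer of raw bytes rather than a string.
labels = array('d', [1.0, 0.0, 1.0])
try:
    strip_tickers(labels)
    error_message = None
except TypeError as err:
    error_message = str(err)
    print(f"TypeError: {error_message}")
```

The function itself is unchanged between the two calls; only the argument differs, which matches the point above that the bug is in the call site rather than in `process_tweet`.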

Yolande_Tra · January 17, 2025, 8:53pm

Thanks. I saw the error. I replaced test_y with just tweet, and the error is fixed.
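
For anyone landing here later, the shape of the fix can be sketched as follows. This is only an illustration under stated assumptions, not the graded solution: `toy_predict_tweet` and `predictions_for` are hypothetical names, the toy predictor just compares word counts instead of using the real features and sigmoid, and `freqs` is a small made-up dictionary keyed by (word, label) pairs.

```python
def toy_predict_tweet(tweet, freqs, theta):
    """Hypothetical stand-in for predict_tweet: returns 1.0 when the
    tweet's words have more positive than negative counts in freqs."""
    words = tweet.lower().split()
    pos = sum(freqs.get((w, 1.0), 0) for w in words)
    neg = sum(freqs.get((w, 0.0), 0) for w in words)
    return 1.0 if pos > neg else 0.0


def predictions_for(test_x, freqs, theta, predict_tweet=toy_predict_tweet):
    """Collect one thresholded prediction per tweet in test_x."""
    y_hat = []
    for tweet in test_x:
        # Pass the current tweet (a string), not the label array test_y.
        y_pred = predict_tweet(tweet, freqs, theta)
        y_hat.append(1.0 if y_pred > 0.5 else 0.0)
    return y_hat


# Made-up data for illustration only.
test_x = ["happy happy day", "sad day"]
test_y = [1.0, 0.0]
freqs = {("happy", 1.0): 5, ("sad", 0.0): 4, ("day", 1.0): 1, ("day", 0.0): 1}
theta = None  # unused by the toy predictor

y_hat = predictions_for(test_x, freqs, theta)
accuracy = sum(p == y for p, y in zip(y_hat, test_y)) / len(test_y)
print(f"accuracy = {accuracy:.4f}")
```

The labels in `test_y` are only needed after the loop, when the predictions are compared against them to compute the accuracy.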
