TypeError: expected string or bytes-like object

# UNQ_C4 GRADED FUNCTION: predict_tweet

def predict_tweet(tweet, freqs, theta):
    '''
    Input:
        tweet: a string
        freqs: a dictionary corresponding to the frequencies of each tuple (word, label)
        theta: (3,1) vector of weights
    Output:
        y_pred: the probability of a tweet being positive or negative
    '''

code snippet removed

TypeError Traceback (most recent call last)
1 # Run this cell to test your function
2 for tweet in ['I am happy', 'I am bad', 'this movie should have been great.', 'great', 'great great', 'great great great', 'great great great great']:
----> 3 print( '%s -> %f' % (tweet, predict_tweet(tweet, freqs, theta)))

in predict_tweet(tweet, freqs, theta)
16 # extract the features of the tweet and store it into x
---> 17 x = extract_features(process_tweet(freqs))
19 # make the prediction using x and theta

~/work/utils.py in process_tweet(tweet)
19 stopwords_english = stopwords.words('english')
20 # remove stock market tickers like $GE
---> 21 tweet = re.sub(r'\$\w*', '', tweet)
22 # remove old style retweet text "RT"
23 tweet = re.sub(r'^RT[\s]+', '', tweet)

/opt/conda/lib/python3.7/re.py in sub(pattern, repl, string, count, flags)
190 a callable, it's passed the Match object and must return
191 a replacement string to be used."""
--> 192 return _compile(pattern, flags).sub(repl, string, count)
194 def subn(pattern, repl, string, count=0, flags=0):

TypeError: expected string or bytes-like object


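process_tweet expects a string and returns a list. In your traceback, line 17 of predict_tweet calls process_tweet(freqs), so re.sub receives the freqs dictionary instead of a string, which is what raises the TypeError. Note that process_tweet is already called inside extract_features, so you don't need it in predict_tweet: pass the raw tweet string to extract_features.
You may have been confused by the docstring of extract_features, which states that tweet is a list of words. That is incorrect: tweet is a string. I will file an issue on the backend so that the docstring gets corrected. I hope this helps.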
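
To make the point concrete, here is a minimal sketch of the failure and the fix. It does not use the course's utils.py: process_tweet, extract_features, and sigmoid below are simplified stand-ins written for illustration, the freqs and theta values are made up, and plain Python lists replace the NumPy arrays used in the assignment. The only thing it is meant to show is which argument type goes where.

```python
import math
import re


def process_tweet(tweet):
    """Simplified stand-in for the course helper: takes a string, returns a list of tokens."""
    tweet = re.sub(r'\$\w*', '', tweet)     # remove stock market tickers like $GE
    tweet = re.sub(r'^RT[\s]+', '', tweet)  # remove old style retweet text "RT"
    return tweet.lower().split()


def extract_features(tweet, freqs):
    """Stand-in: takes the raw tweet STRING and calls process_tweet itself."""
    words = process_tweet(tweet)
    x = [1.0, 0.0, 0.0]  # bias, positive count, negative count
    for word in words:
        x[1] += freqs.get((word, 1.0), 0)
        x[2] += freqs.get((word, 0.0), 0)
    return x


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_tweet(tweet, freqs, theta):
    # Pass the string straight through; do not call process_tweet here.
    x = extract_features(tweet, freqs)
    z = sum(x_i * t_i for x_i, t_i in zip(x, theta))
    return sigmoid(z)


# Made-up values, for illustration only.
freqs = {('happy', 1.0): 3, ('bad', 0.0): 2}
theta = [0.0, 0.5, -0.5]

# Reproduces the error from the traceback: a dict is not a string.
try:
    process_tweet(freqs)
except TypeError as err:
    print('TypeError:', err)

# The corrected call path works with a plain string.
print(predict_tweet('I am happy', freqs, theta))
```

In the actual assignment, the equivalent change is on the line the traceback points to in predict_tweet: give extract_features the tweet string (and freqs) rather than the result of process_tweet.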