Error with lab C2_W1_Assignment

Xingqiang_Chen · May 17, 2023, 10:02am

I wrote this code:

(Solution code removed, as posting it publicly is against the honour code of this community, regardless of whether it is correct)

And after running this cell:

#DO NOT MODIFY THIS CELL
word_l = process_data('./data/shakespeare.txt')
vocab = set(word_l)  # this will be your new vocabulary
print(f"The first ten words in the text are: \n{word_l[0:10]}")
print(f"There are {len(vocab)} unique words in the vocabulary.")

it prints:

The first ten words in the text are: 
['o', 'for', 'a', 'muse', 'of', 'fire', 'that', 'would', 'ascend', 'the']
There are 6303 unique words in the vocabulary.

but the expected output is:

The first ten words in the text are: 
['o', 'for', 'a', 'muse', 'of', 'fire', 'that', 'would', 'ascend', 'the']
There are 6116 unique words in the vocabulary.

UPDATE:
It is solved. I just needed to use re.findall instead of everything I had written.
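
For anyone hitting a similar mismatch in the unique-word count: the removed code is not shown, so the cause here is only a guess, but a common one is a tokenizer that leaves punctuation attached to words. The sketch below is not the assignment solution. It uses a made-up sample sentence and assumes whitespace splitting as the "wrong" approach, just to show how that inflates the vocabulary compared with re.findall and the pattern \w+.

```python
import re

# Toy sentence (hypothetical, not read from the assignment's data file),
# chosen so that some words carry trailing punctuation.
sample = "O for a Muse of fire, that would ascend the brightest heaven. A muse!"

text = sample.lower()

# Whitespace splitting keeps punctuation glued to the words,
# so "muse" and "muse!" count as two different vocabulary entries.
split_tokens = text.split()

# re.findall with \w+ returns only runs of letters, digits and underscores,
# so punctuation never becomes part of a token.
regex_tokens = re.findall(r"\w+", text)

print(len(set(split_tokens)), "unique tokens with str.split")
print(len(set(regex_tokens)), "unique tokens with re.findall")
```

On this sample the split-based vocabulary is larger by one entry, because "muse!" is counted separately from "muse". The same effect on a full corpus could plausibly account for a gap such as 6303 versus 6116, though that is an assumption about the removed code.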

Mubsi · May 17, 2023, 10:19am

Hi @Xingqiang_Chen,

Did you copy and paste this code from somewhere?

Your code cell doesn’t have the same instructions as the ones in the assignment’s Ex 1:

def process_data(file_name):
    """
    Input: 
        A file_name which is found in your current directory. You just have to read it in. 
    Output: 
        words: a list containing all the words in the corpus (text file you read) in lower case. 
    """
    words = [] # return this variable correctly

    ### START CODE HERE ### 
    
    #Open the file, read its contents into a string variable
    
    # convert all letters to lower case
    
    #Convert every word to lower case and return them in a list.
    
    ### END CODE HERE ###
    
    return words

Xingqiang_Chen · May 17, 2023, 10:21am

When I get stuck somewhere, I usually delete everything and start again, and I have already solved it!
