C3W2 preprocess_dataset - dataset.map

swat6296 · December 7, 2025, 12:41pm

Hello.

All the unit tests up to the “GRADED FUNCTION: preprocess_dataset” pass, but I do not seem to understand how to use the map method, the lambda function, or the dataset structure correctly. I have looked at other posts but have not found the answer.

As I understand the task:

We receive the dataset as an argument, which consists of training texts and training labels. Our task is to use the previously created text vectorizer and label encoder to process the texts and labels respectively. The encoder and the vectorizer are already fitted by the time they are passed to this function, so we don’t need to train them.

After that we need to batch the dataset into batches of 32.

I am trying to follow the hint of using the .map method, using the following command:

    dataset = dataset.map(lambda a, b, : text_vectorizer(a), label_encoder(b))

Alternative text description in case the code is against the rules: a lambda function passed as an argument to the map method. The lambda function takes two arguments and applies text_vectorizer to the first and label_encoder to the second.

However, I get the error “name ‘b’ is not defined”, meaning that I cannot extract the labels that way.

Is one of my assumptions incorrect, or is this an incorrect way to handle the dataset?

Deepti_Prasad · December 7, 2025, 1:31pm

hi @swat6296

This comment post explains how to use the dataset and map to process your data, so please go through it. I have also described there how to correct the code. If your code matches what I described and you still get “label is not defined”, then the train_val_dataset code needs to be checked.

Let me know if you still want me to review your code.

Regards

DP

swat6296 · December 7, 2025, 1:46pm

After googling the lambda syntax I found the issue:
Python can parse the section after the colon as a series of arguments. To avoid this, the lambda body has to be wrapped in parentheses. Otherwise the lambda ends at the comma: “lambda a, b: text_vectorizer(a)” is taken as the first argument to the .map function and “label_encoder(b)” as a second argument, instead of as part of the lambda. Since b only exists inside the lambda, evaluating that second argument raises the “name ‘b’ is not defined” error.
I.e. the correct syntax for such a case would be:

    result = dataset.map(lambda a, b: (func1(a), func2(b)))
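For anyone who wants to see the parsing difference without TensorFlow, here is a minimal plain-Python sketch. The helpers count_args, upper and double are hypothetical stand-ins (not part of the assignment or of tf.data); they only illustrate how the comma is read inside a call.

```python
# Hypothetical stand-ins for text_vectorizer / label_encoder (illustration only).
def upper(text):
    return text.upper()

def double(label):
    return label * 2

def count_args(*args):
    """Report how many positional arguments the call actually received."""
    return len(args)

# With parentheses, the whole tuple is the lambda body: one argument is passed.
n_ok = count_args(lambda a, b: (upper(a), double(b)))

# Without parentheses, the lambda ends at the comma, so double(b) is a second
# argument evaluated immediately in the enclosing scope, where b does not exist.
try:
    count_args(lambda a, b: upper(a), double(b))
    message = None
except NameError as err:
    message = str(err)

# The parenthesised lambda returns a (text, label) pair for each element.
pair_fn = lambda a, b: (upper(a), double(b))
pairs = [pair_fn(text, label) for text, label in [("spam", 1), ("ham", 0)]]

print(n_ok, message, pairs)
```

The same reasoning applies to dataset.map: the function passed in has to return both processed elements as a single tuple.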
