C3W2 Assignment - fit_label_encoder

danidrimbe · April 2, 2025, 10:54am

Hello,

I have problems with Exercise 3 where we need to complete the function fit_label_encoder. The error that I get is: Slicing dataset elements is not supported for rank 0.

Maybe my line code where I defined labels is incorrect. I define labels using
the function tf.data.Dataset.from_tensor_slices() directly with the pair (train_labels,validation_labels) as input.

Can someone help me? Thank you in advance.

balaji.ambresh · April 2, 2025, 5:21pm

MapDataset supports concatenate operation. Does this hint help?

danidrimbe · April 2, 2025, 8:16pm

My understanding is that I should concatenate train_labels and validation_labels and the way to do this is via the tf.data.Dataset.from_tensor_slices() function. The error I receive is that “Slicing dataset elements is not supported for rank 0”. It is not clear to me what exactly I am doing wrong.

balaji.ambresh · April 3, 2025, 7:24am

Label encoder is supposed to adapt to labels across both training and validation sets. If training labels = ['label 1', 'label 2'] and validation labels as ['label 1', 'label 3'], we should get the encoder to learn from ['label 1', 'label 2', 'label 1', 'label 3']. Dataset.from_tensor_slices doesn’t do this. See the difference:

>>> import tensorflow as tf
>>> train_labels = ['label 1', 'label 2']
>>> val_labels = ['label 1', 'label 3']
# this creates a 2D array
>>> print(np.asarray(list(tf.data.Dataset.from_tensor_slices([train_labels, val_labels]).as_numpy_iterator())))
[[b'label 1' b'label 2']
 [b'label 1' b'label 3']]
 
# this creates a 1D array which is what we want
>>> train_dataset = tf.data.Dataset.from_tensor_slices(train_labels)
>>> val_dataset = tf.data.Dataset.from_tensor_slices(val_labels)
>>> print(list(train_dataset.concatenate(val_dataset).as_numpy_iterator()))
[b'label 1', b'label 2', b'label 1', b'label 3']

As far as the rank error is concerned, seems like you’re passing a scalar somewhere. tf.rank represents the number of dimensions of a tensor. Here are a few examples:

# scalar
>>> tf.rank(1).numpy()
0
# vector
>>> tf.rank([1]).numpy()
1
# 2D matrix
>>> tf.rank([[1]]).numpy()
2

danidrimbe · April 3, 2025, 4:57pm

I appreciate your responses, thanks a lot for these clarifications, but unfortunately I still did not complete the exercise. Moreover, I still have exactly the same error.

To summarize my work:

I use the decode_label function on the input train_labels and validation_labels
Following your suggestion, I apply tf.data.Dataset.from_tensor_slices() to train_labels and validation_labels and get train_dataset and val_dataset, respectively.
I define labels = train_dataset.concatenate(val_dataset)
I define label_encoder = tf.keras.layers.StringLookup(num_oov_indices=0) —-> I put num_oov_indices=0 in order to remove the OOV tokens.
I adapt label_encoder by using label_encoder.adapt(labels)

Could you point out to me which step is wrong? In particular, how can I pass a scalar somewhere since I still get the error that slicing dataset elements is not supported for rank 0?

balaji.ambresh · April 4, 2025, 4:55am

The steps below aren’t required since train_labels and validation_labels are already Datasets:

The reason I used Datasets.from_tensor_slices was to convert a python list to TensorSliceDataset.

danidrimbe · April 4, 2025, 6:58pm

Thanks a lot for your help, it works. Now I have the following confusion: why does the fit_label_encoder() function contain inside of it the following decode_labels() function if this is not actually non needed?

def decode_labels(label)
# Decode byte string to a UTF-8 string
label = tf.strings.unicode_decode(label, "UTF-8")
return label

# Apply the decode function to both train_labels and validation_labels
train_labels = train_labels.map(decode_labels)
validation_labels = validation_labels.map(decode_labels)

Is there an alternative solution where the decode_labels() function is actually used?

balaji.ambresh · April 5, 2025, 2:16pm

I don’t see a function decode_labels in C3W2 assignment starter code. Am I missing something?

danidrimbe · April 7, 2025, 2:55pm

I thought that the decode_labels() function was in the code from the beginning. How can I check this? Is there a way that I can have a completely new lab with no modifications that I can work on it?

balaji.ambresh · April 7, 2025, 6:21pm

Open your current notebook.
Go to “File->Rename”, and rename the notebook.
Go to the Lab Help menu (the question-mark inside a circle), and use “Get latest version”.
Go to “File->Open” menu, and open the new notebook.
Use the “Kernel->Restart & Clear All Output” command.

Now the new notebook is ready for you to use.

Note: Do not rename the new notebook. The grader always uses the notebook with the original file name.

Topic		Replies	Views
C3W2_Assignment Error Natural Language Processing in TensorFlow week-2 , ai-discussions , project	5	71	January 5, 2025
C3_W2_Assignment Natural Language Processing in TensorFlow week-2	9	67	January 8, 2025
C3W2 Assignment - preprocess_dataset Natural Language Processing in TensorFlow week-2	6	44	April 9, 2025
# GRADED FUNCTION: preprocess_dataset Natural Language Processing in TensorFlow	22	390	October 16, 2024
C3W2 Assignment Fit Label Encoder Natural Language Processing in TensorFlow week-2	2	105	October 22, 2024

C3W2 Assignment - fit_label_encoder

Related topics