C3W2_Assignment (TensorFlow course, BBC dataset) and model fitting

I hope I posted this to the correct category.

This is for the ‘TensorFlow Developer Professional Certificate’, Course 3, Week 2 assignment.

I got the functions train_val_split(), fit_tokenizer(), seq_and_pad(), and tokenize_labels() working, and all of them produce the expected output.

But I'm struggling with fitting the model.

So, I created my model like this:

Model: "sequential_19"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_19 (Embedding)    (None, 120, 16)           16000     
                                                                 
 global_average_pooling1d_10  (None, 16)               0         
  (GlobalAveragePooling1D)                                       
                                                                 
 dense_27 (Dense)            (None, 16)                272       
                                                                 
 dense_28 (Dense)            (None, 5)                 85        
                                                                 
=================================================================
Total params: 16,357
Trainable params: 16,357
Non-trainable params: 0
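
In code, the model is roughly this (activations as in the course examples; vocabulary size 1000 and max length 120 come from my earlier steps, and the parameter counts above match):

import tensorflow as tf

model = tf.keras.Sequential([
    # 1000-word vocab x 16-dim embedding = 16,000 params; sequences padded to 120
    tf.keras.layers.Embedding(1000, 16, input_length=120),
    tf.keras.layers.GlobalAveragePooling1D(),       # averages over the 120 steps -> (None, 16)
    tf.keras.layers.Dense(16, activation='relu'),   # 16*16 + 16 = 272 params
    tf.keras.layers.Dense(5, activation='softmax')  # 16*5 + 5 = 85 params, one unit per BBC category
])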

but it fails, seemingly because the layers after the embedding do not get the data in the correct dimensions:
ValueError: Shapes (None, 1) and (None, 5) are incompatible

I noticed that the embedding layer does not produce the 16-dimensional output, no matter what I do.

I verified that my variables and input arguments are the correct size and correct type:

train_padded_seq: 1780; <class 'numpy.ndarray'>
train_label_seq: 1780; <class 'numpy.ndarray'>
val_padded_seq: 445; <class 'numpy.ndarray'>
val_label_seq: 445; <class 'numpy.ndarray'>

What am I doing wrong?

The funny thing is, when I change the last Dense layer to have only one neuron, training succeeds, but of course I get garbage at the output.

Hi ha5dzs,

I had exactly the same issue with my assignment, but I was able to fix it by re-checking and re-thinking my choice of loss function. Think about the difference between ‘binary’ and ‘categorical’ and relate it to the error message you are getting 🙂

regards,
Michael

Hi,

Well, that’s what I thought at first, but no: my optimizer, loss function, and output activation all match categorical data.

For a laugh, I submitted the assignment with the non-working network and still got 80%, so I carried on with the next week.

I thought I had a similar issue with C3W3 as well, but that turned out to be something I had forgotten to do with the labels.

These error messages are very cryptic.

Mind the difference between sparse_categorical_crossentropy and categorical_crossentropy.

I think the loss function should be sparse_categorical_crossentropy and not categorical_crossentropy, since we don't use one-hot encoding here.
Please refer to the link below for where to use sparse categorical crossentropy and where to use categorical crossentropy loss.
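
To make it concrete, here is a tiny sketch of the difference (made-up numbers; five classes like the BBC dataset):

import numpy as np
import tensorflow as tf

int_labels = np.array([0, 2, 4])                                          # integer labels, shape (3,)
onehot_labels = tf.keras.utils.to_categorical(int_labels, num_classes=5)  # one-hot, shape (3, 5)

# Pretend softmax outputs for 3 samples, shape (3, 5)
preds = np.array([[0.70, 0.10, 0.10, 0.05, 0.05],
                  [0.10, 0.10, 0.60, 0.10, 0.10],
                  [0.05, 0.05, 0.10, 0.10, 0.70]])

# sparse_categorical_crossentropy takes the integer labels directly...
print(tf.keras.losses.sparse_categorical_crossentropy(int_labels, preds).numpy())
# ...while categorical_crossentropy needs the one-hot labels. Same values either way.
print(tf.keras.losses.categorical_crossentropy(onehot_labels, preds).numpy())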

Hope this helps

Thanks and Regards,

Mayank Ghogale


Yep, this did the trick. When I set my loss function to 'sparse_categorical_crossentropy', everything suddenly works.
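
For anyone finding this later, my compile call now looks like this (‘adam’ is just the optimizer I happened to use):

model.compile(
    loss='sparse_categorical_crossentropy',  # integer labels 0..4, no one-hot encoding needed
    optimizer='adam',
    metrics=['accuracy']
)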

It would be nice to have a development environment that shows what the options are. Oh well, thanks for this!


I am glad it worked for you :)

I’m having the same issue even though I’m using sparse_categorical_crossentropy. Any other suggestions?

ValueError: Data cardinality is ambiguous:
x sizes: 1780
y sizes: 445
Make sure all arrays contain the same number of samples.

Sir, your x and y arrays are not the same size; that is, you have fewer labels (445) than samples (1780). The usual cause is accidentally pairing the training sequences with the validation labels in the fit() call.
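A sketch of what the fit() call should look like, with the variable names from earlier in this thread (the epoch count is just an example):

history = model.fit(
    train_padded_seq, train_label_seq,                # both 1780 samples
    validation_data=(val_padded_seq, val_label_seq),  # both 445 samples
    epochs=30                                         # example value
)
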
Can you send me your code as a PDF via PM, by clicking on my name?
Thanks

Thank you @MayankGhogale! I ran into the same error, replaced ‘categorical_crossentropy’ with ‘sparse_categorical_crossentropy’, and suddenly it worked. 👍

Lesson learned: use ‘sparse_categorical_crossentropy’ and avoid one-hot encoding (which is simple, but breaks the flow for submission).

Glad it worked for you sir…

I understand the difference between categorical_crossentropy and sparse_categorical_crossentropy, so I knew why I got the (None, 1) and (None, 5) error when I used categorical_crossentropy. But correctly using sparse_categorical_crossentropy got me another huge and very cryptic error about other sizes not matching up. I scoured my notebook and realized that the test cases are not exhaustive enough to flag that I had implemented a previous function incorrectly: I did not use all of the given function parameters when instantiating the Tokenizer. This oversight did not manifest until I tried to fit the model.
If you're having errors in spite of using sparse_categorical_crossentropy, go through your previous function implementations in the notebook and be very careful to use all of the given parameters (see the sketch below).
Or maybe I’m the only one who made this mistake…
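
For example, fit_tokenizer() should pass every given parameter through to the Tokenizer, something like this (the parameter names are from my notebook and may differ in yours):

from tensorflow.keras.preprocessing.text import Tokenizer

def fit_tokenizer(train_sentences, num_words, oov_token):
    # Use ALL of the given parameters, not just the sentences
    tokenizer = Tokenizer(num_words=num_words, oov_token=oov_token)
    tokenizer.fit_on_texts(train_sentences)
    return tokenizer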

This is very frustrating, because this loss function, ‘SparseCategoricalCrossentropy’, was never presented to us; it was not even mentioned in the lectures.
I thought that the labels fed to the network had to be in the same format as the softmax output layer.
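
Apparently the ‘sparse’ variant is exactly what lifts that requirement: the integer labels are compared straight against the softmax probabilities. A quick check (shapes follow the model summary above):

import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
y_true = tf.constant([3])                          # a single integer label, shape (1,)
y_pred = tf.constant([[0.1, 0.1, 0.1, 0.6, 0.1]])  # softmax output, shape (1, 5)
print(loss_fn(y_true, y_pred).numpy())             # ~0.51, i.e. -log(0.6)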
