Anyone else managed to train the model locally from W1's assignment?

Hi!

I've completed the assignment with full marks, migrated most of it to TF 2.6, and wanted to test the training locally on my own GPU (the optional task).

I can't share the graded code for obvious reasons, but I wanted to check: has anyone else tried this?

I'm not an expert at model structuring, but when I ran model.fit I got this error:

    TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int64 of argument 'x'.

Here are the code snippets:

from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

x = base_model.output

# add a global spatial average pooling layer
x = GlobalAveragePooling2D()(x)

# and a logistic layer: one sigmoid output per class (multi-label)
predictions = Dense(len(classes), activation="sigmoid")(x)

model = Model(inputs=base_model.input, outputs=predictions)
# compile with the assignment's custom weighted loss (graded code, not shown)
model.compile(optimizer='adam', loss=get_weighted_loss(weights_pos, weights_neg), metrics=['accuracy'])

history = model.fit(
    train_generator,
    validation_data=valid_generator,
    steps_per_epoch=100,
    validation_steps=25,
    epochs=5)

import matplotlib.pyplot as plt

plt.plot(history.history['loss'])
plt.ylabel("loss")
plt.xlabel("epoch")
plt.title("Training Loss Curve")
plt.show()

Use model.fit_generator() instead of model.fit(). model.fit() is used when no dataset generator is needed.

model.fit_generator() has been deprecated since TF 2.1.x

https://www.tensorflow.org/versions/r2.1/api_docs/python/tf/keras/Model#fit

I'm using 2.6, as clarified above, after installing Lambda Stack.

That said, I'd like to stick with the 2.1.x documentation as much as I can, because that's also what I use at work.

I just realised that the custom_weighted_loss function the assignment uses is very hacky indeed. Is iterating through each class and calculating the individual K.mean values really the proper method?

Typically, to use a custom loss function in TF 2.x, it needs to return a tensor of shape (batch_size,) rather than a scalar loss.
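
What I mean is something like this (an illustrative sketch, not the assignment's loss):

from tensorflow.keras import backend as K

# return one loss value per sample and let Keras handle the reduction
def per_sample_mse(y_true, y_pred):
    y_true = K.cast(y_true, y_pred.dtype)              # guard against integer labels
    return K.mean(K.square(y_pred - y_true), axis=-1)  # shape: (batch_size,)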

If you could provide some guidance on how to write a custom weighted loss function for a multi-class model, I would be deeply appreciative!

The issue is that a data type doesn't match the data type of x. It's always good advice to use float32 as the data type, but I can see that your data also contains some integer values. Try typecasting those integers to float32. Does your y have values like 1, 2, 3, 4, … instead of 1., 2., 3., 4., …? If yes, typecast these values to float32.
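
For example (a sketch, assuming your train_generator is a Keras image iterator that yields (x, y) batches):

import numpy as np

x_batch, y_batch = next(train_generator)
print(y_batch.dtype)                  # int64 here would explain the Mul error

y_batch = y_batch.astype(np.float32)  # cast integer labels to float32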

I think the issue is the return type of loss functions in TF 1.x.

I looked around the model layers for an int64 type and reckoned it comes from the batch_size, although I'm not entirely sure how to confirm this.
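
I poked around with something like this (not sure it's the right way to confirm it):

print(model.input.dtype)   # dtype of the model input
print(model.output.dtype)  # dtype of the predictions (y_pred)

for layer in model.layers:
    print(layer.name, layer.dtype)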

I ran the model without the custom weighted categorical cross-entropy function, using loss='categorical_crossentropy', and got an ascending training loss graph.

I then defined my own custom loss function (note: this returns a vector of shape (batch_size,), e.g. [0.1, 0.18, 0.21] for a batch size of 3), which is how it's done in TF 2.x:

from tensorflow.keras import backend as K

def get_weighted_categorical_crossentropy(weights):
    def wcce(y_true, y_pred):
        Kweights = K.constant(weights)
        # cast the labels to the prediction dtype to avoid the int64/float32 Mul error
        y_true = K.cast(y_true, y_pred.dtype)
        # per-sample cross-entropy, scaled by the weight of each sample's true class
        return K.categorical_crossentropy(y_true, y_pred) * K.sum(y_true * Kweights, axis=-1)
    return wcce

Do let me know if publishing this code snippet contravenes Coursera's honour code. I hope it doesn't, since the original answer returns a scalar loss value. I would love feedback on how to flatten the original solution into a function that returns an ndarray of (batch_size,) loss values instead of a scalar loss.
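
Something like this is the shape of what I'm after (a sketch of the usual positive/negative-weighted binary cross-entropy, emphatically not the graded code):

from tensorflow.keras import backend as K

def get_weighted_loss_per_sample(w_pos, w_neg):
    # w_pos / w_neg: per-class weight arrays, shape (num_classes,)
    w_pos = K.constant(w_pos)
    w_neg = K.constant(w_neg)

    def loss(y_true, y_pred):
        y_true = K.cast(y_true, y_pred.dtype)
        y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())  # numerical stability
        per_class = -(w_pos * y_true * K.log(y_pred)
                      + w_neg * (1.0 - y_true) * K.log(1.0 - y_pred))
        return K.mean(per_class, axis=-1)  # one loss value per sample: (batch_size,)

    return loss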

Even with the wcce loss above, I still got an ascending training loss graph.

The model training parameters are almost exactly the same as in the assignment, save for reducing the batch_size to 16 to fit my GPU's VRAM (10 GB).


The int64 data type is not your batch_size. I think it might be your y_true or y_pred. Can you post the data types of y_pred, y_true, and the input to your model?
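
One way to get those is to wrap your loss so it reports the incoming dtypes (a sketch; debug_loss is a hypothetical helper):

def debug_loss(inner_loss):
    def loss(y_true, y_pred):
        # dtypes are static, so a plain print at trace time is enough
        print("y_true:", y_true.dtype, "y_pred:", y_pred.dtype)
        return inner_loss(y_true, y_pred)
    return loss

model.compile(optimizer='adam',
              loss=debug_loss(get_weighted_categorical_crossentropy(weights)),
              metrics=['accuracy'])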

Add a negative sign to the returned K.categorical_crossentropy(y_true, y_pred) * K.sum(y_true * Kweights, axis=-1). That should make the loss decrease.