Anyone else managed to train the model locally from W1's assignment?

Hi!

I've completed the assignment with full marks, migrated most of it to TF 2.6, and wanted to test the training locally on my own GPU (the optional task).

I can't share the graded code for obvious reasons, but I wanted to check: has anyone else tried this?

I'm not an expert at model structuring, but when I ran model.fit I got this error:

    TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int64 of argument 'x'.

Here are the code snippets:

from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

x = base_model.output

# add a global spatial average pooling layer
x = GlobalAveragePooling2D()(x)

# and a logistic layer: one sigmoid output per class (multi-label)
predictions = Dense(len(classes), activation="sigmoid")(x)

model = Model(inputs=base_model.input, outputs=predictions)
# compile with the assignment's custom weighted loss (graded code, not shown)
model.compile(optimizer='adam', loss=get_weighted_loss(weights_pos, weights_neg), metrics=['accuracy'])

history = model.fit(
    train_generator,
    validation_data=valid_generator,
    steps_per_epoch=100,
    validation_steps=25,
    epochs=5)

import matplotlib.pyplot as plt

plt.plot(history.history['loss'])
plt.ylabel("loss")
plt.xlabel("epoch")
plt.title("Training Loss Curve")
plt.show()

Use model.fit_generator() instead of model.fit(). model.fit() is used when no dataset generator is needed.

model.fit_generator() has been deprecated since TF 2.1.x

https://www.tensorflow.org/versions/r2.1/api_docs/python/tf/keras/Model#fit

I'm using 2.6, as clarified above, after installing Lambda Stack.

That said, I'd like to stick with the 2.1.x documentation as much as I can, because that's also what I use at work.

I just realised that the custom_weighted_loss function the assignment uses is very hacky indeed. Is iterating through each class and calculating the individual K.mean values really the proper method?

Typically, to use a custom loss function in TF 2.x, it needs to return a tensor of shape (batch_size,) rather than a scalar loss.
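
What I mean is something like this (an illustrative sketch, not the assignment's loss):

from tensorflow.keras import backend as K

# return one loss value per sample and let Keras handle the reduction
def per_sample_mse(y_true, y_pred):
    y_true = K.cast(y_true, y_pred.dtype)              # guard against integer labels
    return K.mean(K.square(y_pred - y_true), axis=-1)  # shape: (batch_size,)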

If you could provide some guidance on how to write a custom weighted loss function for a multi-class model, I would be deeply appreciative!

The issue is that a data type doesn't match the data type of x. It's always good advice to use float32 as the data type, but I can see that your data also contains some integer values. Try typecasting those integers to float32. Does your y have values like 1, 2, 3, 4, … instead of 1., 2., 3., 4., …? If yes, typecast these values to float32.
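
For example (a sketch, assuming your train_generator is a Keras image iterator that yields (x, y) batches):

import numpy as np

x_batch, y_batch = next(train_generator)
print(y_batch.dtype)                  # int64 here would explain the Mul error

y_batch = y_batch.astype(np.float32)  # cast integer labels to float32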

I think the issue is the return type of loss functions in TF 1.x.

I looked around the model layers for an int64 type and reckoned it comes from the batch_size, although I'm not entirely sure how to confirm this.
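
I poked around with something like this (not sure it's the right way to confirm it):

print(model.input.dtype)   # dtype of the model input
print(model.output.dtype)  # dtype of the predictions (y_pred)

for layer in model.layers:
    print(layer.name, layer.dtype)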

I ran the model without the custom weighted categorical cross-entropy function, using loss='categorical_crossentropy', and got an ascending training loss graph.

I then defined my own custom loss function (note: this returns a vector of shape (batch_size,), e.g. [0.1, 0.18, 0.21] for a batch size of 3), which is how it's done in TF 2.x:

from tensorflow.keras import backend as K

def get_weighted_categorical_crossentropy(weights):
    def wcce(y_true, y_pred):
        Kweights = K.constant(weights)
        # cast the labels to the prediction dtype to avoid the int64/float32 Mul error
        y_true = K.cast(y_true, y_pred.dtype)
        # per-sample cross-entropy, scaled by the weight of each sample's true class
        return K.categorical_crossentropy(y_true, y_pred) * K.sum(y_true * Kweights, axis=-1)
    return wcce

Do let me know if publishing this code snippet contravenes Coursera's honour code. I hope it doesn't, since the original answer returns a scalar loss value. I would love feedback on how to flatten the original solution into a function that returns an ndarray of (batch_size,) loss values instead of a scalar loss.
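
Something like this is the shape of what I'm after (a sketch of the usual positive/negative-weighted binary cross-entropy, emphatically not the graded code):

from tensorflow.keras import backend as K

def get_weighted_loss_per_sample(w_pos, w_neg):
    # w_pos / w_neg: per-class weight arrays, shape (num_classes,)
    w_pos = K.constant(w_pos)
    w_neg = K.constant(w_neg)

    def loss(y_true, y_pred):
        y_true = K.cast(y_true, y_pred.dtype)
        y_pred = K.clip(y_pred, K.epsilon(), 1.0 - K.epsilon())  # numerical stability
        per_class = -(w_pos * y_true * K.log(y_pred)
                      + w_neg * (1.0 - y_true) * K.log(1.0 - y_pred))
        return K.mean(per_class, axis=-1)  # one loss value per sample: (batch_size,)

    return loss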

Even with the wcce loss above, I still got an ascending training loss graph.

The model training parameters are almost exactly the same as in the assignment, save for reducing the batch_size to 16 to fit my GPU's VRAM (10 GB).


The int64 data type is not your batch_size. I think it might be your y_true or y_pred. Can you post the data types of y_pred, y_true, and the input to your model?
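
One way to get those is to wrap your loss so it reports the incoming dtypes (a sketch; debug_loss is a hypothetical helper):

def debug_loss(inner_loss):
    def loss(y_true, y_pred):
        # dtypes are static, so a plain print at trace time is enough
        print("y_true:", y_true.dtype, "y_pred:", y_pred.dtype)
        return inner_loss(y_true, y_pred)
    return loss

model.compile(optimizer='adam',
              loss=debug_loss(get_weighted_categorical_crossentropy(weights)),
              metrics=['accuracy'])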

Add a negative sign to the returned K.categorical_crossentropy(y_true, y_pred) * K.sum(y_true * Kweights, axis=-1). That should make the loss decrease.