Course 4, Week 1, programming assignment 2: cannot replicate results locally

Hi,

I downloaded the notebook for Course 4, Week 1, programming assignment 2 (Convolutional Model Application). In order to play around with the code a little bit, I copied everything from the notebook into a Python script, which I then run from the terminal. The script sits in the same local folder as the notebook, so it uses the same Python modules, the same data, the same version of TensorFlow, and the same seed as the notebook.

When I run the downloaded notebook locally, the results are very close to those I get when I run the notebook online. No surprises there. However, when I run the script locally, I notice something strange: even though I copied the code verbatim from the notebook, I cannot replicate the results for exercise 1 (happyModel) - they are much worse (final accuracies around 50%). I do not have this problem with the results for exercise 2 (convolutional_model).

I am wondering if anyone can explain this to me. I must be overlooking something.

Thanks for any help!

I copied everything into a local .py file and ran it in my local environment. Here is the result.

This output is from my terminal on a Mac. The TensorFlow version is 2.9 (a much newer version, though…).

This was the highest run; my results range between 0.85 and 0.93.

@anon57530071 Thanks for your reply!

This is very different from what I am getting:

BUT - I just noticed something strange about the loss values: they start out very low and don’t change at all. That could be a hint, but I am not sure what it points to…

My local TensorFlow version is 2.3.0, if that matters at all.

The assignments are designed to work with a specific version of TensorFlow.

There are significant changes in TensorFlow versions, even in the layer definitions and default parameters.

If you use your own installation of TensorFlow, you can expect to have to sort out the appropriate changes in the notebook files.

@Reinier_de_Valk
Such a small loss value may not be possible with a BCE loss function. I guess the model is NOT compiled properly. Can you double-check?
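
As a quick sanity check (a toy sketch with random data, not the assignment model): a single-unit sigmoid classifier that really is compiled with binary cross-entropy should start with a loss around ln 2 ≈ 0.69, because an untrained model predicts close to 0.5. A loss that is near zero from the very first epoch suggests a different loss function is in effect:

    import numpy as np
    import tensorflow as tf

    # Toy sketch (random data, not the assignment model): an untrained
    # single-unit sigmoid classifier compiled with BCE starts near ln(2) ~= 0.69.
    X = np.random.rand(64, 4).astype(np.float32)
    Y = np.random.randint(0, 2, size=(64, 1)).astype(np.float32)

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(1, activation='sigmoid', input_shape=(4,))
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    loss, acc = model.evaluate(X, Y, verbose=0)
    print(loss)  # roughly 0.69, definitely not near zero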

@TMosh But how would that explain why the results are as expected when I run the downloaded IPython notebook locally? That run would use the same TensorFlow installation, wouldn’t it?

BTW: I think I installed the version used in the online notebooks on my local machine before I started the course.
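
For what it’s worth, the installed version can be checked in both environments with a one-liner, to rule out a version mismatch:

    import tensorflow as tf
    print(tf.__version__)  # compare the local output with what the online notebook reports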

@anon57530071

I just discovered what causes the strange behaviour. In my script, I had changed the order of the commands a little bit, placing compile() first:

    happy_model = happyModel()
    happy_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    happy_model.summary()
    for layer in summary(happy_model):
        print(layer)
    happy_model.fit(X_train, Y_train, epochs=10, batch_size=16)
    happy_model.evaluate(X_test, Y_test)

If I change this back to the original order as given in the online notebook, the results are as expected:

    happy_model = happyModel()
    for layer in summary(happy_model):
        print(layer)
    happy_model.compile(optimizer='adam',
                        loss='binary_crossentropy',
                        metrics=['accuracy'])
    happy_model.summary()
    ...

I had a look at test_utils.summary(), and I noticed that the model is also compiled in there - but this time with categorical cross-entropy! Is this an error? Commenting out the lines where summary() is called fixes the problem, as does replacing categorical_crossentropy with binary_crossentropy in test_utils.py:

    happy_model = happyModel()
    happy_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    happy_model.summary()
    # for layer in summary(happy_model):
    #     print(layer)
    happy_model.fit(X_train, Y_train, epochs=10, batch_size=16)
    happy_model.evaluate(X_test, Y_test)

This also explains why the same sequence (compile() first, then model.summary(), then test_utils’ summary()) gave no problems for the convolutional_model: in that case, categorical cross-entropy is the loss function used anyway.
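
If I understand Keras’s categorical cross-entropy correctly, this also explains the tiny, constant loss values: with only one output unit, the prediction gets normalized to sum to 1 along the last axis, so the per-sample loss is (near) zero whatever the label, and the model gets no useful gradient - hence the ~50% accuracy. A minimal sketch with toy values (not the assignment data) that seems to confirm this:

    import numpy as np
    import tensorflow as tf

    # Toy predictions from a single sigmoid unit (shape (m, 1)) with 0/1 labels.
    y_true = np.array([[1.], [0.], [1.]], dtype=np.float32)
    y_pred = np.array([[0.3], [0.8], [0.6]], dtype=np.float32)

    # Binary cross-entropy: sensible, label-dependent per-sample losses.
    print(tf.keras.losses.binary_crossentropy(y_true, y_pred).numpy())
    # roughly [1.20, 1.61, 0.51]

    # Categorical cross-entropy: the single "class probability" is normalized
    # to 1, so every per-sample loss is (near) zero, whatever the label.
    print(tf.keras.losses.categorical_crossentropy(y_true, y_pred).numpy())
    # roughly [0, 0, 0]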

@Reinier_de_Valk

If you look at test_utils.py, you'll see that summary() compiles the model with loss='categorical_crossentropy'.
So it should work for the second exercise, but not for the first one.
In any case, it is better not to call it at all, or to compile your model again after calling it.
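
Compiling again afterwards works because whichever compile() call runs last determines the loss that fit() and evaluate() actually use. A minimal sketch of this “last compile wins” behaviour (toy model and data, not the assignment code):

    import numpy as np
    import tensorflow as tf

    # Toy sketch: whichever compile() runs last sets the loss in effect.
    X = np.random.rand(32, 4).astype(np.float32)
    Y = np.random.randint(0, 2, size=(32, 1)).astype(np.float32)

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(1, activation='sigmoid', input_shape=(4,))
    ])

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])  # overrides BCE
    print(model.evaluate(X, Y, verbose=0)[0])  # ~0: CCE on one sigmoid unit

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])  # compile again: BCE is back
    print(model.evaluate(X, Y, verbose=0)[0])  # ~0.69: a sensible BCE value again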

@anon57530071 I just noticed this myself; see my edit.

In any case: mystery solved. Thanks for your help!