Well, there are several things that you are doing differently from the code you copied from the TensorFlow Introduction exercise:

- You are not using a real TF cost function. You could use the TF `BinaryCrossentropy` loss function there, but note that you have to play the same games to deal with the fact that the TF functions all expect "samples first" data orientation. That said, I would expect TF can compute gradients of your hand-written loss function as well.
- You are handling the elements of the `parameters` dictionary differently by using a list of the `values()` in the dictionary. They get specific references to the entries of the dictionary and use those. There are some subtleties about how object references work in Python. It looks like the fundamental problem is that your parameters are just not getting updated, which is why I'm pointing out this difference. I'm not sure it's the real cause of the issue, but it's something to consider carefully.
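On the first point, here's a minimal sketch of what using `BinaryCrossentropy` with "features first" data might look like (the shapes and values are made up for illustration, not taken from the exercise):

```python
import numpy as np
import tensorflow as tf

# Course-style data is "features first": shape (1, m), where m is the
# number of samples laid out as columns.
logits = np.array([[2.0, -1.0, 0.5, -3.0]])   # raw outputs Z, shape (1, m)
labels = np.array([[1.0, 0.0, 1.0, 0.0]])     # shape (1, m)

# Keras losses expect "samples first" (m, 1), so transpose both arguments.
# from_logits=True means we pass Z directly, with no sigmoid applied.
loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)
loss = loss_fn(tf.transpose(labels), tf.transpose(logits))

print(float(loss))
```

Because the loss object is built from TF ops, wrapping the call in a `tf.GradientTape` context lets TF differentiate it just like a hand-written cost.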
Here's a thread that discusses some of the pitfalls with object references. But note that it warns about the exact opposite scenario: in your case you want to be referencing the global objects that are the elements of the `parameters` dictionary, whereas in the case discussed on that other thread, the point is that you want to break the link between the global values and the values you are modifying locally.
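To make the reference subtlety concrete, here's a small pure-NumPy sketch (the names and values are illustrative, not your actual code) of why building a list from `values()` can leave the dictionary untouched:

```python
import numpy as np

parameters = {"W1": np.array([[1.0]]), "b1": np.array([0.25])}

# Pattern A (the likely bug): list(parameters.values()) snapshots the
# current references. Rebinding a list element points the LIST at a new
# array; the dictionary entry still refers to the old one.
vals = list(parameters.values())
vals[0] = vals[0] - 0.25          # new array; rebinds vals[0] only
assert parameters["W1"][0, 0] == 1.0   # dictionary entry unchanged

# Pattern B (what the exercise code does): assign back through the
# dictionary key, so the shared `parameters` object is actually updated.
parameters["W1"] = parameters["W1"] - 0.25
assert parameters["W1"][0, 0] == 0.75
```

Note that in-place mutation (e.g. `tf.Variable.assign_sub`) would update the object that both the list and the dictionary point to; it's the *rebinding* in pattern A that silently breaks the link.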