I have my own neural network implementation in Rust; as an example, I used the code from the course 1 task.
I fixed all the performance issues, and now I have a problem with the calculations. First of all, this is what I get as the cost (cross-entropy):
Cost after: 0 iteration: 0.8088958440701614
Cost after: 1 iteration: 0.8237759870051128
Cost after: 2 iteration: 0.840735343330898
Cost after: 3 iteration: 0.8602733776819088
Cost after: 4 iteration: 0.8830879543258277
Cost after: 5 iteration: 0.9101905243122623
Cost after: 6 iteration: 0.9431160410947709
Cost after: 7 iteration: 0.9843361269280926
Cost after: 8 iteration: 1.0381577297095745
Cost after: 9 iteration: 1.1129604205428372
Cost after: 10 iteration: 1.2279559478614726
Cost after: 11 iteration: 1.4408517713809594
Cost after: 12 iteration: 2.0457278644293826
Cost after: 13 iteration: 10.26770324918441
Cost after: 14 iteration: -0.0
Cost after: 17 iteration: -0.0
Cost after: 18 iteration: -0.0
Cost after: 19 iteration: -0.0
Cost after: 20 iteration: -0.0
Cost after: 21 iteration: -0.0
Cost after: 22 iteration: -0.0
Cost after: 23 iteration: -0.0
Cost after: 24 iteration: -0.0
Accuracy: 0.49999999999999917
As you can see, the cost keeps growing.
So I tried different learning_rate values and different layer configurations; ChatGPT also advised me to add an epsilon to my dAL calculation to avoid getting NaN or inf. Taking this code from the example:
dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
ChatGPT advised adding epsilon = 1e-8 to the second term:
np.divide((1 - Y) + epsilon, (1 - AL) + epsilon)
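A variant of that advice that I also looked at is clipping AL away from exactly 0 and 1 before computing both the cost and dAL, instead of patching only the second term. This is just a sketch of what I mean (the function name is mine, assuming Y and AL have shape (1, m)):

```python
import numpy as np

def stable_cost_and_dAL(AL, Y, eps=1e-8):
    # Clip the activations away from exact 0 and 1 so that log()
    # and the divisions below never see a zero. This is a sketch,
    # not the course's reference code.
    AL = np.clip(AL, eps, 1 - eps)
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    dAL = -(np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
    return cost, dAL
```

With clipping, the cost and gradient stay finite even when the sigmoid output saturates to exactly 0.0 or 1.0.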
In other words, I did everything I could to make the cost decrease, but with no success.
Could somebody explain what could be wrong? I'm not sure whether I'm allowed to share the code, so let me know if I can share extra details.
P.S. more info:
- my network structure is [ReLU, ReLU, Sigmoid] (same as in the example; only the last layer uses the Sigmoid activation function)
- as the dataset I have pictures of cats and dogs; when I built it, I made input = (28x28x3, ), output = (1, ). The output contains 0.0 and 1.0, where 0 is cat and 1 is dog.
- the accuracy is always the same (no matter how many iterations)
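To illustrate what I suspect is happening with the -0.0 values: once the pre-activation z gets large enough (e.g. after the ReLU layers blow up), float64 sigmoid returns exactly 1.0, so log(1 - AL) becomes -inf and the cost degenerates. A quick NumPy check (just an illustration, not my Rust code):

```python
import numpy as np

# In float64, exp(-40) is far below machine epsilon, so
# 1 + exp(-40) rounds to exactly 1.0 and the sigmoid saturates.
z = np.array([10.0, 40.0])
AL = 1.0 / (1.0 + np.exp(-z))
print(AL[1] == 1.0)        # True: the activation is exactly 1.0
# log(1 - AL) is then -inf for the saturated entry, which is what
# poisons the cross-entropy cost.
```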