I am training a neural net to classify the coffee roasting data. I am using 2 layers, with 3 neurons in the first and 1 in the final layer. The layers use ReLU and sigmoid activations, respectively. The network isn't classifying correctly, and I just want to confirm my intuition that it ought to work. My understanding is that because the data can be classified using 3 decision boundaries, this architecture is sufficient. I just want to verify that this is so.
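For reference, here is a minimal Keras sketch of the architecture I mean (the input shape of 2 assumes the two coffee features, temperature and duration, already normalized):

```python
import tensorflow as tf

# 3 ReLU units in the hidden layer, 1 sigmoid unit in the output layer
model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation="relu", input_shape=(2,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
)
```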
Hi @Jeremy_Epstein, I believe you are referring to the data used in one of the optional labs. The optional lab also used 3 units and 1 unit in its two layers, but sigmoid in both. So your change is not in the number of neurons or layers, but in the activation functions.
I suggest you start from that lab, change only one thing at a time, observe the difference in performance, form a hypothesis if it gets worse, and experiment with what other changes are needed to make it better.
A hypothesis would read something like: "given that I have only changed the activation in the first layer from sigmoid to ReLU, the performance has worsened, and that could be due to …"
If you have no idea what to do, I suggest you make a list of things you can tune and start playing with them; if something makes an improvement, see whether you can make sense of it. If you can, then you can fill in the missing clause in the hypothesis template above.
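For example, a rough sketch of that one-change-at-a-time experiment (hypothetical names; it assumes X_train and y_train are already loaded and normalized):

```python
import tensorflow as tf

def build_model(first_activation):
    # Same architecture throughout; only the first layer's activation varies
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(3, activation=first_activation, input_shape=(2,)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model

# Change one thing at a time and compare the outcomes
for activation in ["sigmoid", "relu"]:
    model = build_model(activation)
    # X_train, y_train assumed defined (normalized features, binary labels)
    history = model.fit(X_train, y_train, epochs=10, verbose=0)
    print(activation, history.history["accuracy"][-1])
```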
I copied the lab code and compared it to mine. In my code I had not performed the tiling step, which multiplies the size of the training data by a factor of 1000. Tiling my data made the model predict properly. An interesting alternative I found was to train the model for 1000 times more epochs, which also got the model predicting correctly, though notably it was less time-efficient.
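For reference, the tiling step looks like this (a sketch following the lab code; X and Y here are the raw feature and label arrays):

```python
import numpy as np

# Repeat the dataset 1000 times so each epoch sees 1000x more examples
Xt = np.tile(X, (1000, 1))   # X assumed shape (m, 2): temperature, duration
Yt = np.tile(Y, (1000, 1))   # Y assumed shape (m, 1): binary labels
```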
It seems my original model did not train on enough samples: tiling the data by 1000 and training for 1000 times more epochs both increase the total number of examples seen (and gradient updates) by the same factor, which explains why either fix worked.