Forward propagation - Coffee machine

gmazzaglia · June 1, 2024, 3:53pm

Hello,
I’m wondering how the model can determine the triangular area shown in the next picture when, based on the lectures, you are applying a linear decision boundary with logistic regression (sigmoid function) and a threshold: prediction value >= 0.5.

You can predict with logistic regression something like this:

But how can the model determine that the roasting process is good when the temperature is between 180 and 260 and the duration is between 12 and 14 (the triangular area)?

Thanks.
Regards.

TMosh · June 1, 2024, 5:23pm

If those images are from a lecture, please give the exact name and time mark.

If they are from a lab, please give the exact notebook file name.

Thanks!

gmazzaglia · June 1, 2024, 8:42pm

Hi @TMosh,

The image I’m asking about is in the lab C2_W1_Lab02_CoffeeRoasting_TF, in the section at the end: Layer Function.

My question is: how can the model determine whether the roast was done correctly, based on the triangular area, when a linear decision boundary is applied with logistic regression?

Thanks.
Gus

TMosh · June 1, 2024, 10:41pm

I understand your question. I just asked for the notebook name because I only have access to the assignment’s repo, and not the lectures themselves, so I needed the notebook file name.

I’ll take a look and respond further soon.

TMosh · June 2, 2024, 4:05am

The three hidden layer units each identify a “bad roast” area, using linear boundaries. Each one forms a side of the triangle.

The output unit combines those “bad roast” areas into an overall summary: the “good” sides of the three boundaries overlap, and that overlap is the triangle.
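Here is a minimal sketch of that idea in plain Python, assuming three sigmoid hidden units that feed a single sigmoid output. The weights are hand-picked and hypothetical (they are not the values the lab learns), and the three boundaries are only illustrative: too cold, too short, and overcooked.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical, hand-picked parameters (NOT the lab's trained values).
# Inputs: t = temperature, d = duration.
# Each hidden unit outputs ~1 on the "bad" side of one straight line.
HIDDEN = [
    ((-1.0, 0.0), 175.0),            # ~1 when t < 175 (too cold)
    ((0.0, -10.0), 120.0),           # ~1 when d < 12 (too short)
    ((30.0 / 85.0, 10.0), -210.0),   # ~1 when d > 21 - 3t/85 (overcooked)
]
W_OUT = (-20.0, -20.0, -20.0)  # any "bad" signal pushes the output down
B_OUT = 10.0                   # with no "bad" signal the output stays high

def hidden_activations(t, d):
    """One activation per hidden unit: sigmoid(w . x + b)."""
    return [sigmoid(w[0] * t + w[1] * d + b) for w, b in HIDDEN]

def predict(t, d):
    """Probability of a good roast from the two-layer forward pass."""
    a = hidden_activations(t, d)
    z = sum(w * a_j for w, a_j in zip(W_OUT, a)) + B_OUT
    return sigmoid(z)

for t, d in [(200, 13), (150, 13), (200, 11), (250, 14)]:
    print(t, d, "good" if predict(t, d) >= 0.5 else "bad")
```

Only the first point lies on the “good” side of all three lines, so it is the only one classified as a good roast. Each of the other three trips exactly one hidden unit.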

gmazzaglia · June 2, 2024, 7:12pm

Hi @TMosh ,

Thanks for answering. I will continue with the course because in this Lab there are only 2 layers, input and output, and not 3 hidden layers. I think it’s explained later.

I will continue and come back if my doubts persist.

Thanks.
Regards.

gmazzaglia · June 2, 2024, 11:12pm

Hi @TMosh ,

Can you please explain how to handle that? How can I define different linear boundaries per layer? Can you please show me an example?

Thanks.
Gus

TMosh · June 2, 2024, 11:16pm

You don’t have to define it. You just say how many units to use in the hidden layer, and let the NN learn the weights that will minimize the cost.
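As a rough illustration in plain Python (not the TensorFlow code from the lab): the only thing you choose is the number of units, and the shapes of the weights follow from that. `make_dense` is a hypothetical helper written for this sketch. The starting values are random, and training is what would later move them.

```python
import random

def make_dense(n_inputs, units, rng):
    """Create parameters for one fully connected layer.

    Hypothetical helper for illustration only; it is not a Keras API.
    W has one row per input and one column per unit; b has one entry per unit.
    """
    W = [[rng.uniform(-0.5, 0.5) for _ in range(units)] for _ in range(n_inputs)]
    b = [0.0] * units
    return W, b

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Two input features (temperature, duration) -> 3 hidden units -> 1 output unit.
W1, b1 = make_dense(n_inputs=2, units=3, rng=rng)
W2, b2 = make_dense(n_inputs=3, units=1, rng=rng)

print(len(W1), "x", len(W1[0]))  # shape of W1
print(len(W2), "x", len(W2[0]))  # shape of W2
```

Nothing in this code says where the boundaries should be. That information only enters through the training data and the cost being minimized.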

It just turns out in this simple example (which is why the lab uses it) that you get nice simple boundaries and it gives good results.

Once I get access to that lab again, I’ll try some experiments and see if I can come up with a better description.

gmazzaglia · June 2, 2024, 11:55pm

Thanks, @TMosh. It’s still not clear to me how the NN can arrive at that triangular area.

TMosh · June 3, 2024, 12:07am

The lab shows the weights for the output unit of the NN (one weight for each of the three hidden units, plus a bias): they’re the W2 and b2 values.

You can work out what those are doing via some pencil and paper work.
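A sketch of what that pencil-and-paper work can look like, using made-up numbers rather than the lab’s actual W2 and b2. Assume each hidden activation a_j is close to 1 on the “bad” side of its boundary and close to 0 otherwise, and assume (hypothetically) that the three entries of W2 are all -20 and b2 is 10:

```latex
\begin{aligned}
z &= w^{[2]}_1 a_1 + w^{[2]}_2 a_2 + w^{[2]}_3 a_3 + b^{[2]}
   = -20\,(a_1 + a_2 + a_3) + 10 \\[4pt]
(a_1, a_2, a_3) \approx (0, 0, 0) &\;\Rightarrow\; z \approx 10,
   \quad \sigma(z) = \frac{1}{1 + e^{-10}} \approx 0.99995 \\[4pt]
(a_1, a_2, a_3) \approx (1, 0, 0) &\;\Rightarrow\; z \approx -10,
   \quad \sigma(z) = \frac{1}{1 + e^{10}} \approx 0.00005
\end{aligned}
```

With numbers like these, the output is high only when none of the hidden units fires, which is exactly the region on the “good” side of all three boundaries.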

If you haven’t worked with an NN before, then maybe revisit this topic later. This lab is just a “gee whiz!” example of what an NN can do. It doesn’t tell you how - that comes later.

Nevermnd · June 3, 2024, 12:10am

@gmazzaglia I don’t want to go out on a limb here, but I think the key insight is that with logistic regression we are working strictly in the original two dimensions, so, as you say, the problem has to be ‘linearly separable’.

However, once we start adding hidden layers, in a more NN kind of way, we are jumping to higher dimensions, even if in the end we map back to that 2D space.

We’re still thinking about all the same 2D points we started with, but we are adding ‘depth’ to see how they might actually be related to one another.

Hard to explain because our minds don’t tend to work well in more than three dimensions, but that is how I understand it.
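One way to make this concrete is a small sketch, under assumptions: the hidden layer below uses hand-picked, hypothetical weights (not the lab’s), and the sample points are made up. Each 2D point (temperature, duration) is mapped to three hidden activations, and in that 3D space a single flat plane, a1 + a2 + a3 = 0.5, separates the good roasts from the bad ones.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def to_hidden_space(t, d):
    """Map a 2D input to 3 hidden activations (hypothetical weights)."""
    return (
        sigmoid(175.0 - t),                            # ~1 when too cold
        sigmoid(10.0 * (12.0 - d)),                    # ~1 when too short
        sigmoid(10.0 * (3.0 * t / 85.0 + d - 21.0)),   # ~1 when overcooked
    )

# Made-up sample points: (temperature, duration).
good_points = [(200, 13), (180, 14)]
bad_points = [(150, 13), (200, 11), (250, 14)]

for label, pts in [("good", good_points), ("bad", bad_points)]:
    for t, d in pts:
        a = to_hidden_space(t, d)
        side = "below" if sum(a) < 0.5 else "above"
        print(label, (t, d), "->", tuple(round(v, 3) for v in a), side, "the plane")
```

No single straight line in the original 2D plot separates these points, but after the mapping a single plane does. That plane is what the output unit’s weights and bias describe.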

Nevermnd · June 3, 2024, 12:20am

Or, put another way: we know there is a function for it because we can easily get there ourselves just by drawing a triangle. But in programming, or in NNs, we look at the points not just as ‘2D’ but in a higher-dimensional space, and that is how we arrive at the shape of that equation.

Nevermnd · June 3, 2024, 12:51am

Note: I would hardly say that what neural nets are doing amounts to a Fourier transform… But just to jog your mind in a completely different way: there is a function for almost anything…

gmazzaglia · June 3, 2024, 1:38am

Thanks, the video is very, very interesting.

varchodi · June 4, 2024, 10:36am

Well, I think that triangle is obtained from the parameters (w and b, as in w·x + b) of the neurons in the hidden layer (not quite sure, though).

rmwkwok · June 5, 2024, 9:37am

Hello @varchodi, @gmazzaglia,

Here are some things we can be certain of:

1. Because we have 3 neurons in the 1st layer, we can draw 3 linear boundaries in the input space (which is a 2D space because the input has 2 features).
2. When the NN is in its randomly initialized stage, the 3 boundaries are random; in other words, they don’t form that triangle.
3. As we train the NN, the w’s and b’s in each neuron keep changing to minimize the loss. As the w’s and b’s change, the boundaries move!
4. Minimized loss results in that triangle.
5. This triangle is for our visualization only. The NN does not have this vision. Instead, the NN uses the 2nd layer to combine information from the 1st layer to make its final decision.
6. Continuing on 5, how does the 2nd layer’s neuron use the info? It has a trainable weight for each output of the neurons in the 1st layer. These three trainable weights control how the aggregation is done. A good model will aggregate them in such a way that the aggregated value is high inside the triangle and low outside it.
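To illustrate how one neuron’s w and b define a boundary and how changing them moves it, here is a small sketch. The “initial” and “trained” parameter values are hypothetical, chosen by hand for this example. They are not taken from the lab.

```python
def boundary_duration(w, b, t):
    """Duration d on the line w[0]*t + w[1]*d + b = 0 for a given temperature t.

    That line is where the neuron's sigmoid output equals exactly 0.5.
    """
    if w[1] == 0:
        raise ValueError("vertical boundary: no single d for a given t")
    return -(w[0] * t + b) / w[1]

# Hypothetical parameters for ONE hidden neuron (illustration only).
w_init, b_init = (0.3, -0.2), 0.1                     # e.g. a random start
w_trained, b_trained = (30.0 / 85.0, 10.0), -210.0    # e.g. after training

for label, w, b in [("initial", w_init, b_init), ("trained", w_trained, b_trained)]:
    ds = [round(boundary_duration(w, b, t), 2) for t in (175, 255)]
    print(label, "boundary passes through d =", ds, "at t = 175 and t = 255")
```

With the initial values the line lies far away from any realistic roast duration. With the “trained” values it runs from about d = 14.8 at t = 175 down to d = 12 at t = 255, which could be one side of the triangle.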

Make sense?

Cheers,
Raymond

varchodi · June 5, 2024, 2:50pm

Yeah,
thanks for your clarifications.
