Forward propagation - Coffee machine

Hello,
I’m wondering how the model can determine the triangular area shown in the next picture when, based on the lectures, you are applying a linear decision boundary with logistic regression (sigmoid function) and a threshold: prediction value >= 0.5.

You can predict with logistic regression something like this:

But how can the model determine that the roast is good because the temperature is between 180 and 260 and the duration is between 12 and 14 (the triangular area)?

Thanks.
Regards.

1 Like

If those images are from a lecture, please give the exact name and time mark.

If they are from a lab, please give the exact notebook file name.

Thanks!

Hi @TMosh,

The image I’m asking about is in the lab C2_W1_Lab02_CoffeeRoasting_TF, in the section at the end: Layer Function.

My question is: how can the model determine whether the roast was good based on the triangular area when a linear decision boundary is applied with logistic regression?

Thanks.
Gus

2 Likes

I understand your question. I just asked for the notebook name because I only have access to the assignments’ repo, and not the lectures themselves, so I needed the notebook file name.

I’ll take a look and respond further soon.

2 Likes

The three hidden layer units each identify a “bad roast” area, using linear boundaries. Each one forms a side of the triangle.

The output unit combines those “bad roast” areas into an overall summary, using one weight per hidden unit, since the “good” sides of each boundary overlap.
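A minimal NumPy sketch of this idea, assuming three sigmoid hidden units feeding a single sigmoid output unit. The weights below are hand-picked, hypothetical values in made-up normalized units, not the values the lab learns. They are chosen so that the "good" region is the triangle x1 > 0, x2 > 0, x1 + x2 < 1:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical, hand-picked weights (NOT the lab's learned values).
# Each column of W1 is one hidden unit; its activation approaches 1
# on the "bad" side of its line.
W1 = np.array([[-10.0,   0.0, 10.0],
               [  0.0, -10.0, 10.0]])   # shape (2 features, 3 units)
b1 = np.array([0.0, 0.0, -10.0])
# unit 0 flags x1 < 0, unit 1 flags x2 < 0, unit 2 flags x1 + x2 > 1

# Output unit: a positive bias, pulled down by any "bad" flag.
W2 = np.array([-20.0, -20.0, -20.0])
b2 = 10.0

def predict(X):
    A1 = sigmoid(X @ W1 + b1)        # (m, 3): one "bad roast" score per unit
    a2 = sigmoid(A1 @ W2 + b2)       # (m,): combined score
    return (a2 >= 0.5).astype(int)   # 1 = good roast, 0 = bad roast

points = np.array([[0.25, 0.25],    # inside the triangle
                   [-0.5, 0.25],    # outside, left of x1 = 0
                   [0.25, -0.5],    # outside, below x2 = 0
                   [1.0, 1.0]])     # outside, beyond x1 + x2 = 1
print(predict(points))  # [1 0 0 0]
```

Only the point that lies on the "good" side of all three lines gets a prediction of 1. Because the sigmoids are smooth, the real decision boundary is slightly rounded near the edges rather than an exact triangle.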

Hi @TMosh ,

Thanks for answering. I will continue with the course because, as I understand it, in this lab there are only 2 layers, input and output, and not 3 hidden layers. I think it’s explained later.

I will continue and come back if my doubts persist.

Thanks.
Regards.

1 Like

Hi @TMosh ,

Can you please explain how to handle that? How can I define different linear boundaries per layer? Can you please show me an example?

Thanks.
Gus

1 Like

You don’t have to define it. You just say how many units to use in the hidden layer, and let the NN learn the weights that will minimize the cost.

It just turns out in this simple example (which is why the lab uses it) that you get nice simple boundaries and it gives good results.
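To illustrate "you only choose the number of units", here is a rough NumPy sketch of plain gradient descent on a 2-3-1 sigmoid network. It is not the lab's code: the data is synthetic (an assumed triangle in made-up units), and the learning rate, step count, and seed are arbitrary choices. Nothing in it specifies where the boundaries go; the weights start random and only the loss drives them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the lab's coffee data):
# label 1 inside the triangle x1 > 0, x2 > 0, x1 + x2 < 1, else 0.
X = rng.uniform(-1.0, 2.0, size=(400, 2))
y = ((X[:, 0] > 0) & (X[:, 1] > 0) & (X[:, 0] + X[:, 1] < 1)).astype(float)

# The only architecture choice: 3 hidden units and 1 output unit.
W1 = rng.normal(size=(2, 3)); b1 = np.zeros(3)
W2 = rng.normal(size=(3, 1)); b2 = np.zeros(1)

def loss_and_grads(X, y):
    A1 = sigmoid(X @ W1 + b1)              # hidden activations, (m, 3)
    a2 = sigmoid(A1 @ W2 + b2)[:, 0]       # output probability, (m,)
    eps = 1e-12                            # avoids log(0)
    loss = -np.mean(y * np.log(a2 + eps) + (1 - y) * np.log(1 - a2 + eps))
    dz2 = (a2 - y)[:, None] / len(y)       # gradient of mean loss w.r.t. z2
    dW2 = A1.T @ dz2
    db2 = dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * A1 * (1 - A1)     # backprop through hidden sigmoids
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)
    return loss, (dW1, db1, dW2, db2)

lr = 0.3                                   # arbitrary learning rate
losses = []
for step in range(5000):
    loss, (dW1, db1, dW2, db2) = loss_and_grads(X, y)
    losses.append(loss)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Whether the three learned lines end up as a clean triangle depends on the data, the initialization, and how long you train, so treat this as an illustration of the mechanism rather than a reproduction of the lab's result.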

Once I get access to that lab again, I’ll try some experiments and see if I can come up with a better description.

2 Likes

Thanks, @TMosh, because it’s still not clear to me how the NN can arrive at that triangular area.

1 Like

The lab shows the weights for the output unit in the NN, one per hidden unit: they’re the W2 and b2 values.

You can work out what those are doing via some pencil and paper work.

If you haven’t worked with an NN before, then maybe revisit this topic later. This lab is just a “gee whiz!” example of what an NN can do. It doesn’t tell you how - that comes later.

@gmazzaglia I don’t want to go out on a limb here, but I think the key insight is that with logistic regression we are working strictly in the two input dimensions, so, as you say, the problem has to be ‘linearly separable’.

However, once we add hidden layers, as in an NN, we are jumping to higher dimensions, even if in the end we map back to that 2D space.

We’re still thinking about all the same 2D points we started with, but we are adding ‘depth’ to see how they might actually be related to one another.

Hard to explain because our minds don’t tend to work well in more than three dimensions, but that is how I understand it.

2 Likes

Or put another way: we know there is a function for it, because we can easily get there ourselves just by drawing a triangle. But in programming, or in NNs, we look at the points not just as ‘2D’ but in a higher-dimensional space, and that is how we arrive at the shape of that equation.

2 Likes

Note: I would hardly say that what neural nets are doing amounts to a Fourier transform… But just to jog your mind in a completely different way, there is a function for almost anything…

1 Like

Thanks, the video is very interesting.

1 Like

Well, I think that triangle is obtained from the parameters (w, b => wx+b) of the neurons in the hidden layer (not quite sure, though).
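That guess can be written out as follows. Assume hidden unit $j$ has weights $w_{j,1}, w_{j,2}$ and bias $b_j$, and that $x_1, x_2$ stand for the two input features (temperature and duration); these symbols are my own notation, not necessarily the lab's:

$$
a_j = \sigma(z_j), \qquad z_j = w_{j,1} x_1 + w_{j,2} x_2 + b_j, \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
$$

Since $\sigma(0) = 0.5$ and $\sigma$ is increasing,

$$
a_j \ge 0.5 \iff z_j \ge 0,
$$

so each unit switches at the straight line $w_{j,1} x_1 + w_{j,2} x_2 + b_j = 0$ in the input plane. Three hidden units give three such lines, which is enough to enclose a triangle if the learned weights happen to arrange them that way.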

1 Like

Hello @varchodi, @gmazzaglia,

Here are some things we can say for certain:

  1. Because we have 3 neurons in the 1st layer, we can draw 3 linear boundaries in the input space (which is a 2D space because input has 2 features).

  2. When the NN is in its randomly initialized stage, the 3 boundaries are random, in other words, they don’t form that triangle.

  3. As we train the NN, the w’s and b’s in each neuron keep changing to minimize the loss. As the w’s and b’s change, the boundaries move!

  4. Minimized loss results in that triangle.

  5. This triangle is for our visualization only. The NN does not have this vision. Instead, the NN uses the 2nd layer to combine information from the 1st layer to make its final decision.

  6. Continuing from 5, how does the 2nd layer’s neuron use the info? It has a trainable weight for the output of each neuron in the 1st layer. These three trainable weights control how the aggregation is done. A good model will aggregate them so that the aggregated value is high inside the triangle and low outside it.

Make sense?

Cheers,
Raymond

4 Likes

Yeah,
thanks for your clarifications.

2 Likes