C2_W1_Lab02_CoffeeRoasting_TF - Why does adding additional neurons result in weird plots?

Hi Raymond,

Thanks for the reply. Looks like I’m one step closer to figuring this out, thanks pal.

I’m a little confused with the above. If I type in the weights and bias for Layer 1, unit 2, the decision boundary still goes through the origin, so it’s nowhere near the good-roast data points (approx. 200 [x-axis] and 12.5 mins [y-axis]). I had assumed, from the plots showing the blue and white regions in the Jupyter notebook, that the library had de-normalised the decision boundary to bring it back to the original feature space?
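For reference, here is a rough sketch of what “de-normalising” a unit’s boundary would involve. All numbers below are made up for illustration; they are not the lab’s actual weights or normalisation statistics.

```python
# A layer-1 unit learns w.x_norm + b = 0 on z-score-normalised features.
# Substituting x_norm = (x - mu) / sigma gives the same boundary in the
# original feature space: (w / sigma).x + (b - sum(w * mu / sigma)) = 0.

def denormalize_boundary(w, b, mu, sigma):
    """Map a linear boundary from normalised to original feature space."""
    w_orig = [wi / si for wi, si in zip(w, sigma)]
    b_orig = b - sum(wi * mi / si for wi, mi, si in zip(w, mu, sigma))
    return w_orig, b_orig

# Hypothetical values: w, b learned on normalised data; mu, sigma are
# made-up feature means and standard deviations (temperature, duration).
w, b = [-8.0, -10.0], 2.0
mu, sigma = [212.0, 13.9], [25.0, 1.0]
w_orig, b_orig = denormalize_boundary(w, b, mu, sigma)

# A point on the normalised boundary lands on the original-space boundary.
x_norm = [1.0, (b + w[0] * 1.0) / -w[1]]        # solves w.x_norm + b = 0
x_orig = [xn * s + m for xn, s, m in zip(x_norm, sigma, mu)]
check = sum(wi * xi for wi, xi in zip(w_orig, x_orig)) + b_orig
print(abs(check) < 1e-9)  # True: same boundary, original feature space
```

So a boundary through the origin in normalised space would sit near the data means in the original space, which is why plotting raw weights directly looks wrong.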

I see the smiley face on my table next to the top row of Layer 1, units 3 and 4.

I’m assuming the “No” in each of them is correct? The weights in the 2nd layer are being offset by the bias term, so I’m assuming that means the shaded regions of these units

image

are not being factored in, which makes sense. But then I’m still struggling to see why it seems to have an impact on the final result.

I’m also still struggling to see why the decision boundary was put right through the good-roast points to begin with. Is it likely it got caught in a local minimum, and that further epochs would have helped kick it out?

I’m assuming that because the output from layer 1 is any decimal between 0 and 1, it’s easier to look at layer 2, check the range of values each neuron is likely to give, and therefore determine which units will be most influential?
In our case, units 0, 1 and 2 will have very negative values if they receive a 1. If they receive a 0 from layer 1, then they have positive values.

With the quote above, it’s therefore the “0” regions which are the sought-after areas for our good roasts. So if you combine the shaded areas for these three, you’re left with the white triangle we are looking for.

And yet units 3 and 4 have values close to 0 if the input from the layer 1 neurons is 1, so their shaded regions should not contribute a lot, if at all.

When they receive 0s, though, they have the same output as the others. So, since we have 5 units and only 1 is shaded at the bottom, can we assume the output will have dark shading at the top of the triangle (red zone) and a slightly less shaded area below, since 4 of the 5 units said it was still a good roast (orange zone)?
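A tiny numeric sketch of this combining logic may help. The layer-2 weights and bias below are made up, chosen only to mimic the pattern described above (large negative weights for units 0–2, near-zero weights for units 3–4); they are not the lab’s actual values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical layer-2 weights and bias for a 5-unit layer 1.
w2 = [-45.0, -42.0, -50.0, 0.1, -0.2]
b2 = 26.0

def layer2_output(a1):
    """Final sigmoid output given the 5 layer-1 activations a1."""
    return sigmoid(sum(w * a for w, a in zip(w2, a1)) + b2)

# All "bad-region" units (0-2) output 0 -> good roast (output near 1),
# almost regardless of what units 3 and 4 output.
print(layer2_output([0, 0, 0, 0, 0]))  # close to 1
print(layer2_output([0, 0, 0, 1, 1]))  # still close to 1
# Any one of units 0-2 firing pulls the sum far negative -> bad roast.
print(layer2_output([1, 0, 0, 0, 0]))  # close to 0
```

With weights shaped like these, the shading of units 3 and 4 barely moves the final output, while each of units 0–2 can veto a good roast on its own.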

I would still expect the output figure to look slightly more shaded at the bottom if the above was right, so I think I’m missing something.

Rgds

Fraz

Hello Fraz @Frazer_Mann ,

Everything in that table is correct. The emoticon is because I wanted us to focus on the difference between (unit 3 & 4) and (unit 0 & 1 & 2). To be more specific, unit 3 & 4 have “No” and “No” in both rows, but unit 0 & 1 & 2 have “Yes” and “No”.

Very good observation for unit 0 & 1 & 2! I assume you have gone through the same arguments for unit 3 & 4 and come up with the following results:

Here is the first conclusion: if we speak of the impact from unit 3 and the impact from unit 4 individually (without considering unit 0 & 1 & 2), then, after the bias, any one of them alone can only end up with a positive value.
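A quick numeric check of that conclusion, with made-up numbers: if a unit’s layer-2 weight is near zero and the layer-2 bias is positive, then w * a + b stays positive for every possible sigmoid activation a in [0, 1].

```python
# Made-up values for illustration: a near-zero layer-2 weight for unit 3
# (or 4) and a positive layer-2 bias.
w, b = -0.2, 26.0

# A sigmoid activation is bounded in [0, 1], so the unit's contribution
# w * a + b is bounded between b and w + b, both positive here.
contributions = [w * a + b for a in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(all(c > 0 for c in contributions))  # True
```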

The above should answer the first part of my last question on their individual contributions:

However, for unit 3 & 4’s contributions among all units, as you have shown,

Doesn’t it seem to be affected by unit 4? In fact, I think your explanation below is expressing the same idea.

Maybe it is not even a local minimum? :wink:
Here is my speculation:

We can see that the first 3 units of your weights are identical to the lab’s. I am not saying this happy coincidence is impossible, but it would take quite a lot of luck. In fact, I have trained a 5-unit model with 30 epochs too, and here are the weights I got:

image

None of the weights are identical to the lab’s.
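For anyone who wants to repeat this experiment, here is a minimal sketch of training a fresh 5-unit model. This is not the lab’s exact code; the toy data, seed, and layer names are my own stand-ins.

```python
# Minimal sketch (not the lab's exact code) of retraining a 5-unit
# sigmoid layer followed by a 1-unit sigmoid output, 30 epochs.
import numpy as np
import tensorflow as tf

tf.random.set_seed(1234)  # fixed seed only so THIS sketch is repeatable

# Toy stand-in data; the lab generates its own (temperature, duration) set.
X = np.random.rand(200, 2).astype("float32")
y = (X[:, :1] + X[:, 1:] > 1.0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(5, activation="sigmoid", name="layer1"),
    tf.keras.layers.Dense(1, activation="sigmoid", name="layer2"),
])
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
              optimizer=tf.keras.optimizers.Adam(0.01))
model.fit(X, y, epochs=30, verbose=0)

W1, b1 = model.get_layer("layer1").get_weights()
print(W1.shape, b1.shape)  # (2, 5) (5,)
```

Even with the same seed, a 5-unit layer draws a different set of initial weights than a 3-unit layer, so a freshly trained 5-unit model should not reproduce the lab’s 3-unit weights exactly; finding them there suggests old weights were loaded somewhere.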

So, could your weights somehow not be reflecting the actual local-minimum model you wanted to find? Is there any step in your notebook that may have unintentionally modified the weights of a local-minimum model? If you shared your notebook with others, what would they reproduce?

These are the plots of my 5-unit model, and each unit is responsible for a different bad region.

Does it have to be “0”? In the plots of my 5-unit model, it is not the case. Why doesn’t it have to be the case?

Cheers,
Raymond


Fraz @Frazer_Mann, thank you for getting back with your work and your thought process. I have reorganized your points and mine to sort them out. Please try to follow the organized flow.

You mentioned the possibility that the model was not in a good enough minimum state, which is a very good point, and I have explained my speculation on why I suspect the weights you shared are NOT representing any minimum state at all.

Please go through that explanation, and I suggest you also go through your notebook again and find out why, after training a 5-unit model from scratch (as I suppose you did), the first 3 units of your weights are exactly the same as the lab’s 3-unit model result. This should not happen.

Good luck!
Raymond


Hi Raymond,

Thanks very much for taking the time to reply.

Is it the fact that they are positive, or that their magnitude is so low, that’s important?

As you have pointed out, the first row for units 0, 1 and 2 is Yes, and the 2nd row is No. I’m assuming that if it were Yes on the 2nd row they would cancel each other out, so you always need to have one row as Yes and one as No?

Very little, I would assume.

Yeah, this doesn’t seem right. I’m not sure exactly where I messed this up. I have downloaded the notebook and uploaded it to my OneDrive in case you want to have a look; see the link below. Note: I’m pretty sure I commented out the step where you can load in the past weights. Python is still pretty new to me, though, so maybe I made a mistake.

{link removed by mentor}

Not sure I follow. I populated the table again for your weights.

Thanks again.

Rgds

Fraz

Hello Fraz @Frazer_Mann,

I prefer to stick with my original way of looking at their contributions, which is essentially that, very differently from the others, both of their rows turn out to be “No” (I don’t want to repeat my arguments in detail again).

I prefer that because simply discussing whether they are positive, or whether the magnitude is small, does not completely reflect my original way, so that is not how I will draw any conclusion.

It is tempting to conclude with a simple concept like “magnitude”, but it is also dangerous to do so, because when you say “the magnitude is so low”, anyone can ask “how low is low?”. So even though “magnitude” is a convenient concept, saying “the magnitude is so low” can be inaccurate too. For this discussion, I will just stick to my original way.

The table is for you to inspect the behavior of each unit. The table does not prescribe any rule that every good neural network should comply with. Therefore, I will not ask a question like “you always need to have one row as XXX and one as XXX?”, because the word “need” implies some necessary condition, which is not what the table is meant to give you.

Again, the table is meant to let you inspect the behavior of the units.

Then it is your exercise.

Btw1: Sharing the notebook won’t tell everything.
Btw2: There was a link to a file that was from one of the labs of the course, and by the code of conduct, we can’t share that, so I have removed it for you. :wink:

You said previously the following

So, I asked back: if you look at my following plots (not yours),

Following your original logic of concluding that “0” regions are sought after, now, with my plots, is it still just “0”?

Cheers,
Raymond

Sorry about that. Thanks for removing the link.

Hi Raymond,

Looks like I didn’t update the L1 and L2 parameters, specifically the size of the matrix I was feeding into them. I’ve updated it and I get different weightings. I still get one plot that’s a bit odd, same as you when you ran 5 units, but its weights are completely different from the rest.

I believe the following quote is the last part I haven’t grasped fully yet.

I believe it still is, but the table now has “No” and “No” on both rows for Layer 1, unit 2, hence it’s not being included?

Hope you’re having a great weekend.

Rgds

Fraz

Can you explain, for the following unit, why the “0” region is being sought after? Or would you see it differently?

image

given that you have said that

Cheers,
Raymond


Hello Fraz @Frazer_Mann,

I believe the lab’s description is the core reason for you to make the statement that “0” is being sought after.

The lab’s description is for that particular example. Making that description based on the plots is, just like my use of the table, a kind of summary of an inspection of one resulting neural network.

An inspection result can’t always be generalized into a rule.

In the above unit, clearly and undoubtedly, the “1” region covers the good roasts. Not the “0”.

This unit completely breaks the statement that

“0” regions are the sought-after areas for our good roasts.

Fraz, we can make a simple statement out of some observation, but there is no guarantee that the statement is a rule. If it is not a rule, then it cannot be applied everywhere, and we need to make the same inspection on a new model before we can tell whether it applies to that model.

As learners and as explainers, we try to interpret our model, and so we make visualizations (plots and tables) and make statements from them. However, we can’t forget that those visuals are based on just one model, and therefore attempts to generalize those statements can be very dangerous.

From our conversation, I see that you are the kind of learner who is eager to take action, to think, and to generalize ideas, and some of what you did was completely beyond my anticipation. As a very general suggestion, I recommend you inspect more neural networks, and I hope you will see that although inspection might not give you any new rules, it can give you some understanding of your model, perhaps ideas of how to improve it, and of how training works.

If you are a beginner, I suggest you not take anything in the lab as a generalizable rule, but take the lab’s activities as examples of how we can inspect something. Learn how to inspect, and be prepared to see something completely different from what the lab’s examples show you.

Inspection is the skill you can take away and use in your future career. The lab’s results are not.

After you are better at inspecting, and when you build your own model, you might start to appreciate how the inspection skill helps your work.

In fact, your speculation of whether the model “got caught in a local minima and [questioned whether] further epochs would have helped kick it out of that?” is a very nice result of this exchange. Your inspection challenged some assumption about the status of the model and opened the door to improvement. :wink:

Cheers,
Raymond


Since the magnitude and sign are positive, I would assume it’s looking to keep this region rather than discard the “1” region. All others in the “Sigmoid 1” table have “Yes” as a result, so it indicates that the “1” region should be discarded.

This is based on rereading your first post, specifically:

And one of Tom’s replies:

Yes, I am still pretty new to this, and I’m more interested in actually understanding what’s going on than smashing through the labs, so I have really enjoyed this conversation with you; it’s definitely helped stretch my understanding.

I think this is where I had gone wrong. I had tried to generalise it by saying all blue regions were bad, but then I was puzzled as to why you also (layer 1, unit 2) have a plot with the blue region over the good roasts. I hadn’t acknowledged the magnitude and sign the NN had assigned to the “1” region, which your table highlighted.

Thanks again for all your help :smiley: Looking forward to our next discussion.