What is the plot function actually doing here? plt_layer(X,Y.reshape(-1,),W1,b1,norm_l)
1.b. Is it just plotting the Y values from the training data, with W1 and b1 used separately to determine the sigmoid (shaded) areas?
1.c. None of the data displayed appears to be normalized; is that interpretation correct? (I noticed a norm_l parameter is passed through.)

Does the vertical bar on the right correspond to the sigmoid values for each neuron?
2.b. Is the following interpretation correct? Say the input data had a low duration: unit 0 outputs a sigmoid value close to 1 (based on the dark blue shading). When the activation for unit 1 is passed to layer 2 as an input, will it have a bigger weight/“influence” on the sigmoid in layer 2 than if it were a value close to 0?

For the network decision graph, I don’t see anything different (other than the shaded-in areas for the sigmoid output being missing). Is there something I am missing?

Following up, there’s also a CoffeeRoastingNumPy lab later on. On the final graph, I’m still confused about how the red Xs vs. blue shaded areas are generated.

The statement explains that the left graph shows the network's probability output from the final layer, indicating good roast versus bad roast without a predefined decision boundary, whereas the right graph, with the yellow X and O markers, applies a defined decision threshold to give the predicted good and bad roasts.

Notice that the top three graphs all relate to layer 1. The network combines the three unit outputs: unit 0 responds when the temperature is low, unit 1 when the temperature is high, and unit 2 has its highest values when time/temperature is a bad combination.

So when the network plotted the final graph, its inputs were the outputs of this first layer. The left graph shows the resulting probability without any predefined threshold, while the right graph, with the yellow X and O markers, applies a decision threshold to that probability to give a conclusive prediction.

We will plot the output of each node for all values of the inputs (duration,temp). Each unit is a logistic function whose output can range from zero to one. The shading in the graph represents the output value.

So the plot computes the output of each node, and each unit is a logistic function whose output can range from zero to one. So no, the shading is not calculated separately from the training data and W1 and b1; it is computed with the logistic equation of each unit, which uses that unit's part of W1 and b1.
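To make the shading concrete, here is a minimal sketch of how one could evaluate every unit of a sigmoid layer over a grid of inputs. It is not the lab's plt_layer code: the helper layer_output_on_grid, the weight values, and the grid range are made up for illustration, and the inputs are assumed to be already normalized.

```python
import numpy as np

def sigmoid(z):
    """Logistic function; output lies between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def layer_output_on_grid(W, b, durations, temps):
    """Hypothetical helper: evaluate each unit of a sigmoid layer on a grid.

    W has shape (2, n_units), b has shape (n_units,).
    Returns an array of shape (len(temps), len(durations), n_units).
    """
    dd, tt = np.meshgrid(durations, temps)            # grid of input values
    grid = np.column_stack([dd.ravel(), tt.ravel()])  # (n_points, 2)
    a = sigmoid(grid @ W + b)                         # (n_points, n_units)
    return a.reshape(dd.shape + (W.shape[1],))

# Made-up weights for a 3-unit layer (not the lab's trained values).
W1 = np.array([[-2.0, 0.5, 3.0],
               [0.5, -2.0, 3.0]])
b1 = np.array([-1.0, -1.0, 0.5])

# Assumed normalized input range for both features.
durations = np.linspace(-2.0, 2.0, 5)
temps = np.linspace(-2.0, 2.0, 5)

a1 = layer_output_on_grid(W1, b1, durations, temps)
# a1[..., j] is the value that would be shaded for unit j at each grid point.
```

Each slice a1[..., j] is what gets shaded for unit j; the training points would simply be drawn on top of it as markers.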

The image below will give you an idea of what norm_l is and why it is used.

Notice that the line explains the reason for normalizing: it helps the weights fit the data better. Normalizing the data involves calculating the mean and variance of the dataset. This is commonly useful when fitting and plotting statistical regression models.
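As a rough illustration of what "calculating the mean and variance" amounts to, here is a NumPy sketch of per-feature normalization. It is a stand-in under the assumption that norm_l standardizes each feature; the sample values below are made up and it is not the lab's code.

```python
import numpy as np

# Made-up samples: each row is one roast with two features.
X = np.array([[185.3, 12.7],
              [259.9, 11.9],
              [231.0, 14.4],
              [175.4, 11.7]])

# Per-feature statistics learned from the data.
mean = X.mean(axis=0)
var = X.var(axis=0)

# Standardize: subtract the mean, divide by the standard deviation.
Xn = (X - mean) / np.sqrt(var)
```

After this step each column of Xn has mean 0 and variance 1, so both features sit on a comparable scale before they reach the weights.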

No, the vertical bar corresponds to the range of y values from 0 to 1, where 0.5 is the decision threshold in the network's predictions.

I explained this in the previous comment about the difference between the two graphs. Both graphs are actually the same, but one has been drawn with a defined threshold applied.
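The step from probabilities to decisions with a 0.5 threshold can be sketched like this. The probability values are made up for illustration; in practice they would be the final layer's outputs.

```python
import numpy as np

# Made-up network outputs (probability of a good roast) for four inputs.
probs = np.array([[0.97], [0.03], [0.51], [0.49]])

# Apply the 0.5 decision threshold: 1 = predicted good roast, 0 = bad roast.
yhat = (probs >= 0.5).astype(int)
```

The probabilities are what a shaded probability plot shows; yhat is what a plot of decisions shows after the threshold is applied.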