On Course 2 Week 3 Lab exercise: Practice Lab: Advice for Applying Machine Learning

Good day,

I am curious about two things from this exercise. First, how is the decision boundary plotted in section 4 of the exercise? Specifically, I mean the plot produced by the plt_nn function. It seems that we first use the argmax function to make discrete predictions such as 0, 1, 2, 3, 4, 5. The prediction is applied to all points in the meshgrid. I went into the code file and traced this back to the plot_cat_decision_boundary function. In it, levels were not specified in the contour call. I wonder what contour does in that case to produce the boundaries. The code is provided below.

Second, for kernel regularization in a neural network, say I impose \lambda_1 and \lambda_2 on layers 1 and 2 using the l2 regularizer class. Does this mean the loss function is now
\text{original loss function} + \frac{\lambda_1}{2m}\sum_{i}||\mathbf{w}_i^{[1]}||^2+\frac{\lambda_2}{2m}\sum_j||\mathbf{w}^{[2]}_j||^2, where ||\cdot|| denotes the L_2 norm of a vector?

Start of code:

import numpy as np

def plot_cat_decision_boundary(ax, X, predict, class_labels=None, legend=False, vector=True, color='g', lw=1):

    # create a mesh of points to plot
    pad = 0.5
    x_min, x_max = X[:, 0].min() - pad, X[:, 0].max() + pad
    y_min, y_max = X[:, 1].min() - pad, X[:, 1].max() + pad
    h = max(x_max-x_min, y_max-y_min)/200
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    points = np.c_[xx.ravel(), yy.ravel()]
    #print("points", points.shape)
    #make predictions for each point in mesh
    if vector:
        Z = predict(points)
    else:
        Z = np.zeros((len(points),))
        for i in range(len(points)):
            Z[i] = predict(points[i].reshape(1,2))
    Z = Z.reshape(xx.shape)

    #contour plot highlights boundaries between values - classes in this case
    ax.contour(xx, yy, Z, colors=color, linewidths=lw) 
    ax.axis('tight')

The matplotlib.pyplot.contour() function is documented here:
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.contour.html

Typically we don’t use a different regularization value on each layer. We don’t really have an effective way to optimize them separately.

Hello chi-yu @u5470152,

Here is the doc for contour, where you can find explanations of its input arguments. To see what it actually does (which is really just plotting some contour lines), the easiest way is to “comment out” that line and see what disappears. To speed up your examination, you might copy that function, together with the other function that uses it, into your notebook, and then do the “commenting out” or other edits from there. In that case, the Jupyter notebook will use your copied functions instead of the ones in the ‘.py’ scripts. I think the remaining element is then how Z was computed, but let’s see if you really have questions about it first.

Yes.
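
As a worked example of the expression in the question, here is a minimal NumPy sketch that evaluates the penalty term for two toy weight matrices. The function name l2_penalty, the weights, the \lambda values and m are all made up for illustration. It follows the formula exactly as written in the question, with the \frac{1}{2m} factor. Note that the Keras documentation for the L2 regularizer describes its penalty as l2 * reduce_sum(square(x)), without that factor, so the scaling here is the question's convention and not necessarily what the library applies.

```python
import numpy as np

def l2_penalty(weights, lambdas, m):
    """Sum over layers of lambda / (2 m) times the sum of squared weights.

    Hypothetical helper: mirrors the formula in the question, not a library API.
    """
    return sum(lam / (2 * m) * np.sum(W ** 2) for W, lam in zip(weights, lambdas))

# Toy values, assumed purely for illustration
W1 = np.array([[1.0, 2.0], [3.0, 4.0]])   # layer 1 weights, squared sum = 30
W2 = np.array([[0.5], [-0.5]])            # layer 2 weights, squared sum = 0.5
penalty = l2_penalty([W1, W2], lambdas=[0.1, 0.01], m=10)

# 0.1 / 20 * 30 + 0.01 / 20 * 0.5 = 0.15 + 0.00025
print(penalty)
```

Summing the squared norms of the per-unit weight vectors \mathbf{w}_i is the same as summing the squares of every entry of the layer's weight matrix, which is what np.sum(W ** 2) does.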

Cheers,
Raymond

Thanks to @TMosh and @rmwkwok for your responses. I guess I need to be more specific about my question on contour. The original documentation does not seem to mention the default levels when no levels are supplied.

Here is an example I explored. And thanks @rmwkwok or @TMosh for correcting my code block in the original post. The following function assigns the quadrant number 0,1,2,3 to all points in the square [1,10]\times[1,10].

def quadrant(x_coord, y_coord):
    if (x_coord >= 5.0 and y_coord >= 5.0):
        return 0
    elif (x_coord >= 5.0 and y_coord <= 5.0):
        return 1
    elif (x_coord <= 5.0 and y_coord <= 5.0):
        return 2
    else:
        return 3

Now I plot the contour

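import numpy as np
import matplotlib.pyplot as plt

# 50 x 50 grid over the square (np.linspace defaults to 50 points)
x = np.linspace(1, 10)
y = np.linspace(1, 10)
x_coord, y_coord = np.meshgrid(x, y)

# label every grid point with its quadrant number
Z = np.zeros(x_coord.shape)
for i in range(x_coord.shape[0]):
    for j in range(x_coord.shape[1]):
        Z[i, j] = quadrant(x_coord[i, j], y_coord[i, j])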

fig, ax = plt.subplots()
ax.contour(x_coord, y_coord, Z, colors = 'red')

The plot I got was:
[image: contour]

What I do not understand is that all points in the first quadrant have the constant Z-value 0. Instead of painting the entire quadrant, the program is somehow able to figure out the boundaries of these quadrants. I also tried setting a specific contour level, using

fig, ax = plt.subplots()
ax.contour(x_coord, y_coord, Z, levels = [1], colors = 'red')
plt.savefig("contour_special_value.png", bbox_inches = 'tight')

The plot I got was:
[image: contour_special_value]

This is even more confusing, as none of the line segments in the upper half should be plotted.

Maybe there are some errors in my code that prevent me from interpreting the behavior of contour plots.

Let me point out that if I use contourf without specifying any levels, it does fill the regions of the four different quadrants.

I hope this clarifies my questions.


Then it comes down to matplotlib contour’s behavior, which I am not sure about. Its doc doesn’t explain how it chooses the contour levels, so I am afraid that without digging into the code, we won’t know exactly how it works. Without levels being specified, it still managed to find some contours (be they satisfactory ones or not), so there must be some algorithm behind it, and perhaps your example has tested its limits…

I found and read the code.

The reason they don’t clearly give the default contour level is that, well, it’s horribly complicated and there is a lot of logic to decide what to draw based on the data set and all the other parameters that may have been given. And it will automatically suppress some of the contour lines if there’s no data in those regions.

So there’s no simple answer.
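
One way to sidestep the complicated logic is to ask the plot what it chose. The sketch below rebuilds the quadrant example from earlier in the thread (using np.select instead of the double loop, which is my substitution) and reads the levels attribute of the contour set that ax.contour returns. For reference, the documentation of the levels parameter says that when an integer n is given, MaxNLocator picks no more than n+1 "nice" levels between the minimum and maximum of Z. What you get when levels is omitted is best checked empirically, since it may depend on the matplotlib version.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Rebuild the quadrant labels on the same 50 x 50 grid as in the post
x = np.linspace(1, 10)
y = np.linspace(1, 10)
xx, yy = np.meshgrid(x, y)
Z = np.select(
    [(xx >= 5) & (yy >= 5), (xx >= 5) & (yy <= 5), (xx <= 5) & (yy <= 5)],
    [0, 1, 2],
    default=3,  # remaining points: x < 5 and y > 5
).astype(float)

fig, ax = plt.subplots()
cs = ax.contour(xx, yy, Z, colors="red")  # no levels argument, as in the post

# The returned contour set records the levels that were actually used
default_levels = np.asarray(cs.levels)
print(default_levels)
plt.close(fig)
```

If several of the printed levels fall between two adjacent class values, I would expect their lines to be squeezed into the single grid cell where the class changes, which at this resolution would look like one boundary line.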

Here’s what I think the plot utilities are doing.

  • The categories are in the ‘Z’ axis.
  • It looks at the number of categories, call that ‘N’.
  • It draws ‘N - 1’ contours, equally spaced between each pair of adjacent categories.

So if you had categories 1, 2, and 3, there will be lines drawn at 1.5 and 2.5.

It creates a spaced grid of all of the x and y values, so it can interpolate to find the midpoint between adjacent categories.

Or something like that.
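
The interpolation idea can be checked by hand on the quadrant example from earlier in the thread. This is a sketch under the assumption that contour places a level by linear interpolation between neighbouring grid values, which is how I understand matplotlib's contouring to work. It takes one row of Z from the upper half of the square, where the label jumps from 3 to 0, and finds where the straight line between those two values passes through level 1.

```python
import numpy as np

x = np.linspace(1, 10)  # same 50-point grid as in the post

# One row of Z from the upper half (y > 5): quadrant 3 on the left, 0 on the right
z_row = np.where(x >= 5.0, 0.0, 3.0)

# Find the single grid cell where the label changes
k = int(np.flatnonzero(np.diff(z_row) != 0)[0])
x_left, x_right = x[k], x[k + 1]
z_left, z_right = z_row[k], z_row[k + 1]

# Linear interpolation: fraction of the cell at which the value equals the level
level = 1.0
t = (level - z_left) / (z_right - z_left)
x_cross = x_left + t * (x_right - x_left)

print(x_left, x_cross, x_right)
```

Under that assumption, level 1 is crossed inside the cell between the 3 region and the 0 region, two thirds of the way across. That would account for a line in the upper half with levels = [1], even though no grid point there has Z equal to 1.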

Thanks for the effort. This is a feature of contour that I was not aware of until now.

Hi, I have a quick question related to regularization: why don’t we apply regularization to the output layer?

You can, but in general it isn’t of much benefit.

Sorry, I read your question incorrectly. I thought you asked about “normalization”.

The cost has two components. One is the error in the predictions, and the other is the magnitude of the weight matrices.

In a simple NN with one hidden layer, there are two weight matrices:

  • one that connects the input to the hidden layer,
  • and one that connects the hidden layer to the output predictions.

Both of them are included in the regularized part of the cost.

Can you say more about what exactly you mean by “apply regularization to the output layer”?

======

By the way, your question is off-topic for this thread. This thread is about decision boundaries. It’s a better idea if you start a new thread for each new topic.