Some suggestions to improve the exercise “Logistic Regression with a Neural Network mindset”
Overview of the Problem set
This overlaps “Exercise 1”, but we could have a describe() procedure to introduce the data handed to us, like this:
import numpy as np
from lr_utils import load_dataset  # load_dataset() is defined in the task-specific "lr_utils.py"

# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

def describe(obj, name):
    if isinstance(obj, np.ndarray):
        print(f"{name} is a {type(obj)} of shape {obj.shape}")
    else:
        print(f"{name} is a {type(obj)}, that's all I know")

describe(train_set_x_orig, "train_set_x_orig")
describe(train_set_y, "train_set_y")
describe(test_set_x_orig, "test_set_x_orig")
describe(test_set_y, "test_set_y")
Output:
train_set_x_orig is a <class 'numpy.ndarray'> of shape (209, 64, 64, 3)
etc.
Even though the ndarray is “just a rectangular, n-dimensional bunch of numbers”, one can see the implied hierarchy of objects (an idea that does not exist in the mathematical object itself): image → row → pixel (column) → colors (RGB), although it is not immediately clear from the shape alone whether the row or the column is higher up in the hierarchy.
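A quick experiment settles the row/column question. The zero array below is a hypothetical stand-in with the same axis layout as train_set_x_orig; indexing one step deeper peels off image, then row, then column, leaving the RGB triple last:

```python
import numpy as np

# Hypothetical stand-in for train_set_x_orig: same axis layout, fewer images.
images = np.zeros((5, 64, 64, 3), dtype=np.uint8)

print(images[0].shape)          # one image: (64, 64, 3)
print(images[0, 10].shape)      # one row of that image: (64, 3)
print(images[0, 10, 20].shape)  # one pixel: its 3 RGB values -> (3,)
```

So the row axis comes before the column axis: axis 1 selects a row, axis 2 a column within it.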
In “Example of a picture”, the code should be collected into a single procedure (always write procedures, even if the language doesn’t demand it). One should also use string interpolation in print (this applies to every print statement on the page, really), and the “class name retrieval” should be made more explicit; I spent some time puzzling over what it was about. Here we go:
def retrieve(index):
    image = train_set_x_orig[index]
    ylabel_arr = train_set_y[:, index]        # an ndarray of shape (1,)
    ylabel_int = int(np.squeeze(ylabel_arr))  # squeeze to a 0-d ndarray, then convert to a plain int
    class_name = classes[ylabel_int].decode('utf-8')  # look up the name in the list of classes: 'cat'/'non-cat'
    plt.imshow(image)
    print(f"y = {ylabel_int}, it's a '{class_name}' picture.")

retrieve(25)    # a cat
# retrieve(20)  # a non-cat
Now the student can easily retrieve images.
In the next exercise, we should again prefer the power of string interpolation to make code more readable:
print (f"Number of training examples: m_train = {m_train}")
print (f"Number of testing examples: m_test = {m_test}")
print (f"Height/Width of each image: num_px = {num_px}")
print (f"Each image is of size: ({num_px}, {num_px}, 3)")
print (f"train_set_x shape: {train_set_x_orig.shape}")
print (f"train_set_y shape: {train_set_y.shape}")
print (f"test_set_x shape: {test_set_x_orig.shape}")
print (f"test_set_y shape: {test_set_y.shape}")
In Exercise 2, I was puzzled by the reshape instruction. But then I found out that:
A trick when you want to flatten a matrix X of shape (a,b,c,d) to a matrix X_flatten of shape (b∗c∗d, a) is to use:
X_flatten = X.reshape(X.shape[0], -1).T
Because reshape accepts the target shape either as separate integers or as a single tuple, this is the same as
X_flatten = X.reshape((X.shape[0], -1)).T
which tells reshape to put X.shape[0] entries along the first axis, and to infer (from the -1) whatever size the second axis needs to hold all remaining cells.
So maybe the remarks in the code should be extended to include the above.
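To back that remark up, here is a toy demonstration. The array contents are made up; only the shapes matter:

```python
import numpy as np

# shape (a, b, c, d) = (2, 3, 4, 2)
X = np.arange(2 * 3 * 4 * 2).reshape(2, 3, 4, 2)

X_flatten = X.reshape(X.shape[0], -1).T  # -1 tells reshape to infer 3*4*2 = 24
print(X_flatten.shape)  # (24, 2)

# Passing the sizes as a single tuple is equivalent:
same = X.reshape((X.shape[0], -1)).T
print(np.array_equal(X_flatten, same))  # True
```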
Again in the exercise, prefer string interpolation for readability:
print (f"train_set_x_flatten shape: {train_set_x_flatten.shape}")
print (f"train_set_y shape: {train_set_y.shape}")
print (f"test_set_x_flatten shape: {test_set_x_flatten.shape}")
print (f"test_set_y shape: {test_set_y.shape}")
In Exercise 4, “Building the parts of our algorithm”, the pseudo-code given seems confusing. It looks like the “non-batch” (single-example) version, whereas up to now we have processed the whole batch of training examples in one weight-update step: computing the cost over all examples rather than a per-example loss, and the gradient of that cost with respect to the weight parameters rather than the gradient of a single loss.
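For comparison, the batch version we have used so far can be sketched in a few vectorized lines. The names w, b, X, Y follow the exercise’s conventions, but this is my own illustration, not the notebook’s reference solution:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate(w, b, X, Y):
    """One batch step: X is (n_x, m), Y is (1, m), w is (n_x, 1)."""
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)  # activations for ALL m examples at once
    # cost over the whole batch (mean cross-entropy)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = X @ (A - Y).T / m    # gradient of the cost w.r.t. w
    db = np.sum(A - Y) / m    # gradient of the cost w.r.t. b
    return dw, db, cost
```

A single weight update then uses the whole batch: w -= learning_rate * dw and b -= learning_rate * db.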
Below that, printing the sigmoid can be done more nicely with
print(f"sigmoid([0, 2]) = {sigmoid(np.array([0, 2]))}")
In fact, we can make this more flexible:
def apply_sigmoid(z):
    if isinstance(z, (list, tuple)):
        nz = np.array(z)
    elif isinstance(z, (int, float)):
        nz = np.array([z])
    elif isinstance(z, np.ndarray):
        nz = z
    else:
        print(f"Can't handle {type(z)}")
        return
    print(f"sigmoid({nz}) = {sigmoid(nz)}")
apply_sigmoid(0) # scalar
apply_sigmoid([0,1]) # list
apply_sigmoid((0,2)) # tuple
apply_sigmoid(np.array([0,2])) # numpy array of int
apply_sigmoid(np.array([0.5,0,2.0])) # numpy array of float
Then:
sigmoid([0]) = [0.5]
sigmoid([0 1]) = [0.5 0.73105858]
sigmoid([0 2]) = [0.5 0.88079708]
sigmoid([0 2]) = [0.5 0.88079708]
sigmoid([0.5 0. 2. ]) = [0.62245933 0.5 0.88079708]
There are a few other places where string interpolation can simplify code.
Before “predict”, I suggest adding this note:
Always use m[i, j] for accessing elements in a NumPy array. While m[i][j] works, it is less efficient and less idiomatic in NumPy.
because I was unsure about the access notation.
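A two-line demonstration would make the note concrete (the array is arbitrary): both spellings return the same element, but m[i][j] performs two indexing operations, materializing the row m[1] first.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

print(m[1, 2])  # 6 -- one indexing operation
print(m[1][2])  # 6 -- same value, but m[1] builds a temporary row first
```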
In “predict”, there is a line
w = w.reshape(X.shape[0], 1)
which seems useless to me, as w already has that shape.
Note that for filling in predict, one can point the student to the NumPy floor() function; the output range of the logistic function is well suited to that approach.
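One possible reading of that hint (my interpretation, not necessarily the intended solution): since the sigmoid output lies strictly between 0 and 1, np.floor(A + 0.5) thresholds it at 0.5 in one vectorized call.

```python
import numpy as np

A = np.array([[0.1, 0.49, 0.5, 0.93]])  # made-up sigmoid activations
Y_prediction = np.floor(A + 0.5)        # 1 where A >= 0.5, else 0
print(Y_prediction)  # [[0. 0. 1. 1.]]
```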
Finally, we are told to display the picture of a single failed entry in the test set, in a hard-to-understand manner. This being completely unfun, I suggest displaying all of the failed tests at once. Here is a reasonable (empirical) attempt, which displays the failures in a grid four images wide:
def find_failed_tests(testset_Y_predicted, testset_Y_expected):
    failed_tests = []  # collect indexes
    assert testset_Y_predicted.shape[1] == testset_Y_expected.shape[1], \
        "predicted and expected test result counts must be equal"
    num_tests = testset_Y_predicted.shape[1]
    for testset_i in range(num_tests):
        class_predicted_num = int(testset_Y_predicted[0, testset_i])
        class_expected_num = int(testset_Y_expected[0, testset_i])
        # print(f"Predicted class: {class_predicted_num}, Expected class: {class_expected_num}")
        if class_predicted_num != class_expected_num:
            # store the index and outcome of the failed prediction
            class_predicted_text = classes[class_predicted_num].decode("utf-8")
            class_expected_text = classes[class_expected_num].decode("utf-8")
            # print(f"Predicted class: {class_predicted_text}, Expected class: {class_expected_text}")
            failed_tests.append([testset_i, class_predicted_text, class_expected_text])
    return failed_tests
def display_in_grid(failed_tests, test_set_x):
    width = 4
    height = (len(failed_tests) + (width - 1)) // width  # rows needed, rounded up
    # matplotlib.pyplot.subplots: figsize is the size of the whole figure in inches;
    # for some reason, 10 x 10 inches seems best, entirely empirical!
    # subplots() takes (nrows, ncols), and squeeze=False keeps axes 2-D even with one row
    fig, axes = plt.subplots(height, width, figsize=(10, 10), squeeze=False)
    for i in range(height):
        for j in range(width):
            index = j + i * width
            if index < len(failed_tests):
                testset_i = failed_tests[index][0]
                pic_data = test_set_x[:, testset_i].reshape((num_px, num_px, 3))
                class_predicted_text = failed_tests[index][1]
                class_expected_text = failed_tests[index][2]
                axes[i, j].imshow(pic_data)
                axes[i, j].set_title(f"Test pic {testset_i}.\nPredicted '{class_predicted_text}'\nExpected '{class_expected_text}'")
                axes[i, j].axis('off')
            else:
                axes[i, j].axis('off')
    plt.tight_layout()
    plt.show()
testset_Y_predicted = logistic_regression_model["Y_prediction_test"]
testset_Y_expected = test_set_y
failed_tests = find_failed_tests(testset_Y_predicted,testset_Y_expected)
print(f"Found {len(failed_tests)} failed tests")
display_in_grid(failed_tests, test_set_x)
Aaannnd… that’s about it. Thank you for reading!