I tried two other types of normalisation & standardisation apart from the one defined in the assignment workbook. Please refer to the outputs for all three below:
-
Default given in the assignment, i.e., divide each channel value by 255
a) Cost output
b) Cost vs learning rate
-
Per-channel normalisation and standardisation, using all data for that channel
a) Cost output
b) Cost vs learning rate
-
Per-image, per-channel normalisation and standardisation
a) Cost output
b) Cost vs learning rate
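For reference, the default workbook approach (option 1) is just a rescaling of the raw pixel values; a minimal sketch, assuming arrays shaped `(m, height, width, channels)` like the assignment's `train_set_x_orig` / `test_set_x_orig` (the dummy data here is only for illustration):

```python
import numpy as np

# Dummy stand-ins for the assignment arrays: (m, height, width, channels)
train_set_x_orig = np.random.randint(0, 256, (5, 64, 64, 3))
test_set_x_orig = np.random.randint(0, 256, (2, 64, 64, 3))

# Default workbook scaling: divide every pixel value by 255
train_set_x = train_set_x_orig / 255.0
test_set_x = test_set_x_orig / 255.0
```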
As we can see, the cost improves faster and further going from 1 to 2 to 3, but test accuracy does not improve (or even degrades). Can we conclude that the applied normalisation & standardisation is not doing any good, since test accuracy is not improving and, on top of that, the model is overfitting more (train accuracy reached 100%)?
Also, please help me validate whether normalisation and standardisation are applied correctly before flattening, and please suggest if there is a better way of doing it.
- Before flattening the array, normalise & standardise the data channel-wise, considering one channel at a time
train_channel_mean = []
train_channel_std = []
test_channel_mean = []
test_channel_std = []
for i in range(train_set_x_orig.shape[3]):
    train_channel_mean.append(np.mean(train_set_x_orig[:,:,:,i]))
    train_channel_std.append(np.std(train_set_x_orig[:,:,:,i]))
    test_channel_mean.append(np.mean(test_set_x_orig[:,:,:,i]))
    test_channel_std.append(np.std(test_set_x_orig[:,:,:,i]))
train_channel_mean = np.array(train_channel_mean).reshape(1,1,1,train_set_x_orig.shape[3])
train_channel_std = np.array(train_channel_std).reshape(1,1,1,train_set_x_orig.shape[3])
test_channel_mean = np.array(test_channel_mean).reshape(1,1,1,test_set_x_orig.shape[3])
test_channel_std = np.array(test_channel_std).reshape(1,1,1,test_set_x_orig.shape[3])
# normalize and standardize the data
train_set_x_orig = (train_set_x_orig - train_channel_mean) / train_channel_std
test_set_x_orig = (test_set_x_orig - test_channel_mean) / test_channel_std
# print(train_set_x_orig[0,:,:,0].shape)
# print(train_channel_mean.shape)
# print(train_set_x_orig.shape)
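As a side note, the same per-channel statistics can be computed without the Python loop and reshape. This is a minimal sketch using NumPy's `axis` and `keepdims` arguments (dummy data stands in for the assignment arrays):

```python
import numpy as np

# Dummy stand-in for train_set_x_orig: (m, height, width, channels)
train_set_x_orig = np.random.randint(0, 256, (5, 64, 64, 3)).astype(np.float64)

# Mean/std over all images and pixels, separately per channel;
# keepdims=True yields shape (1, 1, 1, channels), so broadcasting
# against the (m, h, w, channels) array works directly.
train_channel_mean = train_set_x_orig.mean(axis=(0, 1, 2), keepdims=True)
train_channel_std = train_set_x_orig.std(axis=(0, 1, 2), keepdims=True)
train_set_x = (train_set_x_orig - train_channel_mean) / train_channel_std
```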
- Per-image, per-channel normalisation and standardisation
train_x_channel_mean = []
train_x_channel_std = []
test_x_channel_mean = []
test_x_channel_std = []
for train_image_index in range(m_train):
    train_x_channel_mean.append([])
    train_x_channel_std.append([])
    for i in range(train_set_x_orig.shape[3]):
        train_x_channel_mean[-1].append(np.mean(train_set_x_orig[train_image_index,:,:,i]))
        train_x_channel_std[-1].append(np.std(train_set_x_orig[train_image_index,:,:,i]))
for test_image_index in range(m_test):
    test_x_channel_mean.append([])
    test_x_channel_std.append([])
    for i in range(test_set_x_orig.shape[3]):
        test_x_channel_mean[-1].append(np.mean(test_set_x_orig[test_image_index,:,:,i]))
        test_x_channel_std[-1].append(np.std(test_set_x_orig[test_image_index,:,:,i]))
# print(train_x_channel_mean[0])
train_x_channel_mean = np.array(train_x_channel_mean).reshape(m_train,1,1,train_set_x_orig.shape[3])
train_x_channel_std = np.array(train_x_channel_std).reshape(m_train,1,1,train_set_x_orig.shape[3])
test_x_channel_mean = np.array(test_x_channel_mean).reshape(m_test,1,1,test_set_x_orig.shape[3])
test_x_channel_std = np.array(test_x_channel_std).reshape(m_test,1,1,test_set_x_orig.shape[3])
train_set_x_orig = (train_set_x_orig - train_x_channel_mean) / train_x_channel_std
test_set_x_orig = (test_set_x_orig - test_x_channel_mean) / test_x_channel_std
# print(train_set_x_orig[0,:,:,:])
# print(train_x_channel_mean[0,:,:,0].shape)
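The per-image, per-channel version can likewise be written without the nested loops; a sketch under the same array-shape assumptions, with dummy data for illustration:

```python
import numpy as np

# Dummy stand-in for train_set_x_orig: (m, height, width, channels)
train_set_x_orig = np.random.randint(0, 256, (5, 64, 64, 3)).astype(np.float64)

# Mean/std over height and width only -> one value per image per channel;
# keepdims=True yields shape (m, 1, 1, channels) for direct broadcasting.
per_image_mean = train_set_x_orig.mean(axis=(1, 2), keepdims=True)
per_image_std = train_set_x_orig.std(axis=(1, 2), keepdims=True)
train_set_x = (train_set_x_orig - per_image_mean) / per_image_std
```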
Apologies for such a long question.