Hello, I am quite confused about the answer to the quantification of how much the features vary in the assignment. In the function covariance_matrix_from_examples:

def covariance_matrix_from_examples(examples):
    """
    Helper function for get_top_covariances to calculate a covariance matrix.
    Parameter: examples: a list of steps corresponding to samples of shape (2 * grad_steps, n_images, n_features)
    Returns: the (n_features, n_features) covariance matrix from the examples
    """
    # Hint: np.cov will be useful here - note the rowvar argument!
    ### START CODE HERE ###
    return np.cov(examples.reshape(-1, examples.shape[2]), rowvar=False)
    ### END CODE HERE ###

Why would the answer reshape across all of the gradient steps? Wouldn't we just use the last step, i.e. the final result after 30 gradient steps? I also think reshaping with -1 doesn't make sense here, because it merges the grad_steps dimension with the n_images dimension.

The general idea is that for the images we created by making a series of slight changes to one feature, we record how much each classifier's output changed from its original value. This function, covariance_matrix_from_examples(), takes that collection of classification changes and builds a covariance matrix showing how the classifier outputs vary relative to each other. get_top_covariances() can then use this covariance matrix to pick out the classifiers that most closely track our target feature. Every (step, image) pair is a valid observation of how the features move together, which is why all of the steps are kept rather than just the final one.

The description of the examples parameter in the covariance_matrix_from_examples() comment is a bit misleading, and I'm guessing that's what's confusing you. Rather than a "list of steps," it's more accurate to call it a list of classification changes. The shape is as the comment says, (2 * grad_steps, n_images, n_features), where 2 * grad_steps is the number of images generated from each of the n_images starting images (grad_steps of them in the additive direction and another grad_steps in the subtractive direction).
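To make the reshape concrete, here's a minimal sketch with made-up dimensions (the random data and sizes are purely illustrative, not from the assignment). Flattening the first two axes with -1 turns every (step, image) pair into one observation row, which is exactly the layout np.cov expects when rowvar=False:

```python
import numpy as np

# Illustrative sizes: 3 gradient steps in each direction, 4 starting images, 5 features
grad_steps, n_images, n_features = 3, 4, 5
rng = np.random.default_rng(0)
examples = rng.normal(size=(2 * grad_steps, n_images, n_features))

# Collapse (2 * grad_steps, n_images) into a single observation axis:
# (6, 4, 5) -> (24, 5). Each row is one observation of the 5 features.
flat = examples.reshape(-1, examples.shape[2])

# rowvar=False tells np.cov that columns are variables (features)
# and rows are observations, so the result is (n_features, n_features).
cov = np.cov(flat, rowvar=False)
print(cov.shape)  # (5, 5)
```

Using only the final step would leave just n_images observations per feature, a much noisier estimate; pooling all the intermediate steps gives np.cov 2 * grad_steps * n_images samples to work with.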