In week one, the function `multivariate_gaussian(X, mu, var)` is used to calculate the probability values. One of the arguments is `var`, which in our case is a 1-D vector. So this code converts it to a shape of (n, n) so we can use it in the determinant calculation.

Changing it to an (n, n) matrix:

if var.ndim == 1:
    var = np.diag(var)
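To make the conversion concrete, here is what `np.diag` does to a variance vector, using made-up variance values (the off-diagonal entries come out as 0):

```python
import numpy as np

var = np.array([1.5, 2.0])   # hypothetical per-feature variances, shape (2,)
cov = np.diag(var)           # shape (2, 2); off-diagonal entries default to 0
print(cov)
# [[1.5 0. ]
#  [0.  2. ]]
print(np.linalg.det(cov))    # 1.5 * 2.0 = 3.0
```

For a diagonal matrix, the determinant is just the product of the diagonal entries.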

The determinant code, which is taken from the longer line of code:

`np.linalg.det(var)**(-0.5)`

However, `var` in this (n, n) matrix is missing the cov(x, y) values, which are the top-right and bottom-left entries of the matrix.

So they have the default value of 0.

So the code `np.linalg.det(var)**(-0.5)` will not calculate the correct determinant value, and hence all the probability values will be wrong?

Should it not be `covariance = np.cov(var, rowvar=False)` and then `np.linalg.det(covariance)**(-0.5)`?
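To illustrate the difference being asked about: `np.cov` applied to the data matrix (not to the variance vector) produces the full covariance matrix with nonzero off-diagonal entries, whereas the diagonal construction zeroes them out. A small sketch with made-up data (`bias=True` so both use the same variance convention):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                    # hypothetical data: 100 samples, 2 features

full_cov = np.cov(X, rowvar=False, bias=True)    # (2, 2), includes cov(x, y) off-diagonals
diag_cov = np.diag(X.var(axis=0))                # per-feature variances only, off-diagonals 0

# Same diagonal, different off-diagonals, so the determinants differ:
print(np.linalg.det(full_cov), np.linalg.det(diag_cov))
```

Since det(full_cov) = v1*v2 - c^2 for 2 features, any nonzero sample covariance c makes the full determinant strictly smaller than the diagonal one.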

I don’t work for DLAI (mentors are community volunteers), but I would guess that how quickly issues are addressed depends on their severity and the workload of the DLAI staff.

As your report is the only one in several years of this course being active, I’d guess it is not a high priority.

Not sure what you mean by that. As Tom says, the mentors are just fellow students. We are volunteers, we do not work for DLAI and we don’t get paid to do this. We did not create any of the course materials and we cannot change them.

Do you mean that you want to talk to someone on the Course Staff or who actually works for DeepLearning.AI?

Hmm. Confused myself. Let me work through the process as maybe I am barking up the wrong tree.

I am doing the Coursera Machine Learning Specialization course.

I have run into a problem with the course notes and I want to resolve that issue.

Part of the pre-course material pointed me here for any issues with the course material. Is this where I go for help in relation to the Coursera Machine Learning Specialization course?

I am not a mentor for that course but Tom sent me the source code for that function. It’s a given function in the utility file for that assignment.

They explain the meaning of the arguments in the "docstring" of the function. The third argument, `var`, gives the covariance matrix, but it can be given in two different forms:

If it is given in vector form, then they create the covariance matrix by using var as the variance in each dimension of the distribution and creating a diagonal matrix with those values as the diagonal.

But the function will also accept `var` as an n x n matrix, in which case it simply uses it "as is".
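Based on that description, a minimal sketch of what such a function might look like (the actual utility code in the assignment may differ in details):

```python
import numpy as np

def multivariate_gaussian(X, mu, var):
    """Return the multivariate Gaussian density for each row of X.

    var may be a 1-D vector of per-feature variances (converted to a
    diagonal covariance matrix) or a full (n, n) covariance matrix,
    which is used as-is.
    """
    n = mu.shape[0]
    if var.ndim == 1:
        var = np.diag(var)          # vector form -> diagonal covariance
    X = X - mu
    p = (2 * np.pi) ** (-n / 2) * np.linalg.det(var) ** (-0.5) * \
        np.exp(-0.5 * np.sum(X @ np.linalg.pinv(var) * X, axis=1))
    return p
```

Both calling conventions give the same result when the full matrix happens to be diagonal, since `np.diag(var)` and the equivalent (n, n) matrix are identical.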

Yes, it is. But we are only the first line of defense. There is no guarantee that we can answer every possible question that can be formulated. We will try to call for backup in cases where we can't answer, but there is no guarantee that will happen in as timely a fashion as you might wish.

Thanks for the reply. I think communication is important in any course.

The people who run this course expect me to pay my money for the right to do this course. Note that this is a two-way process, as I then expect to be taught appropriately.

Since I am paying, I would expect any issues that arise to be resolved in a timely manner.

Those notes are way, way over my head, BUT I managed to deduce that there are two forms of the covariance argument: diagonal and non-diagonal covariance. Which you use depends on your data spread. The course notes used diagonal covariance whereas I used non-diagonal covariance. It comes down to whether there is a relationship between the features or not: if there are no relationships you use diagonal covariance, otherwise non-diagonal.

You can see in the notebook that they use the `estimate_gaussian` function to compute the arguments that they later pass to `multivariate_gaussian`. What they do is compute the "elementwise" variance for each feature across all the samples in the input dataset. So apparently that "diagonal" version of covariance, which doesn't account for interactions between features, is good enough for the purposes here. The whole distribution is really being computed independently "per feature". At least that's my interpretation of how they are doing things.
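That per-feature computation can be sketched roughly like this (a guess at the shape of the course's `estimate_gaussian`, not the actual assignment code):

```python
import numpy as np

def estimate_gaussian(X):
    """Per-feature mean and variance, ignoring interactions between features."""
    mu = np.mean(X, axis=0)   # mean of each feature across all samples
    var = np.var(X, axis=0)   # elementwise variance, NOT the full covariance
    return mu, var
```

Because `var` comes back as a 1-D vector, it is exactly the form that triggers the `np.diag(var)` branch inside `multivariate_gaussian`, which is why the off-diagonal covariances end up as 0.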

The simple point, looking just at the implementation of that specific `multivariate_gaussian` function, is that it does not compute the covariance; it just takes it as an input.