Clarification about the formula for sigma

Hi!

In the “Normalizing inputs” videos of C2W1, Prof. Ng gives the formula for σ as: [image]

I just want to confirm: in this formula, does X denote the input after the mean has been subtracted from it, and not the original input X?
In other words, does the second step depend on the result of the first one or not?

Another question: when I read a bit about normalization in external sources, I found a website that cites the formula for σ as: [image]

So I wanted to ask, which formula is correct and should be used while making models of our own? The one Prof. Ng tells us about, or the second one I dug out?

Also, on a side note, can you please tell me how to embed HTML, Markdown or LaTeX in these questions?

Thanks for your question.

Regarding the question of whether to divide by “n” or “n-1”: note that for large “n”, both expressions converge to the same result.

Since the standard deviation describes the “spread” of a distribution, I personally think n-1 makes a lot of sense. With n-1, the division is undefined (and not meaningful) for n=1. This makes total sense, since a single sample on its own has no spread.

https://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1008&context=imseteach

You might also want to take a look at Bessel’s correction (see the Wikipedia article “Bessel's correction”).
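To make the n versus n-1 comparison concrete, here is a minimal NumPy sketch. The sample values are made up for illustration; `ddof` is NumPy's "delta degrees of freedom" argument, where `ddof=0` divides by n and `ddof=1` divides by n-1.

```python
import numpy as np

# Hypothetical sample of a single input feature (values chosen only for illustration)
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

# Divide by n (ddof=0 is NumPy's default)
std_n = np.std(x, ddof=0)

# Divide by n - 1 (Bessel's correction)
std_n_minus_1 = np.std(x, ddof=1)

# The two results differ only by the factor sqrt(n / (n - 1)),
# which approaches 1 as n grows
ratio = std_n_minus_1 / std_n

print(std_n, std_n_minus_1, ratio)
```

For the thousands of examples typical of a training set, that factor is so close to 1 that either choice should give practically the same normalized inputs.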


Please also note that in Prof. Ng’s video, he talks about the variance at 03:10: https://www.coursera.org/lecture/deep-neural-network/normalizing-inputs-lXv6U

Since the variance is the standard deviation squared, I think the left-hand side of the first formula in your post should rather be “sigma squared”.
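On the question of whether the second step depends on the first: as the lecture presents it, the mean is subtracted first and the variance is then computed from the already centered inputs. A minimal sketch of that two-step procedure follows, assuming the course's layout of features in rows and examples in columns; the data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 2 features (rows) x m = 1000 examples (columns)
X = rng.normal(loc=[[3.0], [-1.0]], scale=[[2.0], [0.5]], size=(2, 1000))
m = X.shape[1]

# Step 1: subtract the per-feature mean
mu = X.sum(axis=1, keepdims=True) / m
X_centered = X - mu

# Step 2: variance computed from the *centered* data (element-wise square)
sigma2 = (X_centered ** 2).sum(axis=1, keepdims=True) / m

# Divide by the standard deviation, i.e. the square root of the variance
X_norm = X_centered / np.sqrt(sigma2)

# Skipping step 1 would give the mean of squares, which is too large by mu**2
sigma2_uncentered = (X ** 2).sum(axis=1, keepdims=True) / m

print(sigma2.ravel(), sigma2_uncentered.ravel())
```

So in this sketch the second step does use the result of the first: applied to the original X, the same expression would overestimate the variance by the squared mean.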

Best regards
Christian


@Utkarsh2707

I believe statistics gives some guidance on whether to use “n” or “n-1”.
If we only want the standard deviation of the samples we actually have, “n” is used. If those samples come from a larger “population” and we want to estimate the standard deviation of that population, we use “n-1”, since we do not have all the data. Dividing by n-1 makes the variance estimate unbiased, which is why this version is often called the “unbiased standard deviation”.


Also, on a side note, can you please tell me how to embed HTML, Markdown or LaTeX in these questions?

It works the same way as including LaTeX in other Markdown documents. For inline LaTeX, start with a single $ and end with a single $: typing "$\sigma =$" is rendered as σ =. For multi-line LaTeX, start with $$ and end with $$.
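As a small illustration of the syntax described above, this is what one might type into a post. The formula is only an example (the variance of mean-subtracted inputs over m examples) and is not taken from the images in this thread.

```latex
Inline: the standard deviation is $\sigma = \sqrt{\sigma^2}$.

Display:
$$
\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)} \right)^2
$$
```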


Thank you @Christian_Simonis and @anon57530071 for your answers.
Really appreciate them!
