Confused over number of iterations and number of weight updates

FYI, that’s really an inefficient way to approach what should be treated as a vector algebra task. You don’t need to compute each weight separately; you can update them all in one line of code, and then you don’t need that for-loop at all.

This sort of “vectorization” will be covered later in the course.
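A minimal sketch of what that vectorized update might look like, assuming NumPy arrays for the weights and the gradient (the names w, b, alpha, dj_dw, and dj_db are illustrative, not taken from the course code):

```python
import numpy as np

w = np.zeros(4)          # 4 weight parameters
b = 0.0
alpha = 0.01             # learning rate

# Example gradient values (made up for illustration)
dj_dw = np.array([0.5, -1.2, 0.3, 0.8])
dj_db = 0.1

# One vectorized line updates every weight simultaneously,
# replacing a per-weight for-loop:
w = w - alpha * dj_dw
b = b - alpha * dj_db
```

The same line works unchanged whether there are 4 weights or 4 million, which is the point of vectorization.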

I don’t know if you have taken the Deep Learning Specialization. Machine learning doesn’t just involve computing weights; it involves neural networks, where each neuron is a unit through which these parameters are passed in different ways during model training.

You will understand this part of deep learning more in the DLS specialization.

Weights are not just about gradient updates; they are also about achieving a minimized cost function, which is significant relative to the dataset, its features, and the other methods used.

As the mentor mentioned to you, you will gain this understanding in further courses.

So for K iterations of the for-loop, there will be 4 * K weight parameter updates to w_0, w_1, w_2, and w_3 in total, and K updates of the bias b?

Correct?

I am curious why every one of your messages asks the reader to agree with your statement.

It depends on what you define as an ‘update’.

The way that scikit-learn seems to count “weight updates”, it is only counting the update cycles, regardless of the number of weights.

Notice that in my reply here from 13 hours ago, the number of weights did not matter, given the way scikit-learn reports them.


Again, I want to re-state that I fear you are focusing on a minor point that is not significant in the larger scope of the machine learning introduction course.

I think that counting the weight updates is not worthy of this much energy expenditure.

Please can you say if you agree with my last conclusion?

Thank you.

I have nothing new to add to this thread. Perhaps another mentor or community member can take up the discussion.

After reading through the comments, I share the sentiment of @TMosh. It’s sufficient to know the big O notation of the algorithm (in some rare cases it’s good to know the constants that precede the mathematical terms like polynomials, exponentials, logarithms, …), but not to the extent of insignificant constants. It’s good enough to represent the complexity in terms of big O of the number of parameters, the number of samples, and the number of iterations.

Thank you for your reply.

However, I still do not understand why the number of weight updates is over 12,000 using scikit-learn when the number of iterations is 124 and the number of weight parameters is 4. There is only ONE weight update per iteration in the Python code, which implements the following pseudo-code for just one weight parameter:

w_0 = w_0 - alpha * gradient_function(w, X, b, Y)

So there are only 4 * 124 = 496 weight updates in total.
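One possible explanation, assuming the lab fits scikit-learn's SGDRegressor: its t_ attribute counts one weight update per training sample per epoch (plus an initial one), regardless of how many weight parameters there are. So 124 epochs over roughly 100 training samples would be reported as on the order of 12,400 "updates", not 496. A sketch with made-up data:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Made-up dataset: 100 samples, 4 features (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

sgdr = SGDRegressor(max_iter=1000, random_state=0)
sgdr.fit(X, y)

# n_iter_ is the number of epochs actually run;
# t_ counts per-sample update steps, not per-epoch cycles,
# so it is roughly n_iter_ * n_samples.
print(sgdr.n_iter_)
print(sgdr.t_)
```

Under this reading, scikit-learn's count and the 4 * 124 hand count measure different things: per-sample SGD steps versus per-epoch parameter updates.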

Hi @ai_is_cool
I think I understand what’s confusing you! Pay attention! The training set is not the one from the slides. Try printing x_train and you’ll understand. Happy learning!

What slides? There are only video presentations.

You don’t understand the question I posted.

There are as many weight updates as there are iterations in the epoch multiplied by the number of weights.

I think you are confusing weight updates with the weight-feature products used in evaluating the partial derivatives of the cost function.
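To make that distinction concrete, here is a hedged sketch of one batch gradient descent step (all names and data are illustrative): evaluating the gradient touches every weight-feature product, but the step itself performs only one update per weight.

```python
import numpy as np

X = np.arange(12.0).reshape(3, 4)   # 3 samples, 4 features (made up)
y = np.array([1.0, 2.0, 3.0])
w = np.zeros(4)
b = 0.0
alpha = 0.01

# Evaluating the gradient uses all 3 * 4 = 12 weight-feature products ...
err = X @ w + b - y
dj_dw = X.T @ err / len(y)
dj_db = err.mean()

# ... but the step itself is only 4 weight updates and 1 bias update.
w -= alpha * dj_dw
b -= alpha * dj_db
```

Counting the products gives a much larger number than counting the updates, which may be the source of the disagreement.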

As you say.
I’m able to obtain the number of weight updates by myself, without using scikit-learn’s sgdr.t_!
Anyway, I downloaded the slides too :wink: and in any case the videos show only slides…

I’m not confused trust me!

Have a good day!

I don’t know what you mean by “…slides…”.

There is only a video presentation in the MLS Course 1.

Do you know what a weight update operation is at this point in the course?

Try searching for “Download The Course Slides” in your browser.

That isn’t this thread’s query! If you need an answer to it, I think you need to open another thread!

Prof. Ng does not instruct his students to download “…slides…” in his course so I will follow his course instruction as presented on the MLS.

You didn’t answer my question about weight update operations.

I’m asking you this because it is pertinent to getting a correct answer from you to the initial question.

I don’t think you understand what a weight-update operation is.

I’m trying to tell you that you already have the answer in this thread, and it’s not mine!

I have the correct answer from my own observations of the code and algorithm not from your responses.

There is confusion, however, in Prof. Ng’s code, because it reports the number of weight updates incorrectly.

For the course, you only need to understand the number of iterations, IMHO.