Deep Neural Network - Application

What happens inside the functions that run the trained DNN model on the test and training sets? Why do they execute so quickly compared to training the model?

Since evaluating the model does not involve any training, only the “forward propagation” is performed, and it only happens once per example.


You don’t have to wonder what happens: we’re just using the functions that we wrote in the previous Step by Step assignment. Take a look at the code in the predict function in the previous exercise. As Tom says, when you’re using the model to make a prediction, you just run forward propagation once, although you can feed it a matrix of samples so that you get multiple predictions in that one pass of forward propagation.

We just wrote the code to do the training here in this assignment, so remember what it does in each training iteration:

  1. Forward propagation on all samples.
  2. Backward propagation on all samples.
  3. Apply the updates to the parameters.

So each training iteration is roughly 2 or 3 times as expensive as a single run of predict. But we then repeat steps 1-3 for 2500 iterations (or however many training iterations you do), and that repetition is the costly part of training, of course.
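To make that concrete, here is a minimal sketch of the three-step loop for a single-layer (logistic regression) model. The helper names and the one-layer simplification are mine, not the assignment’s actual L-layer functions:

```python
import numpy as np

def forward_propagation(X, W, b):
    # Step 1: one forward pass over all m samples at once (sigmoid activation)
    return 1.0 / (1.0 + np.exp(-(W @ X + b)))

def backward_propagation(X, Y, A):
    # Step 2: gradients of the cross-entropy cost w.r.t. W and b
    m = X.shape[1]
    dZ = A - Y
    return (dZ @ X.T) / m, np.sum(dZ) / m

def train(X, Y, W, b, learning_rate=0.1, num_iterations=2500):
    for _ in range(num_iterations):
        A = forward_propagation(X, W, b)        # step 1
        dW, db = backward_propagation(X, Y, A)  # step 2
        W = W - learning_rate * dW              # step 3: apply the updates
        b = b - learning_rate * db
    return W, b
```

Prediction-time cost is just one call to forward_propagation; training repeats all three steps num_iterations times, which is where the expense comes from.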


So when a model is ready for a real world application, would only one input (“example”) be fed into the model at one time or would a batch of inputs be fed into the model? Also, do you know what the typical hardware requirements are for one of these runs or “forward propagation” iterations?

Are models retrained when their usefulness in real-world applications somehow declines, or if they need to be adjusted to be applied in a different environment, or say the environment changes? Or is it more that once you train a model, it is set and you release it for a particular use case?

I hope my questions are clear and make sense. I’m just curious about the practical applications of AI in the real world and trying to understand the concepts. Thank you for your help.

Either way - the number of examples doesn’t really matter. It depends on how many predictions you want to make.


Yes.

If your training data is no longer valid, or if you obtain more training data, you would re-train the model.


Thanks for getting back to me so quickly. I have several more questions.

Do engineers retrain models in real time for certain applications, say, when new data comes in? Or is this infeasible?
If a model is retrained on new data, is this called “personalization” in some use cases?
And finally, would it be infeasible for a model to be retrained if you only acquired a few more examples versus a large number of examples?

Exactly. You’re just invoking the predict function with an input matrix X, which has dimensions n_x x m, where n_x is the number of features in each sample and m is the number of samples. m can be 1 or any other number bigger than that. So the code is exactly the same either way and it just depends on what your goal is.
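A quick sketch of that shape convention. The weights here are made-up toy values, and this hypothetical one-layer predict is a stand-in for the course’s version, but it shows that the same code handles m = 1 or a whole batch:

```python
import numpy as np

def predict(X, W, b):
    # X has shape (n_x, m): n_x features per column, m samples.
    # One pass of forward propagation produces all m predictions at once.
    A = 1.0 / (1.0 + np.exp(-(W @ X + b)))  # sigmoid activations, shape (1, m)
    return (A > 0.5).astype(int)            # threshold to 0/1 labels

W = np.array([[2.0, -1.0]])  # toy "trained" weights, n_x = 2
b = 0.5

one_sample = np.array([[1.0], [0.0]])       # shape (2, 1): m = 1
batch = np.array([[1.0, 0.0], [0.0, 3.0]])  # shape (2, 2): m = 2
print(predict(one_sample, W, b))  # one prediction, shape (1, 1)
print(predict(batch, W, b))       # two predictions from the same code, shape (1, 2)
```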

Albert, I hope you did as I suggested and went back and read the code for the predict function.

There are ways to do retraining “on the fly”, but I have not seen those covered in any of the courses here. Maybe the MLOps specialization would discuss some of those topics, but I have not taken it, and it is currently offline to be revised.

Maybe we will get lucky and someone else who is familiar with some of those techniques will notice this thread and chime in.

But in general, if you have a large initial training set and only a few more examples that don’t work well with the existing model, you can try retraining with the augmented training set. What happens will depend on the numbers and how different the new examples are. I don’t know of a way to predict in general whether that would make a difference or not. Like many things in ML, the answer is that you have to try it and see what happens.

I didn’t see a predict function when I went back and looked over the Building_your_Deep_Neural_Network_Step_by_Step lab.

I did see Figure 1 with the ‘predict’ at the end of the diagram, but couldn’t find the function.

Never mind. I found the predict function in the files in the Deep Neural Network-Application lab.


In this assignment, the predictions are computed by L_model_forward().

You can see that in the docstring.

More typically there would be a separate “predict()” function.
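For illustration, here is a sketch of what such a separate predict() wrapper looks like. This L_model_forward is a minimal stand-in (ReLU hidden layers, sigmoid output) for the assignment’s version, not the actual course code:

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def relu(Z):
    return np.maximum(0, Z)

def L_model_forward(X, parameters):
    # Minimal stand-in for the assignment's L_model_forward:
    # ReLU for layers 1..L-1, sigmoid for the output layer L.
    L = len(parameters) // 2
    A = X
    for l in range(1, L):
        Z = parameters["W" + str(l)] @ A + parameters["b" + str(l)]
        A = relu(Z)
    ZL = parameters["W" + str(L)] @ A + parameters["b" + str(L)]
    return sigmoid(ZL), None  # (activations, caches placeholder)

def predict(X, parameters):
    # The typical separate predict(): forward prop once, then threshold at 0.5
    AL, _ = L_model_forward(X, parameters)
    return (AL > 0.5).astype(int)
```

Keeping predict() separate means prediction code never touches the caches or gradients that training needs.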


Oh, sorry, you’re right: it’s just given to us in the Week 4 Application exercise. I was thinking of Week 2 (Logistic Regression) and Week 3 (Planar Data) where we wrote a predict function in both those cases. The actual function definitions are slightly different in the 3 cases, but the fundamental behavior is the same: it just runs forward propagation for whatever the model is and then converts the sigmoid probability outputs to 0 or 1.

@siral they don’t let me play with Enterprise grade toys yet, so I won’t proclaim yet to know all the ins and outs of how all this works, but ‘on the fly’ retraining is more commonly known as ‘online learning’ or ‘continual learning’.

Though much of the discussion is on online inferencing, Chip Huyen (a former student of Prof. Ng, I believe) has a chapter on this (Ch. 9, “Continual Learning and Test in Production”) in her excellent book Designing Machine Learning Systems.

In addition she has set up a general discussion Discord to go along with the text. There are some really smart people over there, but one in particular comes to mind, a user that goes by the handle ‘MattrixOperations’.

I think if you asked nicely and respectfully (i.e. not pestering), they would be able to answer/address all your related questions.
