Confused about Deep Network

CourseraFan · April 30, 2022, 4:26am

Hey guys. I’m a little confused about how a deep network works. During training it does forward prop, computes the cost, does backward prop, and updates the parameters, repeating those 4 steps num_iterations times on the training images. When it runs on the test set, does it have to do these 4 steps again num_iterations times? Or can someone maybe correct me? Thanks!

SainiAnkit · April 30, 2022, 4:56am

The network performs 4 steps during training to learn the patterns from the data.
During test time you perform the forward pass and compute cost/loss. You don’t perform the backward pass and hence no parameter updates.
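
To make the test-time part concrete, here is a minimal sketch that assumes a single-layer (logistic regression) model in NumPy. The function names, the "already trained" values of W and b, and the tiny test set are all made up for illustration; they are not taken from the course assignment.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W, b):
    """One forward pass of a single-layer model; examples are columns of X."""
    return sigmoid(W @ X + b)  # shape (1, m)

def compute_cost(A, Y):
    """Binary cross-entropy averaged over the m examples."""
    eps = 1e-12  # avoids log(0)
    return float(-np.mean(Y * np.log(A + eps) + (1 - Y) * np.log(1 - A + eps)))

def evaluate(X, Y, W, b):
    """Test time: one forward pass plus the cost; W and b are never modified."""
    A = forward(X, W, b)
    predictions = (A > 0.5).astype(int)
    accuracy = float(np.mean(predictions == Y))
    return compute_cost(A, Y), accuracy

# Hypothetical "already trained" parameters and a tiny made-up test set.
W = np.array([[2.0, -1.0]])
b = 0.0
X_test = np.array([[1.0, 0.0, 2.0],
                   [0.0, 3.0, 1.0]])  # 2 features, 3 examples
Y_test = np.array([[1, 0, 1]])

cost, accuracy = evaluate(X_test, Y_test, W, b)
```

Note that `evaluate` is called once: there is no loop over num_iterations and no gradient anywhere in it.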

CourseraFan · April 30, 2022, 4:58am

Oh, so basically it predicts using the patterns it learned?

akkefa · April 30, 2022, 4:59am

Let me explain using some pseudocode.

training_data = dataloader(train_file_path)
test_data = dataloader(test_file_path)
model = loading_model()

for single_epoch in range(total_number_of_epochs):

    for single_batch in batches(training_data):
        # feed single_batch to model(), then:
        #   forward prop
        #   compute cost
        #   backward prop
        #   update parameters
        check_training_accuracy()
        check_test_accuracy()   # forward prop only, no parameter updates
        model.save()

On every batch or epoch, we test the model on the test dataset. That check only runs the forward pass; it never updates the parameters.
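
A runnable version of the loop above could look roughly like this. It is only a sketch under assumptions: the model is a single-layer logistic regression, the data is randomly generated, and the names (`forward`, `accuracy`, `history`) and hyperparameter values are hypothetical. The accuracy check here runs once per epoch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up, linearly separable data: the label is 1 when x0 + x1 > 0.
X = rng.normal(size=(2, 200))
Y = (X[0] + X[1] > 0).astype(float).reshape(1, -1)
X_train, Y_train = X[:, :160], Y[:, :160]
X_test, Y_test = X[:, 160:], Y[:, 160:]

def forward(X, W, b):
    return 1.0 / (1.0 + np.exp(-(W @ X + b)))

def accuracy(X, Y, W, b):
    # Forward pass only: no gradients, no parameter update.
    return float(np.mean((forward(X, W, b) > 0.5) == Y))

W, b = np.zeros((1, 2)), 0.0
learning_rate, num_epochs, batch_size = 0.5, 20, 32
history = []

for epoch in range(num_epochs):
    for start in range(0, X_train.shape[1], batch_size):
        Xb = X_train[:, start:start + batch_size]
        Yb = Y_train[:, start:start + batch_size]
        A = forward(Xb, W, b)                                   # forward prop
        cost = -np.mean(Yb * np.log(A + 1e-12)
                        + (1 - Yb) * np.log(1 - A + 1e-12))     # compute cost
        dZ = A - Yb                                             # backward prop
        dW = dZ @ Xb.T / Xb.shape[1]
        db = float(np.mean(dZ))
        W -= learning_rate * dW                                 # update parameters
        b -= learning_rate * db
    # Monitoring only: the test data never changes W or b.
    history.append((accuracy(X_train, Y_train, W, b),
                    accuracy(X_test, Y_test, W, b)))
```

Each entry of `history` holds the (training accuracy, test accuracy) pair for one epoch, which is what you would plot to monitor training.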

CourseraFan · April 30, 2022, 5:10am

Oh, but when the network is trying to predict the answer, does it follow the patterns it learned from the training set?

akkefa · April 30, 2022, 5:29am

Yes. However, the model architecture plays an important role. To learn from the training data in a way that generalizes, choosing the right deep learning architecture is very important. This helps the model perform better on data it was not trained on, e.g. the test data.

Christian_Simonis · April 30, 2022, 10:38am

That’s right. You can steer the training of your network with hyperparameters, e.g. to make sure the model complexity is appropriate for your problem (e.g. a regression task) and your data. Options here include L2 regularization or dropout.
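
As a rough illustration of those two options, here is a small NumPy sketch. It assumes the common convention of scaling the L2 penalty by lambd / (2 * m) and uses "inverted" dropout; the function names and example numbers are made up for this post.

```python
import numpy as np

def l2_regularized_cost(unregularized_cost, weights, lambd, m):
    """Add the L2 penalty (lambd / (2 * m)) * sum of squared weights to a cost."""
    l2_penalty = (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in weights)
    return unregularized_cost + l2_penalty

def dropout_forward(A, keep_prob, rng):
    """Inverted dropout (training only): zero units at random, rescale the rest."""
    mask = rng.random(A.shape) < keep_prob
    return A * mask / keep_prob, mask

# Made-up weights for a two-layer net; the squared entries sum to 1 + 4 + 9 = 14.
weights = [np.array([[1.0, 2.0]]), np.array([[3.0]])]
cost = l2_regularized_cost(0.5, weights, lambd=0.1, m=7)

# With keep_prob = 1.0 nothing is dropped, so the activations pass through unchanged.
rng = np.random.default_rng(0)
A = np.ones((3, 4))
A_dropped, mask = dropout_forward(A, keep_prob=1.0, rng=rng)
```

Both `lambd` and `keep_prob` are hyperparameters: larger `lambd` or smaller `keep_prob` means stronger regularization. Dropout is switched off at test time.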

More details will follow in the 2nd course of the Deep Learning Specialisation along your learning journey.

Best regards
Christian

CourseraFan · April 30, 2022, 3:36pm

Thanks! By the way, following up on my last question: does predicting with the learned patterns via forward propagation need to be repeated num_iterations times?

Christian_Simonis · April 30, 2022, 4:59pm

Hi there,

Here is my take:

You can do the prediction in a feed-forward style, without any iteration, if you are only interested in the result for your input with this trained net (evaluating the trained weights).
In your case, however, when you want to evaluate the performance across epochs, it makes sense to do this iteratively along the training to monitor how the training is doing. I think @akkefa’s post is quite nice on this note.

Please let me know if this answers your question.

CourseraFan · April 30, 2022, 5:10pm

Oh, so in the programming assignment that I’ve done, which of these two options is implemented for the prediction? And one more thing: will this kind of topic be discussed in upcoming lectures? Thanks!

Christian_Simonis · April 30, 2022, 5:39pm

Yeah, evaluating is done based on the prediction. The good thing is: you have a label as ground truth, which helps you evaluate how good the prediction actually is and, during training, adapt your weights based on this.

Of course! In the specialisation you find many more applications and use cases.
Feel free to take a look here if you want to dig deeper.

CourseraFan · May 1, 2022, 2:11am

Thank you for the answer!

Daniel_Dick · May 6, 2022, 2:59am

I won’t go into the lion’s share of the work here: the exploratory analysis, talking to stakeholders to get clear on which goals will make or break the project, identifying and correcting missing and bad data, examining correlations, wrangling the data, and deciding on a model. Leaving that aside, the training and inference might be oversimplified by thinking of it like this:

1–load the data
2–fix the data
3–wrangle the data into the right numeric form
4–normalize it, do feature engineering, PCA, check correlations, etc., as desired.
5–set up your model.
6–split your data into training, validation, and test groups.
7–“fit” the training data to your model. Check the performance during training.
8–“predict” testing your validation set against the model. (This lets you see how your model performs with data it hasn’t seen before.)
9–perhaps adjust your hyperparameters and start over at 7. Keep doing this until your model’s performance in validation is acceptable.
10–“predict” the “test” data and check what the final performance of your model is to make sure you did not overfit the validation set.
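
Steps 6 to 10 can be sketched in a few lines. This is only an illustration under assumptions: the data is synthetic, the model is a closed-form ridge regression rather than a neural net, and `fit`, `mse`, and the candidate `lambd` values are hypothetical names and numbers chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for steps 1-4: made-up numeric data, y = 3*x0 - 2*x1 + noise.
X = rng.normal(size=(300, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=300)

# Step 6: 60% training, 20% validation, 20% test (rows are already in random order).
X_train, X_val, X_test = X[:180], X[180:240], X[240:]
y_train, y_val, y_test = y[:180], y[180:240], y[240:]

def fit(X, y, lambd):
    """Step 7: ridge regression solved in closed form."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lambd * np.eye(n_features), X.T @ y)

def mse(X, y, w):
    """Steps 8 and 10: 'predict' is just X @ w; nothing is refit here."""
    return float(np.mean((X @ w - y) ** 2))

# Step 9: try a few hyperparameter values and keep the best one on validation.
candidates = [0.0, 1.0, 100.0]
best_lambd = min(candidates,
                 key=lambda lam: mse(X_val, y_val, fit(X_train, y_train, lam)))
w = fit(X_train, y_train, best_lambd)

# Step 10: touch the test set once, at the very end.
test_mse = mse(X_test, y_test, w)
```

The point of the structure is that the validation set drives the hyperparameter choice, while the test set is only used for the final number.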

Your preferred model will not always be a feed-forward neural net. You may find most to be simple linear or logistic regressions, decision trees, or random forests. If you’re going the deep learning route, you may be in for a long wait, which can be reduced by finding a network that has already “learned” and using transfer learning: freezing the first layers and learning only the new last layers. You may need an RNN of some type, such as an LSTM or GRU. But in almost any of these cases, you’ll still end up “learning” from your training data (perhaps regularizing with L1, L2, or dropout).

Now if you have all your data available for learning and don’t need to process it one record at a time as it comes in, you may be able to use batch or mini-batch gradient descent. Most of the time you might prefer mini-batch, perhaps making sure each batch fits within a GPU’s memory just to make things faster. Otherwise, if you have to learn online, you may use one-record-at-a-time learning, i.e. “stochastic” (also called “online”) gradient descent.
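
The three variants differ only in how many examples go into each parameter update. A minimal sketch, assuming examples are stored as columns and using a hypothetical helper `iterate_minibatches` on made-up data:

```python
import numpy as np

def iterate_minibatches(X, Y, batch_size, rng):
    """Yield shuffled (X_batch, Y_batch) pairs; examples are stored as columns."""
    m = X.shape[1]
    order = rng.permutation(m)
    for start in range(0, m, batch_size):
        idx = order[start:start + batch_size]
        yield X[:, idx], Y[:, idx]

rng = np.random.default_rng(0)
X = np.arange(20, dtype=float).reshape(2, 10)  # 2 features, 10 examples
Y = np.arange(10).reshape(1, 10)

batch_sizes = {
    "batch": 10,       # the whole training set per update
    "mini-batch": 4,   # a few examples per update
    "stochastic": 1,   # one example per update
}

# Number of parameter updates one pass over the data (one epoch) would trigger.
updates_per_epoch = {
    name: sum(1 for _ in iterate_minibatches(X, Y, size, rng))
    for name, size in batch_sizes.items()
}
# updates_per_epoch == {"batch": 1, "mini-batch": 3, "stochastic": 10}
```

With 10 examples and a batch size of 4, the last mini-batch holds only the 2 leftover examples.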

Perhaps others more experienced will be able to correct me where I have missed something or gotten something wrong, but I believe that is sort of the general idea and I hope it can be helpful or that I’ll be forgiven if I am wrong. I will be thankful if some here could help make sure I have not led anyone astray by mistake.

OmerA · May 9, 2022, 6:29am

I’ve never explained it this way, but I’ll try to keep it as simple as possible:

The forward prop is the “prediction” phase: you feed data into the network and the prediction is what comes out at the far end.
When you want to train your network, you need the rest of the phases.
You use the output you got, together with the ground truth (which you only need for training and evaluation, not for the prediction itself), to calculate how far off your network is with whatever loss function you choose.
Then you backprop, which computes the gradients used to correct the weights so that they hopefully fit better for the next batch (or iteration, if you prefer this term).
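
Written as equations, one training iteration might look like the following. This is a generic gradient descent sketch, not the exact notation of any particular assignment: f stands for the network, \mathcal{L} for whichever loss you choose, and \alpha for the learning rate.

```latex
\begin{align*}
\hat{y}^{(i)} &= f\big(x^{(i)};\, W, b\big) && \text{forward prop (this alone is prediction)} \\
J(W, b) &= \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\big(\hat{y}^{(i)}, y^{(i)}\big) && \text{cost, needs the ground truth } y^{(i)} \\
W &:= W - \alpha \, \frac{\partial J}{\partial W}, \quad b := b - \alpha \, \frac{\partial J}{\partial b} && \text{update, using gradients from backprop}
\end{align*}
```

Only the first line is needed to make a prediction with a trained network; the second and third lines belong to training.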

Rashmi · May 9, 2022, 12:27pm

Yes Omer,

Welcome to the community.

You are right: backprop takes the loss you got from the forward pass and uses its gradients to correct the weights. So, in a way, the predictions you get in the end should match the labels you started with.

And here’s a good thread to read shared by Christian Simonis.
