It turns out that for real Neural Networks (as opposed to Logistic Regression), there is no such thing as a convex loss function: the loss surfaces will always have local minima and saddle points. But it turns out you can show that this is not much of a problem in most situations we actually encounter. Here’s a thread which discusses that point in more detail and points to a paper from Yann LeCun’s group showing that reasonable solutions are found in most cases.
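One way to see the non-convexity directly is hidden-unit permutation symmetry: swapping two hidden units gives a different point in weight space with exactly the same loss, yet the average of those two points is a different function. For a convex loss, the midpoint could never be worse than the endpoints. Here is a minimal sketch in numpy; the tiny 2-2-1 network, the toy dataset, and all names are illustrative assumptions, not anything from the thread or paper mentioned above.

```python
import numpy as np

# Toy 2-2-1 network. Permuting the hidden units gives a second weight
# setting with *identical* loss; the midpoint of the two settings is a
# different function, so the loss surface cannot be convex.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))                 # 8 samples, 2 features (made up)
y = (X[:, 0] * X[:, 1] > 0).astype(float)   # a non-linear toy target

def loss(W1, b1, w2, b2):
    h = np.tanh(X @ W1 + b1)                      # hidden layer
    p = 1 / (1 + np.exp(-(h @ w2 + b2)))          # sigmoid output
    eps = 1e-9                                    # numerical safety
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

W1 = rng.normal(size=(2, 2)); b1 = rng.normal(size=2)
w2 = rng.normal(size=2);      b2 = 0.0

# Swap the two hidden units: a different point in weight space, same function.
P = np.array([[0, 1], [1, 0]])
W1p, b1p, w2p = W1 @ P, b1 @ P, P @ w2

la, lb = loss(W1, b1, w2, b2), loss(W1p, b1p, w2p, b2)
lmid = loss((W1 + W1p) / 2, (b1 + b1p) / 2, (w2 + w2p) / 2, b2)
print(la, lb, lmid)   # la and lb are equal; lmid generally differs
```

Two distinct minima with equal loss (and a midpoint that behaves differently) is exactly the situation a convex function cannot have.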
Ken has given the answer here for how to get the same training results, but it might also be worth popping up to a higher level and asking the question “why are you retraining your network?” It’s important to be clear that there are two completely different scenarios: 1) training the network and 2) using the trained network to make predictions. So if you come back a month from now, why retrain at all? Normally you would just use the network with its existing trained parameters to make predictions. If the training data is still the same, why bother training again? There are several potential reasons:
- The original trained network does not perform well enough when you use it to make predictions on “real world” data.
- You have acquired additional training data in the meantime that you think will improve the performance.
In both of those cases, you want and expect the training to give you different (better!) results. That’s the point, right? You may even need to tweak your hyperparameters (add neurons and/or add layers) in addition to getting more training data if you find that the network is not performing well enough. But if the network already works well enough, then just use it and don’t bother retraining it.
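The “just use it” path is worth spelling out: save the trained parameters once, then reload them whenever you need predictions. A minimal sketch with plain numpy, assuming a toy logistic model; the file name, the parameter values, and the `predict` helper are all made up for illustration (real frameworks have their own save/load mechanisms, e.g. model checkpoints):

```python
import numpy as np

def predict(W, b, X):
    # A minimal logistic "model": sigmoid(X @ W + b)
    return 1 / (1 + np.exp(-(X @ W + b)))

# Pretend these parameters came out of an earlier training run.
W_trained = np.array([0.5, -1.2])
b_trained = 0.3
np.savez("trained_params.npz", W=W_trained, b=b_trained)

# A month later: load the saved parameters and predict. No retraining.
params = np.load("trained_params.npz")
X_new = np.array([[1.0, 2.0], [0.0, -1.0]])
preds = predict(params["W"], float(params["b"]), X_new)
print(preds)
```

The point is simply that prediction only needs the saved parameters; the expensive training step runs again only when you have one of the reasons listed above.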