Hi, Gabe.
In addition to Abdelrhaman’s excellent explanations, here are a few more thoughts and some links that I hope will add to the topic:
For the question of when you use the Keras Sequential API and when you use the Functional API, the key point is that the Functional API is more general. The Sequential API can only handle the case in which the “layers” you are adding are connected in the simplest way possible: each layer takes one input, which is the output of the previous layer. If you have a simple setup like that to build, then you can use either API. But as soon as the case gets more complicated, e.g. one of the layers takes two inputs or the compute graph is not “simply connected”, the Sequential API is no longer sufficient and the Functional API is the only option. You’ll see examples very soon in the Residual Net Assignment in Week 2, where you have no choice but to use the Functional API.
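To make that difference concrete, here is a minimal sketch of my own (not taken from the course notebook). The layer sizes and the input shape of 8 features are arbitrary assumptions chosen just for illustration. The first model is a plain chain, so either API would work; the second adds a skip connection, so one layer takes two inputs and only the Functional API can express it.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sequential API: a plain stack, each layer feeds the next one.
seq_model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),          # 8 input features (arbitrary choice)
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Functional API: layers are called on tensors, so we control the wiring.
inputs = tf.keras.Input(shape=(8,))
x = layers.Dense(8, activation="relu")(inputs)
x = layers.Dense(8)(x)
# Skip connection: Add takes TWO inputs, which a Sequential stack cannot express.
x = layers.Add()([x, inputs])
x = layers.Activation("relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
func_model = tf.keras.Model(inputs=inputs, outputs=outputs)
```

Calling `func_model.summary()` shows the `Add` layer connected to two earlier tensors, which is the same pattern the residual blocks rely on.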
Also note that they really don’t give us much background on the two APIs in the notebook, just a few sketches of examples. You can find it all in the TensorFlow documentation, but a better way to learn more would be to start with this explanatory thread from one of your fellow students, which does a great job of explaining in more detail what the two APIs are capable of and how to use them.
On the question of whether the costs are high or low, note that the actual J values don’t really tell you that much. We really only use those as an inexpensive proxy for whether our convergence is working or not. The real way to evaluate the performance of your network at any point in the training is the prediction accuracy: compare the output of the model with the labels on both the training data and the test or cross-validation data as appropriate. For a classification model, the percentage of correct predictions is the key evaluation metric. That’s all that really matters in the end, right?
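Here is a small sketch of what I mean by prediction accuracy, assuming a binary classifier whose outputs are probabilities. The `accuracy` helper, the 0.5 threshold and all of the numbers are made up for illustration; they are not from the assignment.

```python
import numpy as np

def accuracy(probs, labels, threshold=0.5):
    """Fraction of predictions that match the labels (binary case)."""
    preds = (np.asarray(probs) > threshold).astype(int)
    return float(np.mean(preds == np.asarray(labels)))

# Made-up model outputs and labels, for illustration only.
train_probs = np.array([0.9, 0.2, 0.8, 0.4])
train_labels = np.array([1, 0, 1, 0])
test_probs = np.array([0.7, 0.6, 0.1, 0.3])
test_labels = np.array([1, 0, 0, 1])

train_acc = accuracy(train_probs, train_labels)
test_acc = accuracy(test_probs, test_labels)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Looking at the two numbers side by side is the point: a large gap between training and test accuracy tells you something that the raw cost value does not.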
Regards,
Paul