Creating and randomizing training, dev, and test data sets

Hi, Seth.

I think it’s a great idea to try applying the ideas from Course 1 to solve a new problem that is of interest to you. As you say, it will definitely help you understand the code base we have and also the concepts and how things work. And it will also give you that much more understanding and appreciation of why the things Prof Ng will be showing us in Course 2 are relevant and useful.

In the longer term Prof Ng will introduce you to TensorFlow, which is a higher-level package that has “canned” routines for doing all the things that Prof Ng has shown us how to build ourselves directly in Python and NumPy here in Course 1. TF also has lots of additional functionality and is the way people normally build DNNs to solve real problems. But Prof Ng has a strong pedagogical reason for teaching us how to build a DNN directly in Python first: if you start by learning TF, then everything is just a “black box” to you. It is almost always the case that things don’t work very well the first time you try putting together a solution for a given problem. If you don’t have the kind of understanding that you get from seeing what’s really happening “under the covers”, then it’s hard to develop the intuitions for what to do when things don’t work the way you want. All that will be a major topic for Course 2.

Doing the kind of experimentation that you’re describing will definitely help give you the skills for that kind of problem solving. Even if it delays you a bit from proceeding with the rest of the courses, I bet you’ll find that it will be worth it. Having a better understanding of the Course 1 material will also give you a better vantage point for being successful in the rest of the courses. Give it a try and you’ll probably know pretty quickly whether you’re finding it useful or not.

Please let us know how it goes and if you come up with any cool solutions or new insights!
