When creating a post, please add:
- Week # must be added in the tags option of the post.
- Link to the classroom item you are referring to: Nothing specific
- Description (include relevant info but please do not post solution code or your entire notebook):
This is a general question for the community. Given any equation from the lectures, what is the general thought process for translating it into Python? For many people, programming in something like Python may be second nature, but is there a repeatable process you use as a framework, i.e., commenting in pseudocode first and then breaking the problem down into parts?
It’s a worthwhile question, but not a simple one to answer in the fully general case. The first and most fundamental step is that you need to understand what the math is saying: what are the operands, and what are the mathematical operations specified between them? Once you understand that, then and only then do you turn to the python side of the question: how can I express those mathematical operations in python syntax and/or numpy calls?

Of course, since we’re doing Machine Learning here, the operands will frequently be vectors and matrices, and you may have several choices about how to express the mathematical operations in python. You can start with just plain python and for loops. Once you have a correct implementation using loops, the next question is whether you could use numpy calls that are “vectorized” to express the same mathematical operations in a more efficient way.
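As a small sketch of that loop-first, then-vectorize progression (the function names here are my own for illustration, not from the course), consider translating the dot product $\sum_j w_j x_j$ both ways:

```python
import numpy as np

def dot_loop(w, x):
    """Direct translation of sum over j of w[j] * x[j], using a for loop."""
    total = 0.0
    for j in range(len(w)):
        total += w[j] * x[j]
    return total

def dot_vectorized(w, x):
    """The same mathematical operation expressed as a single numpy call."""
    return np.dot(w, x)

w = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0])
print(dot_loop(w, x))        # 32.0
print(dot_vectorized(w, x))  # 32.0
```

The loop version mirrors the summation notation almost symbol for symbol, which makes it easy to check for correctness; the vectorized version is what you keep once the loop version works.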
That’s probably the best I can do without a concrete example. There are just too many possibilities.
Thank you for the thoughtful response. It seems to take practice, an incremental broadening of understanding, and lots of trial and error. For example, writing J = J + 1 looks bizarre mathematically, but it’s routine in code. I also found that superscripts and subscripts can mean very different things when enclosed in brackets, and that letter case means something very specific, distinguishing a full array from a single instance.
Yes, it’s a great point that python is not math. The “equal” sign means something completely different when used as the “assignment” operator, which is what makes
j = j + 1
look so strange if you’ve only ever seen math and not programming before. Of course, there is also the Boolean equality operator in python, which is “==” and means something closer to the mathematical meaning of =.
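A minimal sketch of the distinction between the two operators:

```python
j = 0         # assignment: bind the name j to the value 0
j = j + 1     # assignment again: evaluate the right side first, then rebind j
print(j)      # 1

print(j == 1) # True: "==" tests equality, closer to the mathematical "="
print(j == 2) # False
```

Read j = j + 1 as “the new value of j is the old value of j, plus one”, not as an equation to be solved.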
But that kind of thing is pretty quick to “grok”. You’re right about subscripts and superscripts and all that. The other important thing to realize is that these “notational conventions” and other conventions like whether vectors are column vectors or row vectors may be specific to different authors and presenters. Professor Ng uses a very specific set of conventions here and you will soon understand those, but some of them are not universal even in the ML field.
The main case I can think of there is this: given an input data set with a number of “samples”, suppose that each input sample is a vector with n elements and there are m samples. If we call one sample x, it will be a column vector with n elements, so its dimensions are n x 1. But if we “stack” the samples together into a matrix X, they can go in either as columns (so that X is n x m) or as rows (so that X is m x n). That’s a case in which the notation is not consistent even within the courses here. In MLS, I think he uses the m x n orientation, whereas in DLS he uses the n x m orientation, at least until we get to DLS C4 and start using TensorFlow.
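To make the two stacking conventions concrete (the sizes here are made up for illustration): with m = 4 samples of n = 3 features each, the two orientations are just transposes of one another:

```python
import numpy as np

n, m = 3, 4  # n features per sample, m samples
samples = [np.random.randn(n) for _ in range(m)]  # m sample vectors

# "Samples as rows" orientation (as in MLS): X is m x n
X_rows = np.stack(samples, axis=0)
print(X_rows.shape)  # (4, 3)

# "Samples as columns" orientation (as in DLS): X is n x m
X_cols = np.stack(samples, axis=1)
print(X_cols.shape)  # (3, 4)

# The two layouts hold the same data, transposed
print(np.array_equal(X_rows.T, X_cols))  # True
```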
So attention is always required. It can be useful just to print the shapes of the input objects if you’re ever in any doubt about the conventions being used in a given instance:
print(f"X.shape = {X.shape}")