Course 2 - Week 3 assignment - Exercise 3: one_hot_matrix

For week 3, exercise 3, regarding the function

def one_hot_matrix(label, depth=6), here’s what I have:

{moderator edit - solution code removed}

I have spent an incredible amount of time on this and have looked at the TensorFlow documentation online, but it’s not entirely clear. I know my shape = tf.shape(one_hot) is not right; I tried to print out that shape to get the dimensions, but it didn’t work. The TensorFlow online documentation is not comprehensive. I got

AssertionError: Wrong output. Use tf.reshape as instructed

What is the point of reshaping it to the shape that it already has? That’s what the code you show does, right?

What you need is to make sure that the return value is a “rank 1” tensor (what you’d call a 1D array in numpy) with dimension [depth,]. Try using that latter expression as your target shape on the reshape.
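To see the shape behavior in isolation, here is a minimal sketch with toy values (this is not the graded solution, just an illustration of what tf.one_hot and tf.reshape do):

```python
import tensorflow as tf

depth = 4
label = tf.constant(1)

# tf.one_hot on a scalar label already yields a rank-1 tensor of shape (depth,)
one_hot = tf.one_hot(label, depth, axis=0)

# Reshaping to [depth,] leaves the values alone but guarantees the rank
reshaped = tf.reshape(one_hot, shape=[depth, ])
print(reshaped.shape, reshaped.numpy())  # (4,) [0. 1. 0. 0.]
```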

The other caveat here is that the unit test for this function is not strong enough: it will allow you to do things that will then fail in later cells of this notebook. A bug has been filed about that, but no fix is available yet.

If you look at the assertion that is actually failing, it uses numpy “allclose” to compare your result to the expected output. The problem is that allclose uses broadcasting, so your shapes don’t really have to be correct. They are comparing to a 1D array and all that’s necessary is that your array be “broadcastable” to that shape. So I’m worried that maybe your values are actually wrong as well, although I don’t see how that is possible from the code you show.
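Here’s a small illustration of the broadcasting issue with toy shapes (assuming the grader compares with np.allclose as described):

```python
import numpy as np

expected = np.array([0., 1., 0., 0.])   # rank 1, shape (4,)
result = np.array([[0., 1., 0., 0.]])   # rank 2, shape (1, 4)

# allclose broadcasts (1, 4) against (4,), so the comparison passes
# even though the rank of `result` is wrong
print(np.allclose(expected, result))  # True
```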

If the suggestion in my first reply doesn’t get you there, it would be worth actually showing the full output you get when you run the test cell for one_hot_matrix.

Thanks Paul! I got past it a few minutes after receiving your response! It turns out that [-1,] works for making it a 1-D vector. I’m on Exercise 4 now.
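For reference, a tiny sketch of what the -1 does (toy tensor, not assignment code): it asks tf.reshape to infer that dimension from the total element count.

```python
import tensorflow as tf

t = tf.zeros([2, 3])
flat = tf.reshape(t, [-1, ])  # -1 asks reshape to infer the size: 2 * 3 = 6
print(flat.shape)  # (6,)
```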

I’m on Exercise 6, in def compute_cost(logits, labels), here’s what I have:

loss = tf.keras.losses.categorical_crossentropy(labels, logits)
cost = tf.reduce_mean(loss)

Am I close? I notice that the cost function for softmax is different from the one for binary classification problems, but I assume that the softmax cost function is taken care of by the call

loss = tf.keras.losses.categorical_crossentropy(labels, logits)

AssertionError: Test does not match. Did you get the mean of your cost functions?

The problem is that you are passing in the “logits”, meaning that you have not done the softmax activation function on the outputs yet. So you need to tell the loss function that the inputs are logits, not activations, so that it can do the softmax for you. The way you do that is by using the from_logits argument. Prof Ng always recommends doing it that way. Here’s the docpage for the loss function. It works the same way when you are doing binary classifications and using sigmoid + binary cross entropy loss.
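A minimal sketch of that pattern with made-up logits and labels (rows as samples here just for illustration; the assignment’s orientation may differ, per the transpose note below):

```python
import tensorflow as tf

# Made-up data: 2 samples (rows), 3 classes (columns)
labels = tf.constant([[0., 1., 0.],
                      [0., 0., 1.]])
logits = tf.constant([[1.0, 2.0, 0.5],
                      [0.3, 0.1, 2.2]])

# from_logits=True tells the loss function to apply softmax internally
loss = tf.keras.losses.categorical_crossentropy(labels, logits, from_logits=True)
cost = tf.reduce_mean(loss)
print(cost.numpy())
```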

Also I hope you remembered to transpose both the labels and the logits. You don’t show that code.

Thanks Paul!

I don’t feel that we had enough coverage of TensorFlow. I would find it much easier to do this in Octave or Matlab. For example, I wanted to do a simple thing such as checking a matrix’s dimensions to decide whether I should transpose it before passing it in, but I didn’t see a straightforward way to do that with TensorFlow, nor with NumPy. I think there should be a video to cover these fundamental things.
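(For what it’s worth, both NumPy arrays and TF tensors do expose a .shape attribute that can drive that decision. A sketch, using a hypothetical helper name:)

```python
import tensorflow as tf

def ensure_first_dim(t, depth):
    # Hypothetical helper: transpose t if its first dimension is not `depth`.
    # Both NumPy arrays and TF tensors expose .shape for this kind of check.
    if t.shape[0] != depth:
        t = tf.transpose(t)
    return t

m = tf.zeros([5, 3])
print(ensure_first_dim(m, 3).shape)  # (3, 5)
```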

This is what I have:

predicted_probabilities = tf.keras.activations.softmax(logits, axis=-1)

loss = tf.keras.losses.categorical_crossentropy(tf.transpose(labels), tf.transpose(predicted_probabilities))

cost = tf.reduce_mean(loss)

From your previous post, it seems that you suggested I use this function to compute the loss:

tf.keras.losses.CategoricalCrossentropy(from_logits=False, label_smoothing=0.0, axis=-1, reduction=losses_utils.ReductionV2.AUTO, name='categorical_crossentropy')

but I’m not sure how to pass logits to this function …

You can do the manual application of softmax, although I’m not sure your implementation is correct. There again, you need to understand what that function expects in terms of the order of the dimensions of its input.

But you are doing things the hard way. The point I was hoping you would pick up from the documentation is that saying from_logits = True is how you tell the cost function that the inputs are logits and that it needs to apply softmax for you. That’s way easier than doing it the way you have done. You also don’t need all those extra arguments about label smoothing, axis, reduction and name.
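A toy comparison of the two routes (made-up values; both should produce essentially the same loss when the shapes are right):

```python
import tensorflow as tf

labels = tf.constant([[0., 1., 0.]])    # made-up one-hot label
logits = tf.constant([[1.0, 2.0, 0.5]])

# The manual route: apply softmax yourself, then pass probabilities
probs = tf.keras.activations.softmax(logits, axis=-1)
loss_manual = tf.keras.losses.categorical_crossentropy(labels, probs)

# The recommended route: pass raw logits, let the loss apply softmax
loss_logits = tf.keras.losses.categorical_crossentropy(labels, logits, from_logits=True)

print(loss_manual.numpy(), loss_logits.numpy())  # nearly identical values
```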

Thanks! I got All Tests Passed just as you were sending this message, but I’m looking at your advice closely for the next course. Will courses 3, 4, and 5 involve TensorFlow heavily?

There is no programming in Course 3, but C4 and C5 use TensorFlow quite heavily.

I tried your way, setting from_logits = True and omitting the other parameters, but it didn’t work. It’s easy for you because you are familiar enough with TensorFlow! My way is not really the hard way; one or two extra lines of code are not a big deal, and they help clarify things and make the code easier to follow and remember.

Thanks for everything!

You must have made some other mistake in the from_logits = True case. It works fine for me. Just keep that in mind: that is the way we will always do things going forward. Having the loss function apply the activation also is more efficient and more numerically stable.

OK, I will try this separately after installing TensorFlow. Thanks.

Hello, Paul,

I don’t know how to initiate a question (I’m not lazy, I just can’t find out how), so I’m using this thread, which relates to my question.
Since I’m not allowed to show my code, I’m including just an error screenshot:

From what I understand, the label is the constant 1, which, with a one-hot encoding over 4 categories (i.e., depth = 4), should produce the float vector [0, 1, 0, 0], yet the test calls that the wrong output. (And, of course, I am using tf.one_hot.)
This seems straightforward enough: the call given in the instructions, tf.one_hot(labels, depth, axis=0) (from the tf.one_hot page in the TensorFlow Core v2.9.1 docs), is exactly the code that I understand is required (except that I replace labels with label). I’ve perused the documentation and find nothing else that seems to be required or relevant.
Of course, before bothering you with this issue, I experimented with various constants and depths, and that only seemed to confirm my impression that the code should be working correctly. It’s also possible that it’s a type problem, but I don’t see particularly why.
Any suggestions?
BTW, I went ahead with as much of the rest of the program to see what else I could do, and it looks like I’ll have more questions.
I’m addressing you since you were in this thread, and we’ve corresponded before, but I’ll take whatever advice I can get from whomever cares to offer it.

Thanks in advance.

Aaaand guess what. I just fixed it. Apparently, I was overspecifying the dimensions by requiring a column matrix in the reshape function. Just passing [depth] instead of [depth,1] worked for both tests. I don’t know how I should have known this, except for my meager familiarity with underspecified shape parameters that allow for broadcasting.
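A toy demonstration of why the (depth, 1) column shape fails an allclose-style comparison while (depth,) passes:

```python
import numpy as np

vec = np.array([0., 1., 0., 0.])  # shape (4,): what the test expects
col = vec.reshape(4, 1)           # shape (4, 1): an over-specified column

# Broadcasting turns the comparison into a (4, 4) grid of pairwise checks,
# so the same values in a column shape no longer "match" the row vector
print(np.allclose(vec, vec))  # True
print(np.allclose(vec, col))  # False
```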
But again, don’t be surprised if I have to pipe up again.
Thanks.

Hi, Marshall.

I’m glad to hear that you were able to find the solution under your own power.

It’s fine to tag onto an existing thread on a relevant topic. If you want to create a new thread, first select the appropriate category and subcategory and then you should see a “New Topic” button in the upper right corner as highlighted in the rectangle in this screenshot:

The category and subcategory are highlighted in the oval in the upper left. Note that Discourse has some “safety” rules such that they won’t let new users create threads until they’ve established a level of trust by behaving responsibly. So if you don’t see the “New Topic” button, it might be because you haven’t done enough activity on the forum yet. But you’ve made a number of posts by this point, so I’d be a little surprised if that were the issue. Have a look and see if you can see the “New Topic” link as shown above.

Regards,
Paul

Hi, Paul,
Thanks for the instructions on how to start a new topic. Also, I benefitted from your advice to Khiem_Viet_Ngo in this thread and was able to finish Course 2 with no further ado. However, I do have one suggestion I’d like to offer: perhaps it should be explicitly noted to students that this assignment requires TensorFlow 2.3.0. I have 2.8.0 installed in the virtual environment that I use for courses and projects. As a result, I was getting different results on Assignment #5, which, in retrospect, should not have been that surprising, given that everything up to that point worked fine. Of course, in hindsight, different TensorFlow versions would lead to anomalous behavior despite correct seeding of the random generator; the code, after all, has changed. When I took my work back to the browser, I got the expected results, which was quite a relief, because I didn’t have the slightest clue as to what was wrong.

Hi, Marshall.

Thanks for your thoughts on this. Yes, it is definitely the case that everything mutates very rapidly in this space, and in Python packages in general, and that can cause “versionitis” problems. By contrast, it is in the nature of online courses like this that they are published at a particular point in time and get “major” upgrades that would deal with things like changing package versions typically only every couple of years at most. The last such upgrade for DLS was in April of 2021, when they rewrote things using TF2 instead of TF1. I can try suggesting that they add something about this, but it sort of opens a can of worms: how far do they have to go in addressing what it takes to get things to actually work in every possible student’s environment? It would not surprise me if they elected just to drop the subject. In fact, I don’t remember that they ever say anything about running the notebooks in a different environment, do they? Please correct me if I just missed it …

There have been a number of threads on Discourse around the question of how to run things in your own environment. E.g., this one or this one or this one.