C1W1_Assignment.ipynb : price_in_hundreds_of_thousands

Can you point to the information in the notebook that gives direction for specifying the values in price_in_hundreds_of_thousands? It is unclear what the purpose of this guessing game is. Thanks.

The assignment comes with a hint at the beginning:

**Imagine that house pricing is as easy as:

A house has a base cost of 50k, and every additional bedroom adds a cost of 50k. This will make a 1 bedroom house cost 100k, a 2 bedroom house cost 150k etc.

How would you create a neural network that learns this relationship so that it would predict a 7 bedroom house as costing close to 400k etc.

Hint: Your network might work better if you scale the house price down. You don’t have to give the answer 400…it might be better to create something that predicts the number 4, and then your answer is in the ‘hundreds of thousands’ etc**

So price_in_hundreds_of_thousands refers to this hint: scale the house price down to units of hundreds of thousands. Going from one bedroom to two takes the price from 100k to 150k, which becomes 1 to 1.5 after scaling. That keeps the feature (number of bedrooms) and the target (price) on a similar numeric scale, which helps the neural network train.
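To make the scaling concrete, here is a minimal sketch of how the training vectors could be built from the hint. The helper name make_training_data and the choice of six bedrooms are my own assumptions for illustration, not something taken from the notebook.

```python
import numpy as np

# Hypothetical helper (not from the notebook): builds the two training
# vectors described in the hint. The default of 6 bedrooms is an assumption.
def make_training_data(max_bedrooms=6):
    n_bedrooms = np.arange(1, max_bedrooms + 1, dtype=float)
    # Base cost of 50k plus 50k per bedroom, in units of hundreds of thousands.
    price_in_hundreds_of_thousands = 0.5 + 0.5 * n_bedrooms
    return n_bedrooms, price_in_hundreds_of_thousands

n_bedrooms, price_in_hundreds_of_thousands = make_training_data()
print(n_bedrooms)                       # bedrooms 1 through 6
print(price_in_hundreds_of_thousands)   # 1.0, 1.5, ..., 3.5
```

With this scaling, a 7 bedroom house corresponds to a target of 0.5 + 0.5 * 7 = 4.0, i.e. 400k.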

Regards
DP

In the future, you might give the forum search a try. Here’s an example…

https://community.deeplearning.ai/search?q=Price%20scaling

One of the threads returned by that is…

which seems relevant to your question. There are others, for example…

Hope it helps

EDIT: after reviewing the updated exercise notebook on GitHub, I see that the narrative markup is much more explicit about the units for housing price (hundreds of thousands) than it once was. In the old days, the problem was almost always that models were predicting near 400 instead of near 4. That is what the threads I linked above discuss. My understanding now is that this is not the issue raised by the OP in this thread. Rather, the issue seems to be an implicit requirement that the number-of-rooms training data vector is ordered and starts with 1.

There are a couple of ways one can implement this, including the order of the list.

Instead of using “Imagine”, you could plainly state that as the actual problem being solved, and that the number of rooms should be in increasing order.

I haven’t run that exercise’s code for quite a while, but am pretty confident the order of the room count training data doesn’t matter at all - as long as the corresponding house price vector matches. You could easily test this yourself by rearranging both room count and price, training a model, and then doing a prediction.
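A quick way to try that rearranging experiment is sketched below. Note that I am swapping in a least-squares line fit (np.polyfit) as a stand-in for the notebook's neural network, purely so the check runs in a second without TensorFlow; the six data points and the seed are assumptions as well.

```python
import numpy as np

# Assumed training data: 1 to 6 bedrooms, prices in hundreds of thousands.
n_bedrooms = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
prices = 0.5 + 0.5 * n_bedrooms

def fit_and_predict(x, y, query):
    # Stand-in for model training: fit a straight line by least squares.
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope * query + intercept

# Shuffle both vectors with the SAME permutation so the pairs stay aligned.
rng = np.random.default_rng(0)
perm = rng.permutation(len(n_bedrooms))

pred_ordered = fit_and_predict(n_bedrooms, prices, 7.0)
pred_shuffled = fit_and_predict(n_bedrooms[perm], prices[perm], 7.0)
print(pred_ordered, pred_shuffled)  # both should be very close to 4
```

A network trained with random initialization will not land on exactly the same number every run, but that variation comes from the training randomness, not from the order of the rows.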

Maybe this can help:

Your unit test verifies the data with np.allclose(n_bedrooms, features) and no tolerance specified, so only NumPy’s small default tolerances apply. Because the comparison is made element by element, a list that does not match the expected values in the expected order cannot pass.
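For anyone who wants to see that behaviour in isolation, here is a small sketch. The arrays are made-up stand-ins, since unittests.py is not public and I don't know the actual expected values.

```python
import numpy as np

# Made-up stand-ins for the expected features and a learner's input.
expected = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
same_order = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
reordered = np.array([6.0, 5.0, 4.0, 3.0, 2.0, 1.0])

# np.allclose compares position by position, so order matters.
print(np.allclose(same_order, expected))          # True
print(np.allclose(reordered, expected))           # False
# Same values, but they only pass once sorted into the expected order.
print(np.allclose(np.sort(reordered), expected))  # True
```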

Thanks for sharing that. For the record, that isn’t my unit test code - I didn’t write it, and it is unlikely that anyone who did reads the forum on any regular basis. Also for the record, if that is an assumption made by the test code, in my opinion it should be documented because, to the best of my understanding, one should not expect a model to depend on the order of its training inputs. It is not at all clear to me why that would be included as a requirement in a learning exercise. I’ll see if I can find someone to explain to us both.

Agree with your assessment. It reflects my point about having to remove the element of guessing and having to clearly state the expectation.

As part of the course update, the staff converted a lot of markdown hints into unit tests. Here’s the public version of the notebook (unittests.py isn’t shared though).

After a couple behind the curtain exchanges, my takeaways are:

  • In general, training data record order doesn’t matter
  • In fact, sometimes shuffling training data is encouraged
  • This unit test does not reinforce those general learning objectives. Instead, it requires one specific order for the training data and throws an exception if it doesn’t find it.

A more general solution would have been to enforce that the room number and price vectors have the same length and capture the proposed linear relationship, while accepting them in any order (as long as both have the same order, i.e., treating room number and price as paired). In the absence of that, it seems polite to alert users to this requirement.
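A rough sketch of what such an order-insensitive check might look like is below. The function name check_training_data is hypothetical and this is not the course's actual test code; it only assumes the 50k base plus 50k per bedroom rule from the exercise text.

```python
import numpy as np

# Hypothetical check (not the real unittests.py): accepts the pairs in any
# order as long as each price matches its own room count.
def check_training_data(n_bedrooms, prices):
    n_bedrooms = np.asarray(n_bedrooms, dtype=float)
    prices = np.asarray(prices, dtype=float)
    # The two vectors must pair up one to one.
    if n_bedrooms.shape != prices.shape:
        return False
    # Each price must follow the rule, in hundreds of thousands.
    return bool(np.allclose(prices, 0.5 + 0.5 * n_bedrooms))

print(check_training_data([1, 2, 3], [1.0, 1.5, 2.0]))  # True: ascending
print(check_training_data([3, 1, 2], [2.0, 1.0, 1.5]))  # True: shuffled, still paired
print(check_training_data([3, 1, 2], [1.0, 1.5, 2.0]))  # False: pairs broken
```

The design choice is to validate the relationship between each pair rather than compare against one fixed list, so shuffled but consistent data still passes.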

I don’t work for or represent DLAI in any capacity, official or unofficial, but hope this idea makes it onto the wish list for future enhancement. Cheers

Hi rocki. Thank you for the feedback. The instructions have been updated to clarify what to base the training data on. As for the unit test, the devs may have wanted to check for the values and expected that most learners would arrange them in ascending order. For now, we added a code comment asking learners to arrange the data that way. Hope this helps the next learners moving forward. Thanks again!

@chris.favila Thanks for the quick action.

@rocki I think the screen capture of the assertion failure was the key to moving this forward. Perhaps in the future, start there, with an explanation of what you observed and what, if any, actions on your part were an effective workaround. That will help community members and staff ensure the best overall learning experience. V/r, ai_curious

Appreciate you all actively bringing in a resolution, and the feedback on the nature of input that would help your team. :+1:
