Week 3 assignment issue

The instructions for the first Week 3 assignment in the CNN course have an issue.

The instructions say:
For this exercise, a box is defined using its two corners: upper left (x1, y1) and lower right (x2, y2), instead of using the midpoint, height and width.

But I think it should say lower left (x1, y1) and upper right (x2, y2).

Why do you think that? Note that in digital images, the coordinate system is not what you expect from normal analytic geometry: (0, 0) is the upper left corner of the image. As you increase h you move down, and as you increase w you move rightwards across the image. If we have an image that is 1280 x 1024, then the lower right corner is (1280, 1024), assuming portrait mode.
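To make the convention concrete, here is a minimal sketch in plain Python. It assumes points are written as (x, y), with x counted rightwards and y counted downwards; the tiny 4 x 3 grid and the variable names are made up for illustration.

```python
# Image convention: origin (0, 0) at the top-left, x grows rightwards, y grows downwards.
# A tiny "image" stored as a list of rows; each cell holds its own (x, y) coordinate.
height, width = 3, 4
image = [[(x, y) for x in range(width)] for y in range(height)]

top_left = image[0][0]                      # first row, first column
bottom_right = image[height - 1][width - 1]  # last row, last column

# Printing rows in storage order puts y = 0 at the top of the output,
# which is why a larger y means "further down".
for row in image:
    print(row)
```

Rows are stored and displayed top to bottom, so the corner with the smallest coordinates is the upper left one.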

Hi Paul,
Even if you look at the formula that I used, which passed the tests, (x1, y1) should be lower left and (x2, y2) should be upper right.

And even if you look at verification examples provided:
box1 = (2, 1, 4, 3)
box2 = (1, 2, 3, 4)

or
box1 = (1,1,2,2)
box2 = (2,2,3,3)

The numbers x1, y1, x2, y2 are lower left and upper right: x1 < x2 and y1 < y2.

I tried the formulas the other way around, following the wording of the hint, but it did not work.
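For reference, the two verification examples can be checked with a rough sketch like the one below. This is not the notebook's reference solution, just one common way to write intersection over union for boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2. The helper `flip_y` and the frame height of 10 are hypothetical, added only to show that the result does not depend on which way the y-axis is drawn.

```python
def iou(box1, box2):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box1
    x3, y3, x4, y4 = box2
    # Overlap rectangle: largest of the small corners, smallest of the large corners.
    xi1, yi1 = max(x1, x3), max(y1, y3)
    xi2, yi2 = min(x2, x4), min(y2, y4)
    # Clamp at zero so boxes that do not overlap get an empty intersection.
    inter = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    union = (x2 - x1) * (y2 - y1) + (x4 - x3) * (y4 - y3) - inter
    return inter / union


def flip_y(box, height):
    """Re-express a box in a frame whose y-axis points the other way (hypothetical helper)."""
    x1, y1, x2, y2 = box
    return (x1, height - y2, x2, height - y1)


a, b = (2, 1, 4, 3), (1, 2, 3, 4)
print(iou(a, b))                          # overlap area 1, union area 7
print(iou((1, 1, 2, 2), (2, 2, 3, 3)))    # boxes touch only at a corner
print(iou(flip_y(a, 10), flip_y(b, 10)))  # same value as the first line
```

The formula only relies on x1 < x2 and y1 < y2, so it gives the same answer whether (x1, y1) is called "upper left" or "lower left".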

Did you read what I said in my previous post about the way the coordinates work? Note that they are just talking about an individual box. So consider the box (2, 1, 4, 3). (2,1) is upper left w.r.t. (4,3) if you use the orientation I described in my earlier post. Here is what it looks like when you graph it:
[Image: IMG_5247]
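The same point can be sketched in code. The function `describe_corner` and its `y_down` flag are hypothetical names used only for this illustration; points are assumed to be (x, y) pairs.

```python
def describe_corner(p, q, y_down=True):
    """Describe where point p sits relative to point q.

    y_down=True is the image convention (y grows downwards);
    y_down=False is the usual analytic-geometry convention (y grows upwards).
    """
    px, py = p
    qx, qy = q
    horizontal = "left" if px < qx else "right"
    # With y pointing down, a smaller y is higher up; with y pointing up, it is lower.
    if y_down:
        vertical = "upper" if py < qy else "lower"
    else:
        vertical = "lower" if py < qy else "upper"
    return f"{vertical} {horizontal}"


print(describe_corner((2, 1), (4, 3), y_down=True))   # upper left
print(describe_corner((2, 1), (4, 3), y_down=False))  # lower left
```

The numbers are identical in both calls; only the direction in which the y-axis is drawn changes the label.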

Oh, sorry, in my picture I drew h upwards, not downwards. 🙂
This is the cause of the discrepancy.

That is because the Jupyter notebook text talks about x and y, and the y-axis is usually drawn upwards. Anyway, it is a minor thing.

As I explained, this is just the way the pixel coordinates work in image processing. I don't know why it is backwards from the normal analytic geometry conventions, but that's just the way it is and we have to deal with it. Presumably someone who wrote one of the first papers and/or software libraries for dealing with images did it that way and it stuck. I don't know the specific history there, since my specialty was operating systems, not image processing, but I'll bet the author(s) of that paper were probably at Bell Labs and worked in the same building as Kernighan, Ritchie and Thompson back in 1975.

OK 👌