C1W2_Assignment Gaussian Elimination; Bugs+Misleading things

[Referring to C1W2_Assignment Gaussian elimination]

Hi everyone,

I just finished the assignment. While it was certainly helpful for my programming skills and my understanding of the subject, I have some big critical points:

  1. The terms “reduced row echelon form” and “row echelon form” are not used correctly. The function titled reduced_row_echelon_form(A, B) does not return a reduced_row_echelon_form but only the row_echelon_form. The actual reduction happens in the back_substitution(M) function. This is majorly misleading for beginners to the topic
  2. The pivot definition at the beginning is super misleading: It says # Find the first non-zero entry in the current row (pivot).
    This is the definition of the pivot element, yes. But it is not, what is needed to be assigned to pivot here.

To make it clear: pivot = M[i, get_index_first_non_zero_value_from_row(M, i)] by itself is a correct definition of the pivot element, but just not what is needed here (judging from how pivot variable is used in the rest of the code). I think the term pivot is not the right one to use here. More properly would be something like current_element or similar.

  1. Most importantly though does the test for reduced_row_echelon_form (which should have a different name to begin with as explained above) have a bug. F.e. for the matrices:
    C = np.array([[1,2,3,4,5],[0,0,0,6,7], [0,0,8,9,10], [0,0,0,1,2], [0,0,0,0,3]])
    D = np.array([[1], [2], [3], [4], [5]])

reduced_row_echelon_form(C, D) does not yield the correct result for my implementation even though I passed all tests.

Actually, it is not only a bug in the test but rather an error in the whole exercise, which I would assume 99%+ of implementations have, since the template clearly nudges one to do this and the explanation also explains it in a wrong way.

The problem is this part:

“2. Given the failure of step 1, search within the row for the first non-zero number, which becomes the new pivot.”
Without the additional step of reordering the rows at the end this is not correct and situations like the one with matrices C, D which I showed above can occur. Correctly, it should be something like “Given the failure of step 1, repeat the procedure for the same row, but increment the currently searched column by 1.

I might be wrong about everything and will happily admit that if I am. Just wanted to let you know about this.

2 Likes

Here’s what I get for that input:

C = np.array([[1,2,3,4,5],[0,0,0,6,7], [0,0,8,9,10], [0,0,0,1,2], [0,0,0,0,3]])
D = np.array([[1], [2], [3], [4], [5]])
redCD = reduced_row_echelon_form(C,D)
print(f"redCD =\n{redCD}")
print(f"redCD solutions: {check_solution(redCD)}")

redCD =
[[ 1.          2.          3.          4.          5.          1.        ]
 [ 0.          0.          0.          1.          1.16666667  0.33333333]
 [ 0.          0.          1.          0.         -0.0625      0.        ]
 [ 0.          0.          0.          0.          1.          4.4       ]
 [-0.         -0.         -0.         -0.         -0.          1.        ]]
redCD solutions: No solution.

So you’re right that it would have been more aesthetically pleasing to have reordered the rows. But note that it is technically still in row echelon form meaning that it is an upper diagonal matrix.

I agree that they are a bit sloppy with the terminology here, but my recommendation is just to move on. Gaussian Elimination may be foundational for Linear Algebra, but I don’t think it’s really relevant for ML. At least I’ve never seen it come up in any of the actual ML courses here.

Thanks for the response.

The definition in the assignment says:

" As discussed in the lectures, a matrix in row echelon form adheres to the following conditions:

  • Rows consisting entirely of zeroes should be positioned at the bottom.
  • Each non-zero row must have its left-most non-zero coefficient (termed as a pivot) located to the right of any row above it. Consequently, all elements below the pivot within the same column should be 0."

Isn’t the second bullet point unsatisfied by redCD (your output to my input) ?

Yes, you’re right, that solution conflicts with the definition. So you are correct that it is a bug in the test cases and the grader, since both accepted my solution which produced that answer.

I’m not a mentor for this course, so don’t have access to the github repo to file a bug, but I will send a link to this thread to someone on the course staff.

Thanks for the detailed analysis.

Sure!

Thanks for your responses (in this thread and in general)