Well, the shapes may be correct in the early iterations, but the error may only show up when you reach the end of a row or column. Here’s the best post I know of that describes in words how the algorithm works. Please have a careful look at it and then compare that to what your code is currently doing.
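To illustrate the kind of thing I mean, here is a minimal sketch. It assumes your code slides a fixed-size window over a 2D grid with a stride, which may not match your case exactly. The function `window_shapes` and its `buggy` flag are hypothetical names I made up for the illustration, not anything from the assignment. The point is only that Python slicing does not raise when a slice runs past the end, so the first windows look fine and the short ones only appear at the edges.

```python
def window_shapes(n_rows, n_cols, f, stride, buggy):
    """Return the (height, width) of every f x f window taken from a grid."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    if buggy:
        # Hypothetical mistake: step over *input* positions all the way to
        # the edge, so the last windows hang off the end of the row/column.
        row_starts = range(0, n_rows, stride)
        col_starts = range(0, n_cols, stride)
    else:
        # Loop over *output* positions and derive each start from the stride.
        n_out_r = (n_rows - f) // stride + 1
        n_out_c = (n_cols - f) // stride + 1
        row_starts = [i * stride for i in range(n_out_r)]
        col_starts = [j * stride for j in range(n_out_c)]
    shapes = []
    for r in row_starts:
        for c in col_starts:
            # Slicing past the end does not raise; it just returns fewer items.
            window = [row[c:c + f] for row in grid[r:r + f]]
            shapes.append((len(window), len(window[0])))
    return shapes

buggy = window_shapes(5, 5, f=3, stride=2, buggy=True)
fixed = window_shapes(5, 5, f=3, stride=2, buggy=False)
print(buggy[:2])   # [(3, 3), (3, 3)]  early windows look fine
print(buggy[2])    # (3, 1)            last column is truncated
print(set(fixed))  # {(3, 3)}          every window is full size
```

If you print the shape of each slice inside your own loops, you should be able to see at which row or column index it stops matching what you expect.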
Let us know how it goes.