I’m not getting the expected cost value from the gradient descent function in Exercise 05; the cost doesn’t seem to be changing much:
iters: 10 cost: 9.105433
iters: 20 cost: 8.422988
iters: 30 cost: 8.758872
iters: 40 cost: 8.333231
iters: 50 cost: 9.127138
iters: 60 cost: 8.514464
iters: 70 cost: 8.935792
iters: 80 cost: 8.442537
iters: 90 cost: 8.672371
iters: 100 cost: 8.554414
iters: 110 cost: 8.016827
iters: 120 cost: 8.074106
iters: 130 cost: 8.337460
iters: 140 cost: 8.211384
iters: 150 cost: 7.975324

Assuming you passed the other unit tests, then forward_prop, compute_cost, and back_prop all produce the expected results. That leaves the implementation of the `# update weights and biases` step.

Maybe compare your code to the equations in the lecture/reading "Training a CBOW Model: Backpropagation and Gradient Descent":

W_1 := W_1 - \alpha \frac{\partial J_{batch}}{\partial W_1}
W_2 := W_2 - \alpha \frac{\partial J_{batch}}{\partial W_2}
b_1 := b_1 - \alpha \frac{\partial J_{batch}}{\partial b_1}
b_2 := b_2 - \alpha \frac{\partial J_{batch}}{\partial b_2}

HINT: the partial derivative elements are computed in back_prop()
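For illustration only, the four update equations above translate into a short NumPy sketch like the one below. The names `grad_W1`, `grad_W2`, `grad_b1`, `grad_b2` are assumptions standing in for whatever your back_prop() returns; the key point is that each parameter takes a step of size alpha *against* its own gradient:

```python
import numpy as np

def update_parameters(W1, W2, b1, b2,
                      grad_W1, grad_W2, grad_b1, grad_b2,
                      alpha):
    # Gradient descent step: parameter := parameter - alpha * gradient.
    # Gradient names are hypothetical; use the values back_prop() gives you.
    W1 = W1 - alpha * grad_W1
    W2 = W2 - alpha * grad_W2
    b1 = b1 - alpha * grad_b1
    b2 = b2 - alpha * grad_b2
    return W1, W2, b1, b2
```

A common bug that produces a cost bouncing around like the log above is a sign or ordering mistake here (e.g. adding the gradient instead of subtracting it, or updating with the wrong gradient), since the other unit tests won't catch it.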