How to update first layer weights

I’m trying to build a neural network, without using any deep learning library, that recognizes digits in the MNIST database. Its structure is: 784 input neurons, 10 hidden neurons (only 1 hidden layer) and 10 output neurons. There are 10 biases for the hidden layer.

I think I know how to update the last layer’s weights, but not the first layer’s, because the last layer’s weights also influence the result. I don’t know how to update the biases either. If I made any mistake in the last layer update, please let me know.

Here’s the code:

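#forward propagation
def forward(inp, w1, w2, biases):
    hidsRes = []
    outRes = []

    #hidden layer: weighted sum of the inputs, plus bias, then ReLU
    for i in range(len(w1)):
        n = np.dot(inp, w1[i]) #dot product (assumes numpy is imported as np)

        n += biases[i]
        n = relu(n)
        hidsRes.append(n)

    #output layer: weighted sum of the hidden activations (no biases)
    for i in range(len(w2)):
        n = np.dot(hidsRes, w2[i])
        outRes.append(n)

    return softmax(outRes)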

def back(avgResult, w1, w2, lr):
    for i, w in enumerate(w2):
        w2[i] += lr * avgResult[i] #I only update the last layer based on the average error of each neuron

def train(inps, hids, outs, randomWeightDiff, batchs, gens, lr):
    w1, w2, b = initNn(inps, hids, outs, randomWeightDiff)

    #loading the mnist dataset
    x_train, x_test, y_train, y_test = getData()

    for gen in range(gens):
        errors = []

        x_train, y_train = shuffle(x_train, y_train)
        for batch in range(batchs):  
            prediction = forward(tolist(x_train[batch].tolist()), w1, w2, b)
            y = y_train[batch]

            target = [0 if i != y else 1 for i in range(10)]

            errors.append([prediction[i] - target[i] for i in range(10)])


        #average error of each output neuron over the whole batch
        avg = [sum(e[i] for e in errors) / len(errors) for i in range(10)]

        back(avg, w1, w2, lr)
        print(f"Generation {gen}\n{avg}")

train(784, 10, 10, 2, 100, 1000, 0.01)

I tried simulating a lot of neural networks and mutating the best ones, but it was too slow and it was not working.

By the way, I haven’t learned advanced maths yet.

It appears that the method you’re using is a genetic evolution model.

That is a method, but it isn’t the one that is used in DLAI courses.

To implement backpropagation in native code without any ML packages, you need to know a bit of calculus so you understand how the gradients are computed, and then have a bit of understanding about cost optimizers that are based on gradient descent.

The Machine Learning Specialization teaches these fundamentals.
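To make the idea concrete, here is a minimal sketch of one gradient-descent step for a network shaped like the one in the question (one ReLU hidden layer with biases, softmax output without biases). It assumes a cross-entropy loss, which the question does not name; under that assumption the output-layer error is simply `prediction - target`. The names `backward`, `d_out`, `d_hid` and `cross_entropy` are made up for this illustration, and the tiny 4-3-2 network stands in for 784-10-10 only to keep the demo fast. The key points: each weight moves by the error of the neuron it feeds into multiplied by the activation coming into it, the update is subtracted, and the hidden-layer error is the output error passed back through `w2`.

```python
import math
import random

def relu(x):
    return x if x > 0 else 0.0

def softmax(zs):
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def forward(inp, w1, w2, biases):
    # Also returns the hidden activations, because the backward pass needs them.
    hid = [relu(sum(x * w for x, w in zip(inp, w1[j])) + biases[j])
           for j in range(len(w1))]
    out = [sum(h * w for h, w in zip(hid, w2[i])) for i in range(len(w2))]
    return hid, softmax(out)

def cross_entropy(pred, target):
    return -sum(math.log(p) for p, t in zip(pred, target) if t > 0)

def backward(inp, target, hid, pred, w1, w2, biases, lr):
    # Assumption: softmax + cross-entropy, so the output error is pred - target.
    d_out = [p - t for p, t in zip(pred, target)]
    # Error reaching each hidden neuron, computed BEFORE w2 is modified.
    # ReLU's derivative is 1 where the neuron was active and 0 elsewhere.
    d_hid = [sum(d_out[i] * w2[i][j] for i in range(len(w2))) if hid[j] > 0 else 0.0
             for j in range(len(hid))]
    # Last layer: error of output neuron i times hidden activation j.
    for i in range(len(w2)):
        for j in range(len(hid)):
            w2[i][j] -= lr * d_out[i] * hid[j]
    # First layer: error of hidden neuron j times input k; the bias uses the error alone.
    for j in range(len(w1)):
        for k in range(len(inp)):
            w1[j][k] -= lr * d_hid[j] * inp[k]
        biases[j] -= lr * d_hid[j]

# Tiny demo on a single made-up example.
random.seed(0)
inp = [0.5, 0.1, 0.9, 0.3]
target = [0, 1]
w1 = [[random.uniform(0.0, 0.5) for _ in inp] for _ in range(3)]
w2 = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
biases = [0.0, 0.0, 0.0]

hid, pred = forward(inp, w1, w2, biases)
loss_before = cross_entropy(pred, target)
for _ in range(50):
    hid, pred = forward(inp, w1, w2, biases)
    backward(inp, target, hid, pred, w1, w2, biases, lr=0.1)
hid, pred = forward(inp, w1, w2, biases)
loss_after = cross_entropy(pred, target)
print(loss_before, loss_after)
```

For a mini-batch, one option is to compute these per-example gradients and average them before applying the update. Averaging only the output errors is not enough, because each weight’s gradient also depends on the activation that was fed into it for that particular example.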

@bonjour I would also add that, although I am traditionally more of a C guy, one of the great things I learned from the DLS course is vectorization.

I mean, for-loops are really nothing to be scared of in C, but in Python they are horribly slow, because the interpreter executes every iteration dynamically and re-checks the types of the values each time.
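As a rough illustration of what vectorization means here, the sketch below computes the same forward pass twice with NumPy: once neuron by neuron with explicit loops, and once with whole-matrix operations. The data is random stand-in data rather than MNIST, the function names are invented for this example, and the weight matrices are laid out as (inputs, neurons), which is transposed relative to the lists in the question.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 784))            # 20 flattened "images" (random stand-ins)
W1 = rng.normal(0, 0.1, (784, 10))   # input -> hidden weights
b1 = np.zeros(10)                    # hidden biases
W2 = rng.normal(0, 0.1, (10, 10))    # hidden -> output weights

def forward_loop(X, W1, b1, W2):
    """Pre-softmax outputs, computed one multiplication at a time."""
    out = np.zeros((X.shape[0], W2.shape[1]))
    for n in range(X.shape[0]):
        hid = np.zeros(W1.shape[1])
        for j in range(W1.shape[1]):
            s = b1[j]
            for k in range(X.shape[1]):
                s += X[n, k] * W1[k, j]
            hid[j] = max(s, 0.0)     # ReLU
        for i in range(W2.shape[1]):
            out[n, i] = sum(hid[j] * W2[j, i] for j in range(W1.shape[1]))
    return out

def forward_vectorized(X, W1, b1, W2):
    """Same computation for the whole batch with two matrix products."""
    hid = np.maximum(X @ W1 + b1, 0.0)
    return hid @ W2

same = np.allclose(forward_loop(X, W1, b1, W2), forward_vectorized(X, W1, b1, W2))
print(same)
```

Both functions do the same arithmetic; the vectorized one hands the loops over to NumPy’s compiled routines, which is where the speed-up comes from. Wrapping each call in `time.perf_counter()` is an easy way to compare them on your own machine.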

Alternatively, do check out Numba.

It isn’t fast on the first call, because that is when Numba compiles your loops to machine code; on subsequent calls it invokes the compiled version rather than the Python code.

The result: much faster :smiley:
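A minimal sketch of how that might look for one layer of the network, assuming Numba is installed. The `dense_relu` function and the random data are invented for this example, and the `try`/`except` fallback is only there so the snippet still runs (as plain, slow Python) when Numba is missing.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback for this sketch only: leave the function as ordinary Python.
        return func

@njit
def dense_relu(x, w, b):
    """One layer: for each neuron, weighted sum of x plus bias, then ReLU."""
    out = np.zeros(w.shape[0])
    for j in range(w.shape[0]):
        s = b[j]
        for k in range(x.shape[0]):
            s += x[k] * w[j, k]
        out[j] = s if s > 0.0 else 0.0
    return out

x = np.random.rand(784)            # one flattened image (random stand-in)
w = np.random.rand(10, 784) - 0.5  # one row of weights per hidden neuron
b = np.zeros(10)

first = dense_relu(x, w, b)   # with Numba present, this call includes compilation
second = dense_relu(x, w, b)  # later calls reuse the compiled code
```

The loop body is written exactly as it would be in plain Python; the decorator is the only change. Numba works best with NumPy arrays and simple numeric code, so nested Python lists like the ones in the question would need converting first.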
