Efficiently Updatable Neural Networks

One concept that I’ve seen in Shogi and then Chess (Stockfish) is the idea of NNUE models, that is, ‘Efficiently Updatable Neural Networks’.
(Can’t include links, but the basic premise is that you have a feature set, and subsequent invocations of your model only change the input features by a very small amount, say flipping one binary input from 1 to 0.)
I’m a complete beginner, but in a different field I may have a similar use-case where I know that, between invocations of the model, my input parameters will vary only slightly. In the case of NNUE in both Stockfish and Shogi, the NN implementations are hand-crafted (and get into low-level code, e.g. taking advantage of AVX2, since shifting data to/from the GPU would otherwise kill performance). I was wondering whether anything like this optimisation has been done in TensorFlow/PyTorch? (And, are there any examples?)
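To make the premise concrete, here is a rough toy sketch of the trick as I understand it (made-up sizes and names, nothing to do with the real Stockfish code):

```python
import torch

# Toy sketch of the incremental-update idea (hypothetical sizes, not real NNUE code):
# keep the first layer's output (the "accumulator") between calls and patch it when a
# single binary input flips, instead of redoing the full matrix multiply.

N_FEATURES, HIDDEN = 1000, 64
W1 = torch.randn(HIDDEN, N_FEATURES)      # first-layer weights
b1 = torch.randn(HIDDEN)

x = torch.zeros(N_FEATURES)
x[[3, 42, 700]] = 1.0                     # currently active binary features

acc = W1 @ x + b1                         # full layer-1 pass, done once up front

def flip_feature(acc, idx, now_active):
    # O(HIDDEN) patch instead of an O(HIDDEN * N_FEATURES) recompute
    return acc + W1[:, idx] if now_active else acc - W1[:, idx]

acc = flip_feature(acc, 42, now_active=False)   # one input switches 1 -> 0
acc = flip_feature(acc, 43, now_active=True)    # another switches 0 -> 1

out = torch.relu(acc)                     # the rest of the (small) net runs as normal
```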

hi @MichaelL

Is it that you want to train a model in parallel?

Also, can I know what you mean by this? 👇

Also, can I know what criteria the flipping of the binary feature is based on? Is it based on efficiently updating the neural network implementation?

If you are asking about AVX2-type optimisation, then I suppose they surely do something like that in PyTorch, where data is distributed across multiple GPUs during training: the model is replicated onto each GPU and the input batch is split into smaller batches to be processed independently on each GPU. After processing, the results are gathered and combined, and the backward pass is performed to update the model parameters.
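For example, roughly like this (a minimal sketch with a toy model, just to show the DataParallel wrapper):

```python
import torch
import torch.nn as nn

# Sketch of data-parallel training (hypothetical toy model and sizes): the model is
# replicated on each GPU, each replica processes a slice of the batch, outputs are
# gathered, and the backward pass updates the single set of parameters.

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)        # splits each input batch across the GPUs
model = model.to("cuda" if torch.cuda.is_available() else "cpu")

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

device = next(model.parameters()).device
x = torch.randn(256, 128, device=device)
y = torch.randn(256, 1, device=device)

optimizer.zero_grad()
loss = loss_fn(model(x), y)               # forward is scattered/gathered automatically
loss.backward()
optimizer.step()
```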

Even in TensorFlow, this could be done in batches and using transfer learning with selective model layers: keep the base model and retrain only the remaining layers with the updated model parameters or any new features.
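Something like this sketch (toy model and hypothetical shapes, just to show freezing the base and retraining a new head):

```python
import tensorflow as tf

# Rough sketch of transfer learning with selective layers in TensorFlow/Keras:
# keep a trained "base" part frozen and retrain only the top layers when the
# parameters/features change.

base = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(100,)),
    tf.keras.layers.Dense(32, activation="relu"),
], name="base")
base.trainable = False                     # freeze the already-trained base layers

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1, activation="sigmoid", name="new_head"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(new_x, new_y, epochs=1)        # retrain only the new head on updated data
```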

Hope this helps!

Regards
DP

No, it isn’t during model training. It’s once you’ve built a model.

You have some predictions to make, but you know up-front that the input parameters won’t change much between predictions, hence the ‘efficiently updatable’.

The input features can be binary switches, say. That’s one thing. You may find that you have a set of binary features and, between predictions, the feature set changes only very slowly…
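As a toy illustration of what I mean (the feature meanings are made up):

```python
import numpy as np

# Each feature is a 0/1 switch, and only a couple of them flip from one prediction to
# the next, so most of the first layer's work could in principle be reused. Note the
# dimensionality never changes; only the values of a few inputs do.

features_t0 = np.zeros(1000, dtype=np.float32)
features_t0[[3, 42, 700]] = 1.0            # features active at prediction t

features_t1 = features_t0.copy()           # next prediction: almost identical input
features_t1[42] = 0.0                      # one switch turns off...
features_t1[43] = 1.0                      # ...and another turns on

changed = np.flatnonzero(features_t0 != features_t1)
print(changed)                             # -> [42 43]: only two inputs differ
```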

Even in this case the model is built first, and the efficient updating of parameters would be based on how the model was trained on multiple GPUs independently, then combining the results to get the updated parameters.

I still haven’t got a response to my query about the hand-made NN implementation and low-level code; please elaborate on this. Also, when you talk about predictions and efficient updating, are you talking about AI models or statistical models?

Can you please provide an example scenario of what you mean by a binary feature switch? I cannot quite follow it. Would switching a feature from 1 to 0 or 0 to 1 mean you’re creating a model that switches feature dimensionality?