Are the weights in a neural network always of type Float?

It never became completely clear to me what data type the weights in a neural network have. I guess it is float most of the time. But could it also be int sometimes? Or maybe something different?

I guess the use of the sigmoid and ReLU functions points clearly in the direction of floats, but what if I have an application where I want ints as an output?
But I am really just asking out of interest.

During training, weights are always real numbers, stored as floating-point values. Floats can represent integer values as well.

There are easy ways to get integer outputs. E.g. in the case of binary classification, we take the output of sigmoid and convert it to a 0/1 prediction by thresholding at 0.5:

prediction = (a > 0.5)

In the case of a multiclass classifier, you can do a similar thing with the softmax output, which gives you a probability distribution on the possible output classes: it’s easy to convert that to a “one hot” vector giving the predicted class and then convert that to a categorical value (0 to C-1, where C is the number of classes).
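Here is a minimal sketch of both post-processing steps in plain Python, in case it helps. The function names (`binary_prediction`, `multiclass_prediction`) and the example logits are made up for illustration; in a real project you would more likely use NumPy or your framework's own `argmax`.

```python
import math

def sigmoid(z):
    # Logistic function: maps a real-valued logit to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def binary_prediction(z):
    # Threshold the sigmoid output at 0.5 and cast the boolean to 0 or 1.
    return int(sigmoid(z) > 0.5)

def multiclass_prediction(logits):
    # Pick the most probable class, then build the matching one-hot vector.
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)  # value in 0..C-1
    one_hot = [1 if i == label else 0 for i in range(len(probs))]
    return one_hot, label

# Hypothetical logits for a 3-class problem.
one_hot, label = multiclass_prediction([0.1, 2.0, -1.0])
print(binary_prediction(2.0), one_hot, label)
```

Note that the weights and the intermediate activations stay floats throughout; only the final answer is turned into an integer.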

Hello @Michael_Weingran

In gradient descent, we must use floats for the weights during training. After training, however, we can reduce the size of the model, at the cost of some model performance, by converting the weights (e.g. 32-bit floats) into integers (e.g. 8-bit integers). For example, we do this before downloading the model to a microcontroller.
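To make the float-to-integer conversion concrete, here is a rough sketch of one common scheme: symmetric quantization with a single scale for the whole weight list. That choice is my assumption for illustration, not necessarily what any particular toolchain does, and the helper names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(weights):
    # Map floats to integers in [-127, 127] using one shared scale factor.
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from the stored integers.
    return [v * scale for v in q]

# Hypothetical trained weights.
weights = [0.5, -1.27, 0.0, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q, scale, restored)
```

Each restored weight differs from the original by at most about half of `scale`, which is where the small loss in model performance comes from.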

Asking for integer output is a different thing, because we can have both float weights and integer outputs at the same time. @paulinpaloalto suggested a way for classification. As for regression, you may choose to round the floating-point output to the nearest integer. Both are post-processing of the outputs after the training is finished.

Cheers,
Raymond

You’re giving me flashbacks to a project using a 4-bit microprocessor to do 16-bit math without access to a high-level language, where the only way to re-scale numbers was by rotating them to the right bitwise in an accumulator.

I didn’t think of that. You are absolutely right. Gradient descent with integers would probably not work at all. Thanks for that.

Very good point, @Michael_Weingran :)

Cheers,
Raymond