Is it possible to have a mix of activation functions in a hidden layer

Is it possible to specify different activation functions for the units in a single hidden layer using TensorFlow?

I tried asking Google this question and got what looks like a good answer:

“does tensorflow support mixing activation functions in a single hidden layer”

The answer says that the Dense API doesn’t directly support that, but it shows how to achieve the same effect with a slightly more complex setup.

One day I will create a new Dense class that supports the specification of activation function for each unit in a layer. It might be interesting to combine different activation functions to leverage the pros of each one and maybe provide better predictions and faster training.

A reasonable first step down that path would be to use the techniques described in that google answer to try some experiments with such mixing and see if and when it provides any of the advantages that you hypothesize.
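A minimal sketch of what such an experiment might look like, assuming the "google answer" uses the usual split-and-concatenate trick (the layer sizes and activation choices here are just illustrative): a single linear Dense layer computes all the pre-activations, the output is sliced, a different activation is applied to each slice, and the pieces are concatenated back into one "mixed" hidden layer.

```python
import tensorflow as tf

# Hedged sketch: a 6-unit hidden "layer" where the first 3 units use
# relu and the last 3 use tanh. One linear Dense layer produces all
# pre-activations; we slice, activate each slice differently, and
# concatenate the results.
inputs = tf.keras.Input(shape=(4,))
pre = tf.keras.layers.Dense(6, activation=None)(inputs)  # linear pre-activations
relu_part = tf.keras.layers.Activation("relu")(pre[:, :3])
tanh_part = tf.keras.layers.Activation("tanh")(pre[:, 3:])
hidden = tf.keras.layers.Concatenate()([relu_part, tanh_part])
outputs = tf.keras.layers.Dense(1)(hidden)
model = tf.keras.Model(inputs, outputs)
```

From here you could train this model and a plain 6-unit relu baseline on the same data and compare loss curves and training time.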

If your research is fruitful, you can publish the results and perhaps we will then be recommending Learmonth layers, instead of the traditional Dense layers. :nerd_face:


I’m a bit reluctant to do that given the “…complex setup…” you mentioned.

Creating another Dense class which accepts multiple different activation functions is quite trivial.

You can create a custom dense layer as shown here.

I cannot see, from following that link, any way to specify activation functions for specific units. Have you succeeded in doing this using that link?

I think it would be really cool if the Dense class could be modified to have a parameter to specify one activation function for all units in one hidden layer OR a list of different activation functions specifically mapped to each unit in that hidden layer.

Shouldn’t be too difficult to do that; then I can experiment to see whether a mix of activation functions in a layer has a beneficial impact on prediction results and training time.

The link shared with you shows how to create a custom layer. You can do anything with the input, including custom activations.
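To make that concrete, here is one possible sketch of such a custom layer (the name `MixedDense` and its interface are my own invention, not from the linked example): it accepts one activation per unit, builds a standard kernel and bias, and applies each unit's activation to that unit's own pre-activation column.

```python
import tensorflow as tf

# Hypothetical custom layer: Dense-like, but with one activation per
# output unit instead of one activation for the whole layer.
class MixedDense(tf.keras.layers.Layer):
    def __init__(self, activations, **kwargs):
        super().__init__(**kwargs)
        # One activation per unit, given by name (e.g. "relu") or callable.
        self.acts = [tf.keras.activations.get(a) for a in activations]
        self.units = len(self.acts)

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="glorot_uniform", name="w")
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", name="b")

    def call(self, x):
        z = tf.matmul(x, self.w) + self.b  # pre-activations for all units
        # Apply each unit's activation to its own column, then re-join.
        cols = [act(z[:, i:i + 1]) for i, act in enumerate(self.acts)]
        return tf.concat(cols, axis=-1)

# Usage: a 3-unit layer with three different activations.
layer = MixedDense(["relu", "tanh", "sigmoid"])
```

Looping over columns is not the fastest possible implementation, but it keeps the per-unit mapping explicit, which seems right for an experiment.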

That said, the Dense layer does have an activation parameter, which can be set by name or to a custom function in which you can chain activation functions as well.
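One possible reading of "chaining" (my interpretation, not necessarily the poster's): pass a custom callable as the Dense activation that composes two built-in activations. Note that this applies the same composed function to every unit, which is different from the per-unit mixing discussed above.

```python
import tensorflow as tf

# Chained activation: relu first, then tanh, applied to every unit.
def chained(z):
    return tf.keras.activations.tanh(tf.keras.activations.relu(z))

layer = tf.keras.layers.Dense(4, activation=chained)
```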

What do you mean by “…chain activation functions…”?

I think developing an algorithm which optimally selects the mix of activation functions for the best outcome is key.

You can do anything to generate output from the input.

I’m not really sure what you mean by “…anything…”. Can you be more specific?

“Anything” refers to any transformation of the inputs.