nn.ReLU vs F.relu

Starting in this second course we suddenly seem to switch from nn.ReLU() to F.relu(), seemingly without explanation. Why? And what is the difference?



Hope this helps. It's the answer from the PyTorch Discourse forum.


I found some additional context provided here:

The example in the linked article, and the explanation pasted above, suggest that the two are more or less interchangeable, largely a question of style. My understanding is that with the functional interface you are explicitly invoking the function, while with the class instance you are invoking the object's `__call__()` method, which forwards to the functional version under the covers.

If you are building a model and want the activation function included in the forward prop automatically, you would only use the module style. If you are working at a lower level of abstraction and are controlling the activation invocation yourself, you can use either.
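To make the point above concrete, here is a minimal sketch (my own illustration, not from the course) showing that the module style and the functional style compute the same thing when the weights match:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 8)

# Module style: nn.ReLU() is registered as a submodule, so it runs
# automatically as part of the forward pass of the Sequential.
module_style = nn.Sequential(nn.Linear(8, 8), nn.ReLU())

# Functional style: you invoke the activation yourself in forward().
class FunctionalStyle(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 8)

    def forward(self, x):
        return F.relu(self.fc(x))

func_style = FunctionalStyle()
# Copy the Linear layer's weights so both models are identical.
func_style.fc.load_state_dict(module_style[0].state_dict())

# With identical weights, the two styles produce identical outputs.
assert torch.equal(module_style(x), func_style(x))
```

The practical difference is just where the activation lives: as a registered (stateless) submodule, or as an explicit call in your own code.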

regards, k_


Hi,

I noticed the topic and wanted to add that there is an aspect of this in Course 3 on Quantization-Aware Training (QAT):
In QAT you should use nn.ReLU instead of F.relu, since a module-based ReLU is registered as a submodule and can be fused with preceding layers, while the functional version cannot participate in fusion. In other words, nn.ReLU plays nicely with quantization; F.relu does not.
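As a sketch of why the module form matters for fusion (my own example, assuming the `torch.ao.quantization.fuse_modules` API): fusion is specified by submodule *names*, so only an activation that exists as a named `nn.ReLU` submodule can be fused with the layer before it. An `F.relu` call inside `forward()` has no name for the fuser to find.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import fuse_modules

class ConvBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()  # a named submodule, so fusion can locate it

    def forward(self, x):
        return self.relu(self.conv(x))

m = ConvBlock().eval()
# Fuse the (conv, relu) pair into a single fused module; the relu slot
# is replaced by an Identity so forward() still works unchanged.
fused = fuse_modules(m, [["conv", "relu"]])

assert isinstance(fused.relu, nn.Identity)
out = fused(torch.randn(1, 3, 8, 8))
```

Had the block used `F.relu(self.conv(x))` instead, there would be no `"relu"` entry in the module hierarchy to pass to `fuse_modules`, so the conv and the activation would have to be quantized separately.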

Cheers


I haven’t looked under the covers to see what QAT does here, but my assumption is that, consistent with the content pasted by @lukmanaj above, it leverages the fact that the nn layer is a registered, named module object, whereas the function just operates on the Tensor it is passed and lacks that additional context or capability. @Nevermnd, hopefully this helps resolve both the “why” and the “what” parts of your post. Cheers


Yes, thanks that helps.