Style function gram matrix : correlation vs. prevalence texture and patterns

Nanini · August 23, 2023, 3:10pm

In the last assignment of the CNN course, in the style transfer part, the following is written:

"
$G_{(gram)ij}$: correlation
The result is a matrix of dimension $(n_C, n_C)$, where $n_C$ is the number of filters (channels). The value $G_{(gram)ij}$ measures how similar the activations of filter $i$ are to the activations of filter $j$.

$G_{(gram)ii}$: prevalence of patterns or textures
The diagonal elements $G_{(gram)ii}$ measure how “active” a filter $i$ is.
For example, suppose filter $i$ is detecting vertical textures in the image. Then $G_{(gram)ii}$ measures how common vertical textures are in the image as a whole.
If $G_{(gram)ii}$ is large, this means that the image has a lot of vertical texture.
By capturing the prevalence of different types of features ($G_{(gram)ii}$), as well as how much different features occur together ($G_{(gram)ij}$), the Style matrix $G_{gram}$ measures the style of an image.
"

I am confused: what do they mean by $G_{(gram)ii}$ and $G_{(gram)ij}$? What do the rows and columns represent in this final Gram matrix? I understand that we want to see the correlations between the filters (which contain the features/activations). But why is $G_{(gram)ii}$ for correlations and $G_{(gram)ij}$ for the prevalence of textures/patterns?

Thank you all

paulinpaloalto · August 23, 2023, 5:03pm

Your description is exactly backward from what they said in the text. The point is that the Gram Matrix is a form of “correlation” matrix between the various filters (channels). So the correlations between a given filter $i$ and the other filters are the “off diagonal” elements with $j \neq i$. The correlation of the given filter with itself just gives you the squared norm of that weight vector. I’m not sure I understand what they mean by the magnitude of that squared norm being an indication of strong patterns or textures. If the weights learned are larger in a given filter, does that mean the pattern it is detecting is stronger or weaker (needs more of a boost from the weights to be recognized)? Or maybe the fact that there are a lot of elements that are not near zero means that interesting things are happening in multiple features in the input all at once. I'm not sure of the intuition there, but maybe someone more knowledgeable will also notice this thread and comment.
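
For reference, here is a minimal NumPy sketch of the diagonal versus off-diagonal distinction. It assumes, following the quoted text ("the activations of filter $i$"), that the Gram matrix is built from the unrolled activations of a layer rather than from the learned weights, and that "correlation" means an unnormalized dot product. The function name `gram_matrix`, the toy 2x2 volume, and the three channels are made up for illustration and are not the assignment's code.

```python
import numpy as np

def gram_matrix(activations):
    """Return the (n_C, n_C) Gram matrix of an (n_H, n_W, n_C) activation volume."""
    n_H, n_W, n_C = activations.shape
    # Unroll so that each row holds all spatial activations of one channel.
    unrolled = activations.reshape(n_H * n_W, n_C).T  # shape (n_C, n_H * n_W)
    # Entry (i, j) is the dot product of channel i's and channel j's activations.
    return unrolled @ unrolled.T

# Toy 2x2 activation volume with 3 hypothetical channels.
a = np.zeros((2, 2, 3))
a[..., 0] = [[1.0, 2.0], [3.0, 4.0]]  # strongly active channel
a[..., 1] = [[1.0, 2.0], [3.0, 4.0]]  # fires in the same places as channel 0
a[..., 2] = [[0.1, 0.0], [0.0, 0.0]]  # barely active channel

G = gram_matrix(a)
print(G)
```

Under these assumptions, `G[0, 0]` is the sum of squared activations of channel 0 (large, because that channel fires strongly across the image), `G[2, 2]` is tiny because channel 2 barely fires, and `G[0, 1]` is large because channels 0 and 1 are active at the same positions. So the diagonal tracks how much a filter responds to this particular image, not how large its weights are.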

Nanini · August 24, 2023, 10:43pm

Thank you.
Yes, I hope so, because this last part is still not clear to me.

RicoRuotongJia · October 9, 2024, 9:33pm

For those who are still curious, this question is about how similarity can be measured by the Gram Matrix. I wrote a blog post about this and I hope it helps.

In short, a style can be roughly summarized as a set of patterns that appear together in the same spatial windows across an image. Some examples: calligraphers have their own distinct strokes, and painters prefer certain patterns in specific colors.
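
One way to read "patterns in the same spatial windows" is as co-occurrence of filter responses. The sketch below is only an illustration under that reading: the two filters and the four spatial positions are hypothetical, and the activations are already unrolled to shape (channels, positions).

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (n_C, n_positions) array of unrolled activations."""
    return features @ features.T

# Rows are hypothetical filters, columns are four spatial positions.
together = np.array([[1.0, 1.0, 0.0, 0.0],   # filter 0
                     [1.0, 1.0, 0.0, 0.0]])  # filter 1 fires at the same positions
apart = np.array([[1.0, 1.0, 0.0, 0.0],      # filter 0
                  [0.0, 0.0, 1.0, 1.0]])     # filter 1 fires at different positions

G_together = gram_matrix(together)
G_apart = gram_matrix(apart)
print(G_together)
print(G_apart)
```

Both matrices have the same diagonal, since each filter is equally active in both cases. Only the off-diagonal entry differs: it is nonzero when the two patterns show up at the same positions and zero when they never overlap.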
