General question about weight matrices in Dense Layer

romcs · May 8, 2024, 8:46pm

I have a question about weight matrices in a simple dense layer.
Since the matrices are unconstrained, the number of parameters becomes huge when the input and output dimensions are high.
What if we restricted them to particular types of matrices? The dimensions of the matrix would stay the same, but the number of parameters could be much smaller.
For example, rotation matrices in some dimension, or combinations of them?
More generally, what if we restricted the matrices to a representation of a certain matrix group? Would certain groups be better for particular layers or tasks in general?
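
To make the parameter-count argument concrete, here is a minimal sketch in plain Python. It is only an illustration, not the experiments mentioned in this post: it assumes one particular restriction (a block-diagonal matrix of 2x2 rotations, i.e. one SO(2) element per pair of coordinates), and the helper names `rotation_layer_matrix` and `apply_layer` are hypothetical.

```python
import math

def rotation_layer_matrix(angles):
    """Build a block-diagonal matrix with one 2x2 rotation block per angle.

    Assumption for this sketch: the layer maps R^n to R^n with n = 2 * len(angles).
    """
    n = 2 * len(angles)
    W = [[0.0] * n for _ in range(n)]
    for k, theta in enumerate(angles):
        c, s = math.cos(theta), math.sin(theta)
        i = 2 * k
        W[i][i], W[i][i + 1] = c, -s
        W[i + 1][i], W[i + 1][i + 1] = s, c
    return W

def apply_layer(W, x):
    """Plain matrix-vector product, as in a dense layer without bias or activation."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

angles = [0.1, 0.7, -1.2]            # the only trainable parameters
W = rotation_layer_matrix(angles)    # still a full 6x6 matrix
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = apply_layer(W, x)

n = len(x)
dense_params = n * n                 # 36 for an unconstrained 6x6 matrix
rot_params = len(angles)             # 3 for the restricted matrix
```

The matrix keeps its 6x6 shape while the free parameters drop from 36 to 3. The trade-off is that this particular family preserves the norm of the input and never mixes coordinates across blocks, so it is far less expressive than a general dense layer.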

I’ve run a few experiments with the SO(2) and SO(3) groups, and wrote about them here:

TMosh · May 8, 2024, 8:51pm

You should not force any assumptions on what each layer is going to learn. The optimization will adjust the weights to minimize the cost.

romcs · May 8, 2024, 8:55pm

Sure, but isn’t the CONV layer or any other specific layer exactly forcing particular assumptions?
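
One way to look at this: a 1D convolution can be written as a dense layer whose weight matrix is restricted to a banded form with shared entries. The sketch below assumes the "valid", stride-1 cross-correlation that deep learning frameworks usually call convolution. The helper names are hypothetical.

```python
def conv1d_valid(x, kernel):
    """Stride-1 'valid' cross-correlation of x with kernel."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

def conv_as_dense_matrix(kernel, n_in):
    """Build the dense weight matrix equivalent to conv1d_valid.

    Each row holds the same kernel values, shifted one column to the right,
    and every other entry is fixed at zero.
    """
    k = len(kernel)
    n_out = n_in - k + 1
    W = [[0.0] * n_in for _ in range(n_out)]
    for i in range(n_out):
        for j in range(k):
            W[i][i + j] = kernel[j]
    return W

def dense(W, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [1.0, 0.0, -1.0]

W = conv_as_dense_matrix(kernel, len(x))   # 3x5 matrix, 15 entries
conv_out = conv1d_valid(x, kernel)
dense_out = dense(W, x)                    # same values as conv_out

matrix_entries = len(W) * len(W[0])        # 15
free_params = len(kernel)                  # 3
```

The kernel values themselves are learned freely, but the structure of the equivalent matrix (zeros outside the band, identical values along each diagonal) is fixed by the layer type rather than by optimization.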

TMosh · May 8, 2024, 9:35pm

That just specifies a convolution - it doesn’t constrain what the convolution weights will learn.
