Can an LLM be approximated by 2 or more low-rank matrices? I understand that the activation terms could be a major challenge. Nevertheless, are there low-rank matrices whose products and sums would approximate an LLM?
In part, yes. Low-rank matrix factorization can be applied to approximate specific weight matrices within a large language model (LLM), but approximating the model's overall functionality is much harder: the non-linear activations and the interactions between layers may not be fully captured by a low-rank representation, particularly for complex language-processing tasks.
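Purely as an illustration of the first half of that answer (approximating a single weight matrix in isolation), here is a minimal NumPy sketch using truncated SVD; the matrix, its shape, and the chosen rank are made-up stand-ins, not values from any real model:

```python
import numpy as np

# Stand-in for one dense weight matrix from a transformer layer
# (shape and values are arbitrary, purely for illustration).
rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 256))

# Truncated SVD: keep only the top-r singular directions.
r = 32
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * S[:r]        # shape (1024, r)
B = Vt[:r, :]               # shape (r, 256)
W_approx = A @ B            # rank-r approximation of W

rel_err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(f"rank-{r} relative Frobenius error: {rel_err:.3f}")

# Storage comparison: full matrix vs. the two low-rank factors.
print("original params:", W.size, "| factored params:", r * (W.shape[0] + W.shape[1]))
```

Note that a random matrix like this has a fairly flat singular-value spectrum, so the error at a given rank will be pessimistic; how well this works on real trained weights depends entirely on their spectra. And it only addresses individual matrices, not the non-linear composition the rest of the thread is about.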
Thank you, @aryan010204, for indulging my curiosity.
I am making a purely intuitive argument. ReLU activations are piecewise linear (unless I use a fancy one). If I only use the old classical ReLU as my non-linear function, why won't the overall function (the NN) be piecewise linear? Intuitively, the whole NN should itself be piecewise linear. I might have to stitch a bunch of matrices together the way a ReLU is stitched together (a zero function for negative values and a linear term for positive values). It feels like one is dealing with a bunch of hyperplanes in a "positive octant" (what is an octant in 60,000 dimensions, an orthant, I suppose).
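To make that intuition concrete, here is a small NumPy sketch (a toy two-layer network with random weights; everything here is made up for illustration). Within a region where the ReLU activation pattern is fixed, the network collapses to a single affine map W_eff x + b_eff, which is exactly the "stitched matrices" picture:

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny 2-layer ReLU MLP with random weights (purely illustrative).
W1, b1 = rng.standard_normal((16, 8)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((4, 16)), rng.standard_normal(4)

def mlp(x):
    h = np.maximum(W1 @ x + b1, 0.0)   # ReLU hidden layer
    return W2 @ h + b2

x = rng.standard_normal(8)

# The activation pattern (which ReLUs are "on") selects one linear piece.
mask = (W1 @ x + b1 > 0).astype(float)

# Within that piece the network is exactly an affine map.
W_eff = W2 @ (np.diag(mask) @ W1)
b_eff = W2 @ (mask * b1) + b2

# A small perturbation that (almost surely) keeps the same activation pattern.
x2 = x + 1e-4 * rng.standard_normal(8)
assert np.all((W1 @ x2 + b1 > 0).astype(float) == mask)

print(np.allclose(mlp(x2), W_eff @ x2 + b_eff))   # True: same linear piece
```

The number of such pieces grows very quickly with depth and width, which is where the practical difficulty of "writing the NN down as a few matrices" comes from.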
The only thing that remains is the softmax. Does it have a Taylor expansion? (It is smooth, so it must.) There may also be a piecewise linear approximation to the softmax.
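For the Taylor-expansion part, here is a minimal NumPy check (toy logits, no claim about how far the expansion stays accurate): the softmax Jacobian at z is diag(s) - s s^T with s = softmax(z), which gives a first-order expansion around z.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())           # shift for numerical stability
    return e / e.sum()

rng = np.random.default_rng(2)
z = rng.standard_normal(5)            # toy logits
s = softmax(z)

# Jacobian of softmax at z: J = diag(s) - s s^T
J = np.diag(s) - np.outer(s, s)

# First-order Taylor expansion around z, evaluated at a nearby point.
dz = 1e-2 * rng.standard_normal(5)
linear_approx = s + J @ dz

print("exact    :", softmax(z + dz))
print("1st order:", linear_approx)
print("max abs error:", np.max(np.abs(softmax(z + dz) - linear_approx)))
```

So locally the softmax linearizes fine; the open question is how many such local pieces (or Taylor patches) you would need to cover the inputs an LLM actually sees.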