Linear Algebra in Deep Learning

Hello everyone,
Here are some of the mathematical topics I would like to share with you all.
Scalars, Vectors, Matrices & Tensors
Scalar: A single number (e.g., n ∈ ℕ).
Vector: A 1D array (x = [x₁, x₂, …, xₙ]); represents points in space.
Matrix: A 2D array (A ∈ ℝ^(m×n)); used in transformations.
Tensor: A generalized multi-dimensional array; essential for deep learning models.
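
To make these concrete, here is a minimal NumPy sketch (NumPy is my choice of library here, not something required by the definitions) showing each object and its shape:

```python
import numpy as np

scalar = 3.5                                  # a single number
vector = np.array([1.0, 2.0, 3.0])            # 1D array, shape (3,)
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])               # 2D array, shape (2, 2)
tensor = np.zeros((2, 3, 4))                  # 3D array, e.g. height x width x channels

print(vector.shape, matrix.shape, tensor.shape)  # (3,) (2, 2) (2, 3, 4)
```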

Matrix Operations in ML & DL
Hadamard Product (A ⊙ B) → Element-wise multiplication (used in LSTMs, image processing).
Dot Product (xᵀy) → Measures similarity (cosine similarity in NLP, embeddings).
Linear Equations (Ax = b) → Used in solving ML optimization problems.
Identity & Inverse Matrix: AA⁻¹ = Iₙ, but computing A⁻¹ explicitly is avoided in ML due to numerical instability.
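
A short sketch of these operations in NumPy (assuming NumPy; the matrices and vectors are arbitrary examples), including the usual advice to call a solver rather than forming A⁻¹:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])
x = np.array([1.0, 0.0])
y = np.array([0.6, 0.8])

hadamard = A * B              # element-wise (Hadamard) product A ⊙ B
dot = x @ y                   # dot product xᵀy, here 0.6

b = np.array([1.0, 1.0])
# Prefer a linear solver over computing A⁻¹ explicitly;
# it is cheaper and numerically more stable.
sol = np.linalg.solve(A, b)
assert np.allclose(A @ sol, b)
```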

Norms & Distance Measures
L1 Norm (∥x∥₁) → Sum of absolute values; used in sparse models & feature selection.
L2 Norm (∥x∥₂) → Euclidean distance; used in regularization (Ridge Regression, weight decay in NNs).
Max Norm (∥x∥∞) → Maximum absolute value; controls outliers in optimization.
Frobenius Norm → Measures matrix size; used in PCA & covariance matrices.
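
All four norms are available through np.linalg.norm; a minimal sketch, assuming NumPy and arbitrary example values:

```python
import numpy as np

x = np.array([3.0, -4.0])
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

l1   = np.linalg.norm(x, 1)       # 7.0 — sum of absolute values
l2   = np.linalg.norm(x, 2)       # 5.0 — Euclidean length
lmax = np.linalg.norm(x, np.inf)  # 4.0 — max absolute value
fro  = np.linalg.norm(A, 'fro')   # Frobenius norm, sqrt(1+4+9+16) ≈ 5.48
```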

Special Matrices & ML Applications
Diagonal Matrix (Dᵢⱼ = 0 for all i ≠ j) → Efficient in computation.
Symmetric Matrix (A = Aᵀ) → Used in covariance matrices.
Orthogonal Matrix (A⁻¹ = Aᵀ) → Important in eigendecomposition, PCA, SVD.
Unit Vector (∥x∥₂ = 1) → Used in directional scaling.
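
A quick sketch verifying these properties numerically (assuming NumPy; the rotation matrix is just one convenient example of an orthogonal matrix):

```python
import numpy as np

D = np.diag([1.0, 2.0, 3.0])                     # diagonal matrix
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])                       # symmetric: S == Sᵀ
theta = np.pi / 4                                # a 2D rotation is orthogonal
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = np.array([3.0, 4.0])
u = x / np.linalg.norm(x)                        # unit vector, ∥u∥₂ = 1

assert np.allclose(S, S.T)
assert np.allclose(Q.T @ Q, np.eye(2))           # Q⁻¹ = Qᵀ
assert np.isclose(np.linalg.norm(u), 1.0)
```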

Happy to share my understanding. Do connect if you have any doubts or want to share knowledge on the topic.
Best Regards,
Arif

7 Likes

Here is a thought:

It seems we use a “matrix” object either as

  • a 2-dimensional storage space of scalars (actually a tree of height 2, row first, column second, with scalars as leaves), similar to the usual multi-dimensional arrays that have been with us, with various indexing conventions, since early FORTRAN
  • a representation M of a linear transformation from a vector space to a vector space: x = M*y

The same goes for a “tensor” object, but the idea of using it as storage space is much more prevalent. A “tensor” object as a representation of a multilinear transformation that takes multiple vectors as input and produces another tensor, vector, or scalar as output seems rare. In particular, if the depth dimension is, for example, storing the values of the R, G, B channels, interpreting it as a multilinear map would make no sense.

Also, old books:

5 Likes

@Arif, here is a warning about self-promotion in forum posts (Re: Code of Conduct).

I suppose we can survive a Medium post :thinking:

Self-promotion would be “selling stuff”

I have no clue what any of that is. My challenge now is to go figure it out. :joy::saluting_face:

@AcesRwilD05, don’t be worried. As you move through the introductory courses, the information you need will be presented.

I was just sharing my understanding, with no intention of promotion. Nevertheless, I’ve removed the link. I appreciate you updating me on this.

1 Like

I do agree

@Tmosh, Thanks! Yes, I’m getting to understand it much better so far in Python for Beginners with Andrew. I was already taking another course with him when I found this one. It has been very helpful so far. Very intriguing stuff. A tad more sophisticated than the JavaScript I remember as a kid. Lol

1 Like

I’m hoping for a future where we can use Julia, but currently we are in a different timeline :joy:

Scalar is just a natural number??

1 Like

Scalar is just a natural number??

Yes, it is a “value” with no further structure.

It could be a natural number (i.e. an integer >= 0), a real number, a complex number, etc.

See also this (where links to physics are brought in, too):

1 Like

Hi,

I’d like to get your advice on your job experience and the courses you’ve taken. Can you help me please?

I’ve connected with you on LinkedIn

1 Like

Hi @rezakhanahmadi342
Sure, we can connect and discuss this.

Hello everyone,
Today I would like to share about eigenvectors and eigenvalues.

Linear algebra talks about transformation functions. In this context, eigenvectors are vectors (different from the null vector) whose direction remains unchanged under a transformation, even though the vector may get stretched or shrunk. The scalar factor by which an eigenvector is stretched or shrunk is its eigenvalue.

Note: A pure 2D rotation (by an angle other than 0° or 180°) has no real eigenvectors, as it does not leave any nonzero vector in its original direction; its eigenvalues are complex rather than real.

Mathematics of Eigenvalues and Eigenvectors
For a square matrix A, an eigenvector v and its corresponding eigenvalue λ satisfy the equation:
Av=λv
where:
A: Square matrix
v: Eigenvector (a non-zero vector)
λ: Eigenvalue (a scalar)

This equation means that when the matrix A acts on the eigenvector v, the result is a scalar multiple of v.

Graphical Representation of Eigenvalues and Eigenvectors
Consider a vector v in its original direction.
After the transformation:

  • The matrix A transforms v into Av.
  • If v is an eigenvector, Av will lie on the same line as v, scaled by the eigenvalue λ.
  • If λ > 1, Av is longer than v.
  • If 0 < λ < 1, Av is shorter than v.
  • If λ < 0, Av points in the opposite direction (a numeric sketch of these cases follows below).
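
Here is that sketch (assuming NumPy; the diagonal matrix is an arbitrary example whose eigenvectors are the standard basis vectors):

```python
import numpy as np

# Diagonal example: eigenvalues 2, 0.5, -1 with eigenvectors e1, e2, e3.
A = np.diag([2.0, 0.5, -1.0])
for i, lam in enumerate([2.0, 0.5, -1.0]):
    v = np.eye(3)[i]          # standard basis vector e_{i+1}
    Av = A @ v
    print(lam, Av)            # Av = λv: stretched, shrunk, or flipped
    assert np.allclose(Av, lam * v)
```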

Finding Eigenvalues and Eigenvectors
Let us consider the matrix:
A = [[3, 1], [0, 2]].

Step 1: Find the Eigenvalues

The eigenvalues λ are found by solving the characteristic equation:
det(A − λI) = 0,
where I is the identity matrix.
Subtract λI from A:

A − λI = [[3−λ, 1], [0, 2−λ]].

Compute the determinant:

det(A − λI) = (3−λ)(2−λ) − (1)(0).

Simplify:

det(A − λI) = (3−λ)(2−λ).

Set the determinant to zero and solve for λ:

(3−λ)(2−λ) = 0.

The solutions are:

λ₁ = 3, λ₂ = 2.

So, the eigenvalues are λ₁ = 3 and λ₂ = 2.

Step 2: Find the Eigenvectors

For each eigenvalue, solve the equation:
(A − λI)v = 0,
where v is the eigenvector.

Eigenvector for λ₁ = 3:

Substitute λ = 3 into A − λI:

A − 3I = [[3−3, 1], [0, 2−3]] = [[0, 1], [0, −1]].

Solve (A − 3I)v = 0:

[[0, 1], [0, −1]] [v₁, v₂]ᵀ = [0, 0]ᵀ.

This simplifies to:

v₂ = 0.

The eigenvector v is of the form:

v = [v₁, 0]ᵀ.

Let v₁ = 1, so the eigenvector is:

v₁ = [1, 0]ᵀ.

Eigenvector for λ₂ = 2:

Substitute λ = 2 into A − λI:

A − 2I = [[3−2, 1], [0, 2−2]] = [[1, 1], [0, 0]].

Solve (A − 2I)v = 0:

[[1, 1], [0, 0]] [v₁, v₂]ᵀ = [0, 0]ᵀ.

This simplifies to:

v₁ + v₂ = 0 ⇒ v₁ = −v₂.

The eigenvector v is of the form:

v = [−v₂, v₂]ᵀ.

Let v₂ = 1, so the eigenvector is:

v₂ = [−1, 1]ᵀ.

In summary:

  • Eigenvalues: λ₁ = 3, λ₂ = 2.
  • Eigenvectors: v₁ = [1, 0]ᵀ, v₂ = [−1, 1]ᵀ.
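
As a quick check, here is a sketch (assuming NumPy) that verifies the result with np.linalg.eig. Note that eig returns unit-length eigenvectors as columns, so the second one appears as [−1, 1]ᵀ/√2:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

# eig returns the eigenvalues and unit-length eigenvectors (as columns).
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)    # [3. 2.]
print(eigenvectors)   # columns proportional to [1, 0] and [-1, 1]

# Verify A v = λ v for each eigenpair.
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)
```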

Regards,
Arif

1 Like