Vectors, matrices, and transformations -- the math that powers everything from 3D graphics to neural networks.
Linear algebra is the branch of mathematics that deals with vectors (ordered lists of numbers), matrices (grids of numbers), and linear transformations (functions that move, stretch, and rotate space). If regular algebra is about solving for one unknown number, linear algebra is about solving for many unknowns at once -- and understanding the structure of the space they live in.
Think of it this way: a single variable x is just a number on a number line. But in the real world, data almost never comes as a single number. An image is a grid of pixel values. A point in 3D space has three coordinates. A neural network has millions of parameters. Linear algebra gives you the tools to work with all of that at once, efficiently and elegantly.
Regular algebra deals with single numbers. Linear algebra deals with lists and grids of numbers. That's it at its core. A vector is a list, a matrix is a grid, and the operations tell you how to combine and transform them.
| CS Field | How Linear Algebra is Used |
|---|---|
| Machine Learning / AI | Neural networks are built from matrix multiplications. Training involves gradient vectors. Data lives in high-dimensional vector spaces. |
| Computer Graphics | Every rotation, scaling, and projection in 3D graphics is a matrix operation. GPUs are literally designed for matrix math. |
| Game Development | Character positions are vectors. Camera angles, physics simulations, and collision detection all use linear algebra. |
| Data Science | Datasets are matrices. PCA (dimensionality reduction) uses eigenvectors. Recommendation systems use matrix factorization. |
| Image Processing | Images are matrices of pixels. Filters (blur, sharpen, edge detect) are matrix operations called convolutions. |
| Cryptography | Many encryption schemes (like Hill cipher) rely on matrix operations and modular arithmetic. |
Linear algebra has a reputation for being abstract and hard. That reputation comes from how it's traditionally taught -- theorem-proof style with no applications. On this page we focus on what things mean and how they're used in code. You will understand this.
A vector is an ordered list of numbers. That's the whole definition. In 2D, a vector has two components. In 3D, three. In machine learning, vectors can have thousands or millions of components.
Geometrically, you can think of a 2D or 3D vector as an arrow pointing from the origin to a point. The vector [3, 2] points 3 units right and 2 units up. But vectors don't have to be spatial -- they can represent anything: colors, audio samples, user preferences, word meanings.
To add or subtract vectors, you just add or subtract their corresponding components. Both vectors must have the same number of components.
Problem: Add vectors a = [2, 5, -1] and b = [3, -2, 4]
a + b = [2+3, 5+(-2), -1+4]
a + b = [5, 3, 3]
CS context: If a player moves [2, 5, -1] then [3, -2, 4], their total displacement is [5, 3, 3].
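The displacement example above can be sketched in plain Python (the helper name add_vectors is made up for illustration, not a library function):

```python
def add_vectors(a, b):
    # Component-wise addition; both vectors need the same length.
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return [x + y for x, y in zip(a, b)]

# The player-displacement example from above:
print(add_vectors([2, 5, -1], [3, -2, 4]))  # [5, 3, 3]
```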
A scalar is just a regular number (as opposed to a vector). Multiplying a vector by a scalar multiplies every component by that number.
Problem: Multiply v = [4, -2, 6] by scalar 3
3 * v = [3*4, 3*(-2), 3*6]
3 * v = [12, -6, 18]
CS context: Scaling a velocity vector by 3 makes the object move 3 times faster in the same direction.
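Scalar multiplication is just as short in code. A minimal sketch (scale_vector is an illustrative name):

```python
def scale_vector(s, v):
    # Multiply every component by the scalar s.
    return [s * x for x in v]

print(scale_vector(3, [4, -2, 6]))  # [12, -6, 18]
```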
The magnitude (or length or norm) of a vector tells you how long it is. It's the distance from the origin to the point the vector represents.
Problem: Find the magnitude of v = [3, 4]
||v|| = √(3² + 4²)
||v|| = √(9 + 16)
||v|| = √25
||v|| = 5
CS context: If v represents a velocity, then ||v|| = 5 is the speed (how fast, ignoring direction).
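The magnitude formula translates directly to code using the standard library:

```python
import math

def magnitude(v):
    # Square root of the sum of squared components.
    return math.sqrt(sum(x * x for x in v))

print(magnitude([3, 4]))  # 5.0
```

For 2D or 3D points, math.hypot(3, 4) computes the same thing.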
A unit vector is a vector with magnitude 1. It represents a pure direction with no scaling. To make any vector into a unit vector, divide it by its magnitude.
Problem: Find the unit vector of v = [3, 4]
||v|| = 5 (from previous example)
v̂ = [3/5, 4/5] = [0.6, 0.8]
Check: √(0.6² + 0.8²) = √(0.36 + 0.64) = √1 = 1
Unit vectors come up constantly in game dev and graphics. When you want to move a character "toward the enemy" at a fixed speed, you compute the direction vector (enemy_pos - player_pos), normalize it to a unit vector, then multiply by the speed. Direction times speed equals velocity.
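That chase-the-enemy recipe can be sketched as follows (the positions and speed are made-up example values):

```python
import math

def normalize(v):
    # Divide by the magnitude to get a unit vector (pure direction).
    m = math.sqrt(sum(x * x for x in v))
    if m == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / m for x in v]

player_pos = [0.0, 0.0]
enemy_pos = [3.0, 4.0]
speed = 10.0

# Direction times speed equals velocity.
direction = normalize([e - p for e, p in zip(enemy_pos, player_pos)])
velocity = [speed * d for d in direction]
print(velocity)  # [6.0, 8.0]
```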
The dot product (or scalar product) takes two vectors and returns a single number. You multiply corresponding components and add them up.
Problem: Find the dot product of a = [2, 3, -1] and b = [4, -1, 5]
a · b = (2)(4) + (3)(-1) + (-1)(5)
a · b = 8 + (-3) + (-5)
a · b = 0
Note: When the dot product is 0 (and neither vector is the zero vector), the vectors are perpendicular (orthogonal).
The dot product tells you how much two vectors point in the same direction. It connects to the angle between them through this formula: cos(θ) = (a · b) / (||a|| ||b||).
Problem: Find the angle between a = [1, 0] and b = [1, 1]
a · b = (1)(1) + (0)(1) = 1
||a|| = √(1 + 0) = 1
||b|| = √(1 + 1) = √2
cos(θ) = 1 / (1 * √2) = 1/√2
θ = 45 degrees
The dot product is everywhere in CS. In lighting calculations, you dot the surface normal with the light direction to figure out how bright a surface is. In recommendation engines, you dot a user preference vector with a product feature vector to predict how much they'll like it. In NLP, cosine similarity (which uses dot product) measures how similar two word embeddings are.
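Both the perpendicularity check and the angle formula fit in a few lines (dot and angle_between are illustrative helper names):

```python
import math

def dot(a, b):
    # Multiply corresponding components and add them up.
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||)
    na = math.sqrt(dot(a, a))
    nb = math.sqrt(dot(b, b))
    return math.degrees(math.acos(dot(a, b) / (na * nb)))

print(dot([2, 3, -1], [4, -1, 5]))    # 0 -> perpendicular
print(angle_between([1, 0], [1, 1]))  # ~45.0
```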
The cross product takes two 3D vectors and returns a new vector that is perpendicular to both inputs. It's mainly used in 3D graphics for computing surface normals.
Problem: Find the cross product of a = [1, 0, 0] and b = [0, 1, 0]
a × b = [(0)(0) - (0)(1), (0)(0) - (1)(0), (1)(1) - (0)(0)]
a × b = [0, 0, 1]
CS context: The x-axis crossed with the y-axis gives the z-axis. This is how 3D coordinate systems are defined.
The cross product is not commutative: a × b = -(b × a). The order matters -- it flips the direction of the result. Also, the cross product only works in 3D.
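With NumPy, np.cross computes this directly, and swapping the arguments flips the sign:

```python
import numpy as np

a = np.array([1, 0, 0])  # x-axis
b = np.array([0, 1, 0])  # y-axis

print(np.cross(a, b))  # [0 0 1] -- the z-axis
print(np.cross(b, a))  # [ 0  0 -1] -- reversed order flips the direction
```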
A matrix is a rectangular grid of numbers arranged in rows and columns. Just as a vector is a list, a matrix is a table. We describe a matrix by its dimensions: an m × n matrix has m rows and n columns.
Matrices show up everywhere in CS: an image is a matrix of pixel values, a spreadsheet is a matrix, a neural network layer is defined by a weight matrix, and a 3D transformation is a 4×4 matrix.
Just like vectors, you add/subtract matrices element by element. Both matrices must have the same dimensions.
Problem: Add matrices A and B
A = | 1 2 | B = | 5 6 |
| 3 4 | | 7 8 |
A + B = | 1+5 2+6 | = | 6 8 |
| 3+7 4+8 | | 10 12 |
Multiply every element in the matrix by the scalar.
3 * | 1 2 | = | 3 6 |
| 4 5 | | 12 15 |
This is the big one. Matrix multiplication is not element-wise. It uses a "row times column" pattern. Each entry in the result is the dot product of a row from the first matrix with a column from the second matrix.
Matrix multiplication requires the inner dimensions to match. A (2×3) matrix can multiply a (3×4) matrix, giving a (2×4) result. But a (2×3) CANNOT multiply a (2×4) because 3 does not equal 2.
Problem: Multiply A (2×3) by B (3×2)
A = | 1 2 3 | B = | 7 8 |
| 4 5 6 | | 9 10 |
| 11 12 |
Result will be 2×2.
Position (1,1): Row 1 of A · Col 1 of B
= (1)(7) + (2)(9) + (3)(11) = 7 + 18 + 33 = 58
Position (1,2): Row 1 of A · Col 2 of B
= (1)(8) + (2)(10) + (3)(12) = 8 + 20 + 36 = 64
Position (2,1): Row 2 of A · Col 1 of B
= (4)(7) + (5)(9) + (6)(11) = 28 + 45 + 66 = 139
Position (2,2): Row 2 of A · Col 2 of B
= (4)(8) + (5)(10) + (6)(12) = 32 + 50 + 72 = 154
AB = | 58 64 |
| 139 154 |
Matrix multiplication is NOT commutative: AB does not equal BA. In fact, even if AB is defined, BA might not be (because the dimensions might not match the other way). This matters in graphics -- applying rotation then translation gives a different result than translation then rotation.
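The row-times-column rule can be written as a triple loop. This sketch (matmul is our own helper, not a library call) reproduces the worked example above:

```python
def matmul(A, B):
    # C[i][j] = dot product of row i of A with column j of B.
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must match"
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]
print(matmul(A, B))  # [[58, 64], [139, 154]]
```

Real libraries implement this same idea with blocked, vectorized kernels; in NumPy it's simply A @ B.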
The identity matrix (I) is the matrix equivalent of the number 1. It's a square matrix with 1s on the diagonal and 0s everywhere else. Multiplying any matrix by I leaves it unchanged.
In 3D graphics, the identity matrix represents "no transformation." When you reset a model's transform, you set it to the identity matrix. Every transformation starts from identity.
The transpose of a matrix (written Aᵀ) flips rows and columns. Row 1 becomes column 1, row 2 becomes column 2, and so on. An m×n matrix becomes n×m.
A = | 1 2 3 | Aᵀ = | 1 4 |
| 4 5 6 | | 2 5 |
| 3 6 |
The 2×3 matrix became a 3×2 matrix.
Problem: Given A = | 2 1 | and B = | 0 3 |, find 2A - B
| 3 4 | | 1 2 |
Step 1: Compute 2A
2A = | 4 2 |
| 6 8 |
Step 2: Subtract B
2A - B = | 4-0 2-3 | = | 4 -1 |
| 6-1 8-2 | | 5 6 |
One of the most powerful uses of matrices is representing and solving systems of linear equations. Instead of writing out multiple equations, you pack everything into a single matrix equation.
System of equations:
2x + 3y = 7
4x - y = 1
Matrix form Ax = b:
| 2 3 | | x | | 7 |
| 4 -1 | * | y | = | 1 |
A = coefficient matrix, x = unknowns vector, b = constants vector
This isn't just a notation trick. Phrasing problems as Ax = b lets you use matrix operations to solve them -- and computers are incredibly fast at matrix operations. This is how physics simulators solve thousands of equations simultaneously, and how machine learning models find optimal parameters.
Gaussian elimination is a systematic method for solving systems of equations by transforming the matrix into a simpler form. You create an augmented matrix (A with b appended) and perform row operations until the solution is clear.
The three allowed row operations are:
1. Swap two rows.
2. Multiply a row by a nonzero constant.
3. Add a multiple of one row to another row.
Problem: Solve the system
x + y + z = 6
2x + 3y + z = 14
x + y + 2z = 9
Step 1: Write the augmented matrix
| 1 1 1 | 6 |
| 2 3 1 | 14 |
| 1 1 2 | 9 |
Step 2: R2 = R2 - 2*R1 (eliminate x from row 2)
| 1 1 1 | 6 |
| 0 1 -1 | 2 |
| 1 1 2 | 9 |
Step 3: R3 = R3 - R1 (eliminate x from row 3)
| 1 1 1 | 6 |
| 0 1 -1 | 2 |
| 0 0 1 | 3 |
Step 4: Back-substitute from the bottom
Row 3: z = 3
Row 2: y - z = 2 → y = 2 + 3 = 5
Row 1: x + y + z = 6 → x = 6 - 5 - 3 = -2
Solution: x = -2, y = 5, z = 3
Verify: (-2) + 5 + 3 = 6, 2(-2) + 3(5) + 3 = 14, (-2) + 5 + 2(3) = 9
The goal of Gaussian elimination is to reach row echelon form, where:
1. Any all-zero rows are at the bottom.
2. Each row's leading (first nonzero) entry is to the right of the leading entry in the row above.
3. All entries below a leading entry are zero.
In practice, you rarely hand-compute Gaussian elimination. Libraries like NumPy handle it. But understanding the process is essential for debugging numerical issues, understanding computational complexity (it's O(n³)), and knowing when a system has no solution or infinite solutions.
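For reference, here is elimination plus back-substitution as a short function, with partial pivoting for numerical stability (solve_gauss is an illustrative name):

```python
def solve_gauss(A, b):
    # Gaussian elimination with partial pivoting, then back-substitution.
    # A is a list of rows, b the constants; returns the solution vector.
    n = len(A)
    M = [row[:] + [bv] for row, bv in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Pivot: move the row with the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from all rows below.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitute from the bottom row up.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# The worked example from above:
print(solve_gauss([[1, 1, 1], [2, 3, 1], [1, 1, 2]], [6, 14, 9]))
# [-2.0, 5.0, 3.0]
```

np.linalg.solve does the same job far more robustly; this is just the algorithm laid bare.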
The determinant is a single number computed from a square matrix that tells you important things about the matrix. Think of it as a measure of how much the matrix "stretches" or "squishes" space. For a 2×2 matrix with rows [a, b] and [c, d], the formula is det = ad - bc.
Problem: Find det(A) where A = | 3 2 |
| 1 4 |
det(A) = (3)(4) - (2)(1)
det(A) = 12 - 2
det(A) = 10
Geometrically, the determinant represents the scaling factor for area (2D) or volume (3D) when you apply the matrix as a transformation.
Scaling by 2 in x and 3 in y:
A = | 2 0 |
| 0 3 |
det(A) = (2)(3) - (0)(0) = 6
A 1×1 unit square becomes a 2×3 rectangle. Area goes from 1 to 6. The determinant is 6.
A = | 2 4 |
| 1 2 |
det(A) = (2)(2) - (4)(1) = 4 - 4 = 0
This matrix has no inverse. Row 1 is just 2 times row 2 -- the rows are "linearly dependent." The transformation collapses 2D space into a line.
For larger matrices, you expand along a row or column (cofactor expansion). For a 3×3 matrix:
A = | 1 2 3 |
| 4 5 6 |
| 7 8 0 |
det(A) = 1(5*0 - 6*8) - 2(4*0 - 6*7) + 3(4*8 - 5*7)
= 1(0 - 48) - 2(0 - 42) + 3(32 - 35)
= -48 + 84 + (-9)
= 27
Computing determinants by cofactor expansion gets extremely expensive for large matrices -- it's O(n!) in the naive approach. Real libraries use LU decomposition (based on Gaussian elimination) which is O(n³). You should understand what determinants mean, but let NumPy compute them.
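For completeness, the cofactor expansion fits in a short recursive function (det here is our own teaching helper, shadowing nothing from NumPy):

```python
def det(M):
    # Cofactor expansion along the first row. O(n!) -- teaching only;
    # use np.linalg.det (LU-based, O(n^3)) for real work.
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

print(det([[3, 2], [1, 4]]))                   # 10
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 0]]))  # 27
```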
The inverse of a matrix A, written A⁻¹, is the matrix that "undoes" A. When you multiply A by its inverse, you get the identity matrix: A·A⁻¹ = A⁻¹·A = I.
Think of it like division for matrices. If multiplication by A transforms space in some way, multiplication by A⁻¹ reverses that transformation exactly.
A matrix has an inverse if and only if its determinant is not zero. If det(A) = 0, the matrix is called singular and has no inverse. This makes intuitive sense: if A collapses a dimension (det = 0), you can't recover the lost information.
Problem: Find A⁻¹ where A = | 4 7 |
| 2 6 |
Step 1: det(A) = (4)(6) - (7)(2) = 24 - 14 = 10
Step 2: Apply the 2×2 inverse formula: for A = | a b |, A⁻¹ = (1/det(A)) * | d -b |
                                               | c d |                     | -c  a |
A⁻¹ = (1/10) * | 6 -7 |
               | -2 4 |
A⁻¹ = | 0.6 -0.7 |
      | -0.2 0.4 |
Verify: A * A⁻¹ should equal I
| 4 7 | * | 0.6 -0.7 | = | 4(0.6)+7(-0.2) 4(-0.7)+7(0.4) |
| 2 6 | | -0.2 0.4 | | 2(0.6)+6(-0.2) 2(-0.7)+6(0.4) |
= | 2.4-1.4 -2.8+2.8 | = | 1 0 |
| 1.2-1.2 -1.4+2.4 | | 0 1 |
If you have the equation Ax = b and you know A⁻¹, you can solve for x directly:
Problem: Solve using the inverse from above
4x + 7y = 5
2x + 6y = 4
x = A⁻¹b = | 0.6 -0.7 | * | 5 |
| -0.2 0.4 | | 4 |
x = | 0.6(5) + (-0.7)(4) | = | 3 - 2.8 | = | 0.2 |
| -0.2(5) + 0.4(4) | | -1 + 1.6 | | 0.6 |
Solution: x = 0.2, y = 0.6
In practice, solving Ax = b by computing A⁻¹ explicitly is inefficient and numerically unstable. Real code uses LU decomposition or other factorization methods. In NumPy, use np.linalg.solve(A, b) instead of np.linalg.inv(A) @ b. But the inverse concept is still essential for understanding the theory.
This is the concept that makes most students panic, but the idea is surprisingly simple. When you multiply most vectors by a matrix, they change both direction and magnitude. But some special vectors only get scaled -- they keep pointing in the same direction. These are eigenvectors, and the scaling factor is the eigenvalue.
The defining equation is Av = λv, where A is a square matrix, v is a nonzero vector (the eigenvector), and λ is a scalar (the eigenvalue). In words: "When I apply transformation A to vector v, I get back the same vector v, just scaled by λ."
Starting from Av = λv, we rearrange: Av - λv = 0, which factors to (A - λI)v = 0. A nonzero solution v exists only when A - λI is singular, i.e. when det(A - λI) = 0. Solving this characteristic equation gives the eigenvalues.
Problem: Find eigenvalues and eigenvectors of
A = | 4 1 |
| 2 3 |
Step 1: Set up A - λI
A - λI = | 4-λ 1 |
| 2 3-λ |
Step 2: Set determinant to zero
det(A - λI) = (4-λ)(3-λ) - (1)(2) = 0
12 - 4λ - 3λ + λ² - 2 = 0
λ² - 7λ + 10 = 0
(λ - 5)(λ - 2) = 0
Eigenvalues: λ1 = 5, λ2 = 2
Step 3: Find eigenvectors for each λ
For λ1 = 5: Solve (A - 5I)v = 0
| -1 1 | * | v1 | = | 0 |
| 2 -2 | | v2 | | 0 |
-v1 + v2 = 0 → v2 = v1
Eigenvector: v1 = [1, 1] (or any scalar multiple)
For λ2 = 2: Solve (A - 2I)v = 0
| 2 1 | * | v1 | = | 0 |
| 2 1 | | v2 | | 0 |
2v1 + v2 = 0 → v2 = -2v1
Eigenvector: v2 = [1, -2] (or any scalar multiple)
Verification for λ1 = 5, v = [1, 1]:
Av = | 4 1 | * | 1 | = | 5 | = 5 * | 1 | = λv
| 2 3 | | 1 | | 5 | | 1 |
| Application | How Eigenvalues/Eigenvectors Are Used |
|---|---|
| Google PageRank | Web pages are nodes in a giant matrix. The principal eigenvector of the link matrix gives page importance rankings. |
| PCA (Data Science) | Eigenvectors of the covariance matrix point in the directions of greatest variance. This lets you reduce dimensionality while keeping the most important patterns. |
| Stability Analysis | If all eigenvalues of a system's matrix have magnitude < 1, the system is stable. Used in control systems and simulation. |
| Image Compression | SVD (closely related to eigendecomposition) lets you approximate images using only the most significant components. |
| Graph Algorithms | Eigenvalues of the adjacency matrix reveal graph properties like connectivity and clustering structure (spectral graph theory). |
Think of eigenvectors as the "natural axes" of a transformation. A matrix might do complicated things to most vectors, but along its eigenvectors, the action is simple: just stretching. Eigenvalues tell you how much stretching happens along each axis. This is why they simplify so many problems -- they reveal the matrix's fundamental behavior.
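Power iteration -- the idea behind PageRank -- finds the dominant eigenvector by just applying A over and over. A minimal sketch using the matrix from the worked example (power_iteration is an illustrative name):

```python
import numpy as np

def power_iteration(A, steps=100):
    # Repeatedly apply A and renormalize; the vector converges toward
    # the eigenvector with the largest-magnitude eigenvalue.
    v = np.ones(A.shape[0])
    for _ in range(steps):
        v = A @ v
        v = v / np.linalg.norm(v)
    eigenvalue = v @ A @ v  # Rayleigh quotient (v is unit length)
    return eigenvalue, v

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lam, v = power_iteration(A)
print(round(lam, 6))  # 5.0 -- the dominant eigenvalue
print(v)              # proportional to [1, 1]
```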
Every matrix represents a linear transformation -- a function that takes vectors in, moves/stretches/rotates them, and outputs new vectors. When you multiply a vector by a matrix, you're applying that transformation. This is the connection between abstract matrix math and real visual/spatial effects.
| Transformation | Matrix (rows separated by ;) | Effect |
|---|---|---|
| Rotation by θ | [cosθ, -sinθ; sinθ, cosθ] | Rotates all points by angle θ around the origin |
| Scaling | [sx, 0; 0, sy] | Stretches x by sx and y by sy |
| Reflection (x-axis) | [1, 0; 0, -1] | Flips vertically (negates y) |
| Reflection (y-axis) | [-1, 0; 0, 1] | Flips horizontally (negates x) |
| Shear (horizontal) | [1, k; 0, 1] | Slants by factor k along the x-axis |
Problem: Rotate the point [3, 1] by 90 degrees counterclockwise
θ = 90 degrees, so cos(90) = 0, sin(90) = 1
R = | 0 -1 |
| 1 0 |
R * [3, 1] = | 0*3 + (-1)*1 | = | -1 |
| 1*3 + 0*1 | | 3 |
The point [3, 1] rotated to [-1, 3]. You can verify this is correct: the distance from origin is preserved (√10 in both cases), and the angle increased by 90 degrees.
Problem: Scale point [2, 3] by 2x horizontally and 0.5x vertically
S = | 2 0 |
| 0 0.5 |
S * [2, 3] = | 2*2 + 0*3 | = | 4 |
| 0*2 + 0.5*3 | | 1.5 |
The x-coordinate doubled and the y-coordinate halved.
Here's the elegant part: applying two transformations in sequence is the same as multiplying their matrices together. If you want to first scale, then rotate, you compute R * S and use the resulting matrix.
When composing transformations, read right to left. In T = R * S, the vector first gets multiplied by S (scaling), then the result gets multiplied by R (rotation). The rightmost matrix acts first. This trips up everyone at first.
In 3D graphics, objects go through a chain of transformations: Model (position in world) → View (camera perspective) → Projection (3D to 2D screen). The MVP matrix (Model-View-Projection) is the product of all three. GPUs compute millions of these matrix multiplications per frame -- it's what they're built for.
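A quick 2D demonstration that composition order matters, using a 90-degree rotation R and an x-axis scale S:

```python
import numpy as np

theta = np.radians(90)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rotate 90 degrees CCW
S = np.array([[2.0, 0.0],
              [0.0, 1.0]])                       # scale x by 2

p = np.array([1.0, 0.0])

# T = R @ S: S acts first (scale), then R (rotate).
print(np.round(R @ S @ p, 6))  # [0. 2.] -- scaled to (2,0), then rotated up
# T = S @ R: R acts first (rotate), then S (scale).
print(np.round(S @ R @ p, 6))  # [0. 1.] -- rotated to (0,1); x-scale changes nothing
```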
A neural network layer is fundamentally a matrix multiplication followed by an activation function. The weights between layers form a matrix. The forward pass (computing the output) is just repeated matrix-vector multiplication.
Training adjusts the weights using gradients (vectors of partial derivatives), and the gradient computation involves matrix transposes and chain-rule multiplications. The entire deep learning stack is linear algebra plus calculus.
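A toy forward pass makes the "matrix multiply plus activation" structure concrete. The weights below are made up for illustration, not trained values:

```python
import numpy as np

def relu(x):
    # Common activation function: zero out negative values.
    return np.maximum(0, x)

W1 = np.array([[0.5, -0.2],
               [0.1,  0.8]])  # first layer: 2 inputs -> 2 hidden units
W2 = np.array([[1.0, -1.0]])  # second layer: 2 hidden -> 1 output

x = np.array([1.0, 2.0])
hidden = relu(W1 @ x)         # matrix-vector multiply, then activation
output = W2 @ hidden
print(np.round(output, 6))    # [-1.6]
```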
A grayscale image is literally a matrix where each entry is a pixel brightness (0-255). Color images are three matrices stacked (R, G, B channels). Convolution -- the core operation in image filters and CNNs -- slides a small matrix (the kernel) across the image and computes dot products at each position.
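A naive "valid" convolution sketch shows the sliding-dot-product idea (real libraries are heavily optimized, and ML frameworks typically skip the kernel flip, so strictly speaking this is cross-correlation):

```python
import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel over the image; at each position, take the sum of
    # element-wise products (a dot product between the window and kernel).
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype=float)
blur = np.ones((2, 2)) / 4      # 2x2 averaging (blur) kernel
print(convolve2d(image, blur))  # [[3. 4.] [6. 7.]]
```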
Netflix, Spotify, and Amazon use matrix factorization for recommendations. You have a giant matrix of users × items, mostly empty (users have only rated a few things). The trick: decompose this into two smaller matrices -- one capturing user preferences, one capturing item features. The product approximates the full matrix, filling in the blanks with predictions.
Every vertex in a 3D scene goes through a chain of 4×4 matrix transformations. Using 4×4 matrices (instead of 3×3) through a technique called homogeneous coordinates lets you represent translation as a matrix multiplication too.
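Homogeneous coordinates work the same way in 2D with 3×3 matrices: append a 1 to each point, and translation becomes a matrix multiply like everything else (translation is an illustrative helper name):

```python
import numpy as np

def translation(tx, ty):
    # 2D translation as a 3x3 homogeneous matrix.
    return np.array([[1, 0, tx],
                     [0, 1, ty],
                     [0, 0,  1]], dtype=float)

p = np.array([3.0, 1.0, 1.0])  # point (3, 1) with homogeneous 1 appended
moved = translation(5, -2) @ p
print(moved[:2])               # [ 8. -1.]
```

3D graphics does exactly this one dimension up, which is why transform matrices are 4×4.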
```python
import numpy as np

# --- Vectors ---
a = np.array([2, 3, -1])
b = np.array([4, -1, 5])
print("Addition:", a + b)               # [6 2 4]
print("Dot product:", np.dot(a, b))     # 0 (perpendicular!)
print("Magnitude:", np.linalg.norm(a))  # 3.742

# --- Matrices ---
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
print("Matrix multiply:", A @ B)         # @ is matrix multiply
print("Transpose:", A.T)
print("Determinant:", np.linalg.det(A))  # -2.0
print("Inverse:", np.linalg.inv(A))

# --- Solving Ax = b ---
A = np.array([[2, 3],
              [4, -1]])
b = np.array([7, 1])
x = np.linalg.solve(A, b)  # Better than inv(A) @ b
print("Solution:", x)      # [0.714 1.857] (approximately)

# --- Eigenvalues ---
A = np.array([[4, 1],
              [2, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Eigenvalues:", eigenvalues)    # [5. 2.]
print("Eigenvectors:", eigenvectors)  # columns are eigenvectors

# --- 2D Rotation ---
import math
theta = math.radians(90)
R = np.array([[math.cos(theta), -math.sin(theta)],
              [math.sin(theta),  math.cos(theta)]])
point = np.array([3, 1])
print("Rotated:", R @ point)  # [-1. 3.] (up to floating-point noise)
```
NumPy's @ operator is matrix multiplication (same as np.matmul). The * operator on arrays is element-wise multiplication, which is NOT the same thing. This distinction trips up beginners constantly. Use @ for matrix math, * for element-wise operations.
Test your understanding of the key linear algebra concepts.