# Low-Rank Decomposition
Parent: [[Low-Rank Decomposition & Matrix Factorisation]]
The idea that a large weight matrix W can be approximated as the product of two smaller matrices, W ≈ AB, where A is d×r, B is r×d, and the inner dimension r is much smaller than the original dimensions. Instead of storing and multiplying a d×d matrix with d² parameters, you store d×r + r×d = 2dr parameters. If r is 10% of d, you have just compressed the layer by 5x (2r/d = 0.2).
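The arithmetic in one line (sizes are illustrative, not from any particular model):

```python
d = 1024
r = 102  # roughly 10% of d

full = d * d          # parameters in the original d×d matrix
factored = 2 * d * r  # parameters in A (d×r) plus B (r×d)

print(full / factored)  # compression ratio, ≈ 5x
```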
The approximation is only useful when the matrix has low effective rank — when its information content genuinely lies in a smaller subspace than its nominal size suggests. In practice, most neural network weight matrices do. They were trained by gradient descent on finite data, and the directions of variation they actually need are far fewer than the directions they could in principle represent.
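A minimal sketch of what "low effective rank" looks like, using a synthetic matrix whose information genuinely lives in a 16-dimensional subspace plus a little noise (the sizes and noise scale here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, true_rank = 256, 16

# Nominally d×d, but built from a rank-16 product plus small noise
W = rng.normal(size=(d, true_rank)) @ rng.normal(size=(true_rank, d))
W += 0.01 * rng.normal(size=(d, d))

s = np.linalg.svd(W, compute_uv=False)

# Fraction of the spectrum's energy captured by the top-16 singular values
energy = (s[:true_rank] ** 2).sum() / (s ** 2).sum()
print(energy)  # very close to 1.0: effective rank ≪ nominal size
```

Real weight matrices are messier than this, but the diagnostic is the same: plot the singular values and see how fast they decay.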
Decomposition can be applied in three places: at initialisation (bake the constraint in before any training), during training (learn the factors directly), or post-hoc on a trained model (SVD the weights, keep the top-r components, retrain a bit to recover). Post-hoc is the easiest, training-time is the most powerful, and init-time is the approach LoRA takes for its adapter matrices — the update is constrained to rank r from the start, and the factors themselves are then learned.
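The post-hoc route can be sketched in a few lines of numpy. Folding the singular values into A is one common convention, not the only one:

```python
import numpy as np

def truncate(W, r):
    """Post-hoc low-rank decomposition: keep the top-r SVD components of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]  # d×r factor (singular values folded in)
    B = Vt[:r, :]         # r×d factor
    return A, B

rng = np.random.default_rng(1)
# A matrix with true rank 16; at r = 16 the factorisation is essentially exact
W = rng.normal(size=(64, 16)) @ rng.normal(size=(16, 64))
A, B = truncate(W, 16)

err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(err)  # ≈ 0 when r matches the true rank
```

On a real trained layer the error at any given r is nonzero, which is why the "retrain a bit to recover" step exists.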
The failure mode is picking a rank too low for the task. Accuracy drops cleanly for a while, then cliffs. Finding the inflection point is the whole game.
## Related
- [[Singular Value Decomposition (SVD)]]
- [[LoRA (Low-Rank Adaptation)]]
- [[Effective Rank]]
- [[Model Compression Fundamentals]]
---
Tags: #ai #compression #linalg #kp