# Singular Value Decomposition (SVD)
Parent: [[Low-Rank Decomposition & Matrix Factorisation]]
The canonical way to factor any matrix W into three pieces: W = UΣVᵀ, where U and V are orthogonal and Σ is (rectangular) diagonal with non-negative entries called singular values. The singular values are sorted in descending order, and they measure how much "energy" (squared Frobenius norm) each direction carries.
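A minimal NumPy sketch of the factorisation, using a small random matrix as a stand-in for a weight matrix:

```python
import numpy as np

# Hypothetical 6x4 "weight matrix", purely for illustration.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))

# full_matrices=False gives the thin SVD: U is 6x4, s has 4 entries, Vt is 4x4.
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Singular values come back sorted in descending order...
assert np.all(s[:-1] >= s[1:])

# ...and the three pieces reconstruct W (up to floating-point error).
assert np.allclose(W, U @ np.diag(s) @ Vt)
```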
The magic of SVD for compression: if you keep only the top r singular values (and the corresponding columns of U and rows of Vᵀ), you get the best possible rank-r approximation of W, in both the Frobenius and spectral norms. This is the Eckart–Young theorem, and it is the theoretical floor for any low-rank compression scheme. Every other decomposition — Tucker, CP, LoRA — is either a special case, a tensor generalisation, or a learned approximation of what SVD gives you exactly.
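The truncation step can be sketched as follows; a useful sanity check is that the Frobenius error of the rank-r approximation equals the root-sum-square of the discarded singular values:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Keep the top-r singular triplets: the best rank-r approximation (Eckart–Young).
r = 3
W_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Frobenius error = sqrt of the sum of squared discarded singular values.
err = np.linalg.norm(W - W_r, "fro")
assert np.isclose(err, np.sqrt(np.sum(s[r:] ** 2)))
```

Note the storage saving: an m×n matrix at rank r needs r(m + n) numbers instead of mn, a win whenever r < mn / (m + n).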
In neural network compression, SVD shows up in two places. First, as a diagnostic: compute the singular values of each weight matrix, plot them, and see where they decay. If the top 10% carry 95% of the energy, the layer is a good compression candidate. Second, as a direct method: replace W with its rank-r approximation and fine-tune briefly to recover any lost accuracy.
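The diagnostic amounts to a cumulative-energy curve over the singular values. A small sketch, with `energy_profile` as an illustrative helper name and a synthetic near-rank-2 matrix standing in for a real layer:

```python
import numpy as np

def energy_profile(W):
    """Fraction of total energy (squared Frobenius norm) captured by the top-k singular values."""
    s = np.linalg.svd(W, compute_uv=False)
    return np.cumsum(s ** 2) / np.sum(s ** 2)

# A deliberately compressible matrix: rank-2 signal plus small noise.
rng = np.random.default_rng(1)
signal = rng.standard_normal((64, 2)) @ rng.standard_normal((2, 64))
W = signal + 0.01 * rng.standard_normal((64, 64))

profile = energy_profile(W)
# The top 2 of 64 directions carry nearly all the energy here,
# so this layer would be an excellent compression candidate.
assert profile[1] > 0.99
```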
Full SVD of a d×d matrix is O(d³) and becomes expensive on very large matrices, which is why approximate and randomised SVD variants exist. For inference-time compression this matters less, since you only pay the cost once, offline.
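The randomised variant can be sketched in a few lines of NumPy (a simplified version of the Halko-style range-finder approach; the function name and oversampling parameter are illustrative): project W onto a random low-dimensional subspace, then run a cheap exact SVD there.

```python
import numpy as np

def randomized_svd(W, r, n_oversample=8, seed=0):
    """Sketch of randomised SVD: find an approximate basis Q for the top of
    W's column space, then SVD the small projected matrix Q.T @ W."""
    rng = np.random.default_rng(seed)
    m, n = W.shape
    k = min(r + n_oversample, n)
    # Random range finder: Q's columns approximately span W's top-k column space.
    Q, _ = np.linalg.qr(W @ rng.standard_normal((n, k)))
    # Exact SVD of the small k x n matrix, then lift U back up.
    Ub, s, Vt = np.linalg.svd(Q.T @ W, full_matrices=False)
    return (Q @ Ub)[:, :r], s[:r], Vt[:r, :]

# On an exactly rank-3 matrix, the approximation is essentially exact.
rng = np.random.default_rng(2)
W = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 150))
U, s, Vt = randomized_svd(W, r=3)
assert np.linalg.norm(W - U @ np.diag(s) @ Vt) / np.linalg.norm(W) < 1e-6
```

In practice a library routine such as scikit-learn's `randomized_svd` would be the usual choice; the point of the sketch is that the expensive exact SVD only ever runs on a k×n matrix with small k.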
## Related
- [[Low-Rank Decomposition]]
- [[Effective Rank]]
- [[LoRA (Low-Rank Adaptation)]]
---
Tags: #ai #linalg #compression #kp