Mathematics is at the core of many algorithms and the intuition behind them. The following are the main topics used in computer science, with a slight bias toward machine learning fundamentals.
Linear Algebra
Linear algebra is the study of vectors and matrices and their manipulations.
- Vectors are elements of a vector space: a set closed under well-defined addition and scalar multiplication operations.
- A Matrix is a rectangular array of entries; important properties of a matrix include its Determinant and Trace as well as its Eigenvalues.
Using vectors and matrices, we can solve Systems of Linear Equations and model Linear Mappings.
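As a minimal NumPy sketch (the matrix `A` and vector `b` are arbitrary examples), here is how the matrix properties above are computed and a small linear system is solved:

```python
import numpy as np

# An arbitrary example matrix and right-hand side.
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

print(np.linalg.det(A))      # determinant: 4*3 - 2*1 = 10
print(np.trace(A))           # trace: 4 + 3 = 7
print(np.linalg.eigvals(A))  # eigenvalues: 5 and 2

# Solve the linear system A @ x = b.
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))  # True
```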
Sometimes, a problem requires a matrix to be factorized; the sketch after this list demonstrates all three decompositions.
- Cholesky Decomposition factors a symmetric positive definite matrix into the product of a lower-triangular matrix and its transpose.
- Eigendecomposition shows that a non-defective square matrix is similar to a diagonal matrix.
- Singular Value Decomposition factors any matrix into the product of two orthogonal matrices and one diagonal matrix containing its singular values.
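Here is a hedged NumPy sketch of the three factorizations; the matrix is a randomly generated symmetric positive definite example so that all three apply:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a symmetric positive definite matrix so Cholesky applies.
M = rng.standard_normal((3, 3))
A = M @ M.T + 3.0 * np.eye(3)

# Cholesky: A = L @ L.T with L lower triangular.
L = np.linalg.cholesky(A)
assert np.allclose(L @ L.T, A)

# Eigendecomposition: A = Q @ diag(w) @ Q.T (eigh handles symmetric matrices).
w, Q = np.linalg.eigh(A)
assert np.allclose(Q @ np.diag(w) @ Q.T, A)

# SVD: A = U @ diag(s) @ Vt with orthogonal U and Vt.
U, s, Vt = np.linalg.svd(A)
assert np.allclose(U @ np.diag(s) @ Vt, A)
```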
Geometry
Geometry is closely tied with vectors and matrices from linear algebra, and we can use it to interpret them from a new point of view.
- Inner Products map pairs of vectors to a number, and Norms measure the size of vectors and matrices.
- With them, we can compute Distances and Angles between vectors, as well as Projections of vectors onto subspaces (all demonstrated in the sketch after this list).
- Rotations can also be defined by matrices as linear mappings.
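A small NumPy sketch of these quantities, using two arbitrary example vectors:

```python
import numpy as np

u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])

inner = u @ v                 # inner (dot) product
norm_u = np.linalg.norm(u)    # Euclidean norm: 5.0
dist = np.linalg.norm(u - v)  # distance between u and v
angle = np.arccos(inner / (norm_u * np.linalg.norm(v)))  # angle in radians

# Projection of u onto the subspace spanned by v.
proj = (inner / (v @ v)) * v
print(inner, norm_u, dist, angle, proj)
```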
Calculus
Calculus describes the shape of functions in detail and is crucial for optimization and approximations.
- Derivatives find the tangent slope of univariate functions, and Gradients generalize them to multivariate functions.
- The Taylor Series is an important method for approximating any sufficiently differentiable function as a polynomial (sketched after this list).
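As an illustration, this sketch approximates e^x around 0 with truncated Taylor polynomials; the function choice, the hypothetical helper `taylor_exp`, and the number of terms are arbitrary:

```python
from math import exp, factorial

def taylor_exp(x: float, n_terms: int) -> float:
    """Taylor polynomial of e^x around 0: the sum of x**k / k! for k < n_terms."""
    return sum(x**k / factorial(k) for k in range(n_terms))

x = 1.5
print(taylor_exp(x, 4), taylor_exp(x, 8), exp(x))
# Each added term brings the polynomial closer to the true value.
```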
Optimization
Using calculus, we can solve general minimization problems.
- Unconstrained Optimization uses gradient descent to walk down convex objectives toward their minimum (sketched after this list).
- Constrained Optimization uses Lagrange multipliers to represent a problem in its dual form.
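A minimal gradient descent sketch on an assumed convex quadratic objective; the matrix, step size, and iteration count are illustrative choices:

```python
import numpy as np

# Convex quadratic f(x) = 0.5 * x^T A x - b^T x with gradient A x - b;
# its minimizer solves the linear system A x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

x = np.zeros(2)
lr = 0.1  # step size, assumed small enough for convergence
for _ in range(500):
    x -= lr * (A @ x - b)

print(x, np.linalg.solve(A, b))  # gradient descent matches the direct solve
```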
Probability Theory
Probability theory measures the likelihood of events and outcomes.
Random Variables take quantitative values whose behavior is governed by Probability Distributions.
- Independence between two variables is an important property that many further results rely on.
- Summary Statistics describe important properties of a random variable's distribution (both are simulated in the sketch after this list).
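A quick simulation sketch, assuming two independent dice rolls as example random variables:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent dice rolls as random variables over a discrete distribution.
x = rng.integers(1, 7, size=100_000)
y = rng.integers(1, 7, size=100_000)

print(x.mean(), x.var())   # summary statistics: roughly 3.5 and 35/12
print(np.cov(x, y)[0, 1])  # near 0: independence implies zero covariance
```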
Among all classes of probability distributions, the Exponential Family is commonly used due to its simplicity and computational properties. The most common member is the Gaussian.
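A small sketch of the Gaussian with an assumed mean of 2.0 and standard deviation of 0.5, checking its density formula against sampled data (the helper `gaussian_pdf` is written out for illustration):

```python
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of the Gaussian N(mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=0.5, size=100_000)

print(samples.mean(), samples.std())         # close to the parameters 2.0 and 0.5
print(gaussian_pdf(2.0, mu=2.0, sigma=0.5))  # peak density at the mean
```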
Generalizing beyond any single family, several crucial theorems and conclusions apply to all distributions.
- Bayes' Theorem relates the posterior, prior, likelihood, and evidence (a worked example follows this list).
- Jensen's Inequality bounds the effect of applying a convex function to a random variable.
- Evidence Lower Bound provides a tractable lower bound on an intractable likelihood.
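A worked example of Bayes' Theorem, using made-up numbers for a diagnostic test:

```python
# posterior = likelihood * prior / evidence (all numbers are illustrative)
prior = 0.01                # P(disease)
sensitivity = 0.95          # P(positive | disease)
false_positive_rate = 0.05  # P(positive | no disease)

evidence = sensitivity * prior + false_positive_rate * (1 - prior)
posterior = sensitivity * prior / evidence
print(posterior)  # ~0.16: a positive test is weaker evidence than intuition suggests
```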
Lastly, information theory is a subfield that analyzes the entropy of distributions.
- Entropy measures the level of uncertainty in a distribution.
- Cross Entropy generalizes entropy to a measure across two distributions.
- Information Gain measures the change in entropy after gaining new “information.”
- Mutual Information measures the degree of shared “information” between two variables.
- KL Divergence and, more generally, F-Divergence measure the difference between two distributions (the sketch after this list relates them to entropy).
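A closing sketch over two arbitrary discrete distributions, verifying the identity H(p, q) = H(p) + KL(p || q):

```python
import numpy as np

p = np.array([0.5, 0.25, 0.25])  # two arbitrary discrete distributions
q = np.array([0.4, 0.4, 0.2])

entropy = -np.sum(p * np.log(p))        # H(p)
cross_entropy = -np.sum(p * np.log(q))  # H(p, q)
kl = np.sum(p * np.log(p / q))          # KL(p || q)

print(entropy, cross_entropy, kl)
print(np.isclose(cross_entropy, entropy + kl))  # True
```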