Covariance
What is Covariance?
Covariance is a measure of how two variables vary together; it captures the direction of their linear relationship. It is given by:
\[\text{Cov}(X,Y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) = E[(X-E(X))(Y-E(Y))]\]where \(\bar{x}\) and \(\bar{y}\) are the means of \(X\) and \(Y\) respectively. The covariance is positive if \(X\) and \(Y\) are positively correlated, negative if \(X\) and \(Y\) are negatively correlated, and zero if \(X\) and \(Y\) are uncorrelated.
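As a quick sanity check, the definition above can be evaluated directly with NumPy (the data below is made up purely for illustration):

```python
import numpy as np

# Two small hypothetical samples, for illustration only.
x = np.array([2.1, 2.5, 3.6, 4.0])
y = np.array([8.0, 10.0, 12.0, 14.0])

# Population covariance straight from the definition:
# Cov(X, Y) = (1/N) * sum_i (x_i - mean(x)) * (y_i - mean(y))
N = len(x)
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / N

# np.cov with bias=True uses the same 1/N normalization.
cov_np = np.cov(x, y, bias=True)[0, 1]

print(cov_xy)           # 1.7 — positive: x and y move together
print(cov_np)           # 1.7 — agrees with np.cov
```

The value is positive because larger values of `x` tend to occur together with larger values of `y`, matching the sign interpretation above.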
For the sample covariance, the denominator is \(N-1\) instead of \(N\) (Bessel's correction). Because the sample mean \(\bar{x}\) is computed from the same data, the deviations \((x_i - \bar{x})\) are on average slightly smaller than the deviations from the true population mean; dividing by \(N-1\) compensates for this and makes the estimator unbiased.
\[\text{Cov}(X,Y) = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})\]The covariance between a variable and itself is the variance of that variable.
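The two normalizations, and the covariance-equals-variance identity, can be compared numerically; note that NumPy's `np.cov` uses the \(N-1\) (sample) convention by default. The data here is again hypothetical:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

n = len(x)
dx, dy = x - x.mean(), y - y.mean()

pop_cov = np.sum(dx * dy) / n          # divide by N   (population)
samp_cov = np.sum(dx * dy) / (n - 1)   # divide by N-1 (sample, unbiased)

print(pop_cov, samp_cov)               # 1.2 1.5
print(np.cov(x, y)[0, 1])              # 1.5 — np.cov defaults to N-1

# Covariance of a variable with itself is its variance
# (np.var defaults to the 1/N normalization).
print(np.sum(dx * dx) / n, np.var(x))  # 2.0 2.0
```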
\[\text{Cov}(X,X) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2 = \sigma_X^2 = E[(X-E(X))^2] = E(X^2) - [E(X)]^2 = \text{Var}(X)\]
Correlation
Correlation is a measure of the strength of the linear relationship between two variables. It is given by:
\[\text{Corr}(X,Y) = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{\text{Cov}(X,Y)}{\sqrt{\text{Cov}(X,X)} \sqrt{\text{Cov}(Y,Y)}}\]where \(\sigma_X\) and \(\sigma_Y\) are the standard deviations of \(X\) and \(Y\) respectively. Unlike covariance, correlation is scale-free and always lies in \([-1, 1]\): \(+1\) indicates a perfect positive linear relationship, \(-1\) a perfect negative one, and \(0\) no linear relationship.
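A small numerical sketch of the formula, checked against NumPy's built-in `np.corrcoef` (the data is invented for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0                         # exactly linear in x
z = np.array([2.0, 1.0, 4.0, 3.0, 5.0])  # only loosely related to x

def corr(a, b):
    # Corr(a, b) = Cov(a, b) / (sigma_a * sigma_b); the 1/N factors
    # in numerator and denominator cancel, so they are omitted.
    da, db = a - a.mean(), b - b.mean()
    return np.sum(da * db) / np.sqrt(np.sum(da**2) * np.sum(db**2))

print(corr(x, y))                # 1.0 — perfect positive linear relationship
print(corr(x, z))                # between -1 and 1
print(np.corrcoef(x, z)[0, 1])   # matches corr(x, z)
```

Note that because the normalization factors cancel, the correlation is the same whether the population or sample convention is used for the covariances.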
Covariance Matrix
Sometimes the variable \(X\) is not a scalar but a vector of values \(X = (X_1, X_2, \dots, X_n)\). In this case, the covariance matrix describes the covariances between the different components of \(X\). The covariance matrix is given by:
\[\text{Cov}(X) = \begin{bmatrix} \text{Cov}(X_1, X_1) & \text{Cov}(X_1, X_2) & \dots & \text{Cov}(X_1, X_n) \\ \text{Cov}(X_2, X_1) & \text{Cov}(X_2, X_2) & \dots & \text{Cov}(X_2, X_n) \\ \vdots & \vdots & \ddots & \vdots \\ \text{Cov}(X_n, X_1) & \text{Cov}(X_n, X_2) & \dots & \text{Cov}(X_n, X_n) \end{bmatrix}\]The covariance matrix is symmetric, and the diagonal elements are the variances of the corresponding dimensions of \(X\). Let \(\mu\) be the mean vector of \(X\), then the covariance matrix can be written as:
\[\text{Cov}(X) = \frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)(X_i - \mu)^T\]where \(X_i\) here denotes the \(i\)-th observed sample vector, not a component of \(X\).
Properties of Covariance
Linearity
\[\begin{aligned} \text{Cov}(X+Y,Z) &= \frac{1}{N} \sum_{i=1}^{N} [(x_i + y_i) - (\bar{x} + \bar{y})] (z_i - \bar{z}) \\ &= \frac{1}{N} \sum_{i=1}^{N} [(x_i - \bar{x}) + (y_i - \bar{y})] (z_i - \bar{z}) \\ &= \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(z_i - \bar{z}) + \frac{1}{N} \sum_{i=1}^{N} (y_i - \bar{y})(z_i - \bar{z}) \\ &= \text{Cov}(X,Z) + \text{Cov}(Y,Z) \end{aligned}\] \[\begin{aligned} \text{Cov}(aX,Y) &= \frac{1}{N} \sum_{i=1}^{N} [a(x_i - \bar{x})] (y_i - \bar{y}) \\ &= a \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) \\ &= a \, \text{Cov}(X,Y) \end{aligned}\]
Positive semi-definiteness
The covariance matrix \(\Sigma\) is positive semi-definite. This means that for any vector \(v\), the following inequality holds:
\[v^T \Sigma v \geq 0\]This can be shown as follows:
\[\begin{aligned} v^T \Sigma v &= v^T \mathbb{E}[(X-\mu)(X-\mu)^T] v \\ &= \mathbb{E}[v^T(X-\mu)(X-\mu)^T v] \\ &= \mathbb{E}[((X-\mu)^T v)^T ((X-\mu)^T v)] \\ &= \mathbb{E}[\vert\vert (X-\mu)^T v \vert\vert^2] \geq 0 \end{aligned}\]
Diagonalization
Since the covariance matrix is symmetric, it can be diagonalized by an orthogonal matrix. This means that there exists a matrix \(P\) with \(P^T P = I\) such that:
\[P^T \Sigma P = D\]where \(D\) is a diagonal matrix. The diagonal elements of \(D\) are the eigenvalues of \(\Sigma\), and the columns of \(P\) are the corresponding eigenvectors. Because \(\Sigma\) is positive semi-definite, all of these eigenvalues are non-negative.
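The matrix formula, symmetry, positive semi-definiteness, and diagonalization can all be checked numerically in one short sketch; the random data below is purely illustrative, and `np.linalg.eigh` (NumPy's eigensolver for symmetric matrices) plays the role of finding \(P\) and \(D\):

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 samples of a hypothetical 3-dimensional variable X.
X = rng.normal(size=(200, 3))
X[:, 1] += 0.5 * X[:, 0]      # introduce some correlation between components

mu = X.mean(axis=0)
N = X.shape[0]
# Cov(X) = (1/N) * sum_i (X_i - mu)(X_i - mu)^T, as an outer-product sum.
Sigma = (X - mu).T @ (X - mu) / N

# Symmetry: Sigma == Sigma^T.
print(np.allclose(Sigma, Sigma.T))        # True

# Positive semi-definiteness: v^T Sigma v >= 0 for an arbitrary v.
v = rng.normal(size=3)
print(v @ Sigma @ v >= 0)                 # True

# Diagonalization: eigh returns eigenvalues and an orthogonal matrix
# of eigenvectors, so P^T Sigma P is diagonal.
eigvals, P = np.linalg.eigh(Sigma)
print(np.allclose(P.T @ Sigma @ P, np.diag(eigvals)))  # True
print(np.all(eigvals >= 0))               # True: non-negative eigenvalues
```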