Dimensionality Reduction via JL Lemma and Random Projection
Nowadays, dimensionality is a serious problem in data analysis: the huge datasets we encounter today are often very sparse and very high-dimensional. Data scientists have long used tools such as principal component analysis (PCA) and independent component analysis (ICA) to project high-dimensional data onto a subspace, but all those techniques rely on computing the eigenvectors of an $n \times n$ matrix, a very expensive operation (e.g., spectral decomposition) when the dimension $n$ is large. Moreover, even though the eigenspace has many important properties, it does not yield good approximations for many useful measures such as vector norms. Here we discuss another method, random projection, to reduce dimensionality.
In 1984, two mathematicians, William B. Johnson and Joram Lindenstrauss, introduced and proved the following lemma.
Johnson-Lindenstrauss lemma
For any $\epsilon \in (0,\frac{1}{2})$, $\forall x_1, x_2, \dots, x_d \in \mathbb{R}^{n}$, there exists a matrix $M \in \mathbb{R}^{m \times n}$ with $m = O(\frac{1}{\epsilon^2} \log{d})$ such that $\forall 1 \leq i,j \leq d$, we have
$$(1-\epsilon)\|x_i - x_j\|_2^2 \leq \|Mx_i - Mx_j\|_2^2 \leq (1+\epsilon)\|x_i - x_j\|_2^2.$$
Remark: This lemma states that for any set of vectors $x_1, \dots, x_d$ in $\mathbb{R}^n$, there exists a sketch matrix $M$ mapping $\mathbb{R}^n \rightarrow \mathbb{R}^m$ under which all pairwise Euclidean distances are preserved up to a factor of $(1 \pm \epsilon)$. The target dimension $m$ does not depend on the original dimension $n$; it depends only on the number of vectors $d$ (and on $\epsilon$).
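To get a feel for the numbers, the sketch below computes the target dimension $m$ for a million vectors. The constant $4$ is an illustrative choice hiding the $O(\cdot)$, not a tight bound.

```python
import math

# Target dimension from the JL lemma: m = O(log(d) / eps^2).
# The constant 4 is an illustrative stand-in for the hidden constant.
def jl_dimension(d: int, eps: float) -> int:
    return math.ceil(4 * math.log(d) / eps**2)

# One million vectors, 10% distortion: m depends only on d and eps,
# not on the original dimension n (which could be in the billions).
print(jl_dimension(10**6, 0.1))  # 5527
```

A few thousand dimensions suffice regardless of how large $n$ is.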
For a long time, no one could figure out how to obtain this sketch matrix.
Random Projection
Then, around 2003, researchers pointed out that this sketch matrix can be generated from the Gaussian distribution.
Consider the following matrix $A \in \mathbb{R}^{m \times n}$, where $A_{ij} \sim \mathcal{N}(0,1)$ and all $A_{ij}$ are independent. We claim that this matrix (after scaling) satisfies the statement of the JL lemma.
Proof. First observe that the sketch has an additional property: $\forall i$, $(Ax)_i = \sum_{j=1}^{n} A_{ij} x_j \sim \mathcal{N}(0, \|x\|_2^2)$, since a sum of independent Gaussians is Gaussian with variances adding. In other words, the Gaussian distribution is 2-stable. Then we can write $\|Ax\|_2^2 = \sum_{i=1}^{m} y_i^2$, where the $y_i \sim \mathcal{N}(0, \|x\|_2^2)$ are independent. That is to say, $\|Ax\|_2^2 / \|x\|_2^2$ follows a $\chi^2$ (chi-squared) distribution with $m$ degrees of freedom. From the tail bound of the $\chi^2$ distribution, we get
$$\Pr\left[\,\bigl|\|Ax\|_2^2 - m\|x\|_2^2\bigr| > \epsilon\, m \|x\|_2^2\,\right] \leq e^{-C\epsilon^2 m}$$
for a constant $C > 0$.
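This concentration is easy to check empirically. The NumPy sketch below (dimensions and seed chosen arbitrarily for illustration) verifies that $\frac{1}{m}\|Ax\|_2^2$ lands close to $\|x\|_2^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 500, 2000  # illustrative sizes

# A has i.i.d. N(0,1) entries; by 2-stability, each (Ax)_i ~ N(0, ||x||^2).
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)

# ||Ax||^2 / ||x||^2 is chi-squared with m degrees of freedom,
# so (1/m) * ||Ax||^2 concentrates around ||x||^2.
estimate = np.sum((A @ x) ** 2) / m
truth = np.sum(x ** 2)
print(abs(estimate - truth) / truth)  # relative error on the order of sqrt(2/m)
```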
Fix two indices $i, j$, let $y^{ij} = x_i - x_j$ and $M = \frac{1}{\sqrt{m}} A$, and set $m = \frac{4}{C \epsilon^2} \log{d}$ to get
$$\Pr\left[\,\bigl|\|My^{ij}\|_2^2 - \|y^{ij}\|_2^2\bigr| > \epsilon \|y^{ij}\|_2^2\,\right] \leq e^{-C\epsilon^2 m} = \frac{1}{d^4}.$$
Taking the union bound over all $\binom{d}{2} \leq d^2$ pairs, we obtain
$$\Pr\left[\,\exists\, i,j:\; \bigl|\|My^{ij}\|_2^2 - \|y^{ij}\|_2^2\bigr| > \epsilon \|y^{ij}\|_2^2\,\right] \leq d^2 \cdot \frac{1}{d^4} = \frac{1}{d^2}.$$
Thus, with probability at least $1 - \frac{1}{d^2}$, for all $1 \leq i,j \leq d$,
$$(1-\epsilon)\|x_i - x_j\|_2^2 \leq \|Mx_i - Mx_j\|_2^2 \leq (1+\epsilon)\|x_i - x_j\|_2^2,$$
which is exactly the guarantee of the Johnson-Lindenstrauss lemma. $\blacksquare$
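The whole construction fits in a few lines of NumPy. The sketch below (sizes and seed are illustrative choices) draws the Gaussian matrix, scales it by $1/\sqrt{m}$, and measures the worst relative distortion of squared pairwise distances over all pairs:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, d, m = 1000, 20, 1000  # original dimension, number of points, sketch dimension

X = rng.standard_normal((d, n))               # d points in R^n
M = rng.standard_normal((m, n)) / np.sqrt(m)  # sketch matrix M = A / sqrt(m)
Y = X @ M.T                                   # projected points in R^m

# Worst relative distortion of squared pairwise distances over all pairs.
worst = max(
    abs(np.sum((Y[i] - Y[j]) ** 2) / np.sum((X[i] - X[j]) ** 2) - 1.0)
    for i, j in combinations(range(d), 2)
)
print(worst)  # typically well below 0.3 for these parameters
```

Note that the projection itself never looks at the data: $M$ is drawn obliviously, which is what makes random projection so convenient in streaming settings.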
Application
In this or other forms, the JL lemma has been used for a wide variety of computational tasks, especially in streaming algorithms, such as:

- Computing a low-rank approximation to the original matrix $A$.

- Simplifying the computation of effective resistances in graph spectral sparsification.
References
- EPFL Topics in Theoretical Computer Science (Sublinear Algorithms for Big Data Analysis), 2017
- EPFL Advanced Algorithms, 2016
- Johnson, William B.; Lindenstrauss, Joram (1984). "Extensions of Lipschitz mappings into a Hilbert space". In Beals, Richard; Beck, Anatole; Bellow, Alexandra; et al. Conference in Modern Analysis and Probability (New Haven, Conn., 1982). Contemporary Mathematics. 26. Providence, RI: American Mathematical Society. pp. 189–206.
- Kane, Daniel M.; Nelson, Jelani (2012). "Sparser Johnson-Lindenstrauss Transforms". Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms. New York: Association for Computing Machinery (ACM).
This post is for study purposes only.