In the previous post, we introduced what word embeddings are and what they can do. This time, we’ll try to make sense of them. What problem do they solve? How can they help computers understand natural language?

If you already have a solid understanding of word embeddings and are well into your data science career, skip ahead to the next part!

Subspace embedding is a powerful tool for simplifying matrix computations and analyzing high-dimensional data, especially sparse matrices.

Dimensionality is a serious problem in modern data analysis: today's massive datasets tend to be both very sparse and very high-dimensional. Data scientists have long used tools such as principal component analysis (PCA) and independent component analysis (ICA) to project high-dimensional data onto a subspace, but these techniques rely on computing the eigenvectors of an $n \times n$ matrix, a very expensive operation (e.g., spectral decomposition) for high dimension $n$. Moreover, even though the eigenspace has many important properties, it does not yield good approximations for many useful measures such as vector norms. Here we discuss another method, random projection, to reduce dimensionality.
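As a rough illustration of the idea (not the exact construction discussed in the post), here is a minimal sketch of a Gaussian random projection in NumPy: multiplying by a scaled random matrix reduces the dimension while, by the Johnson–Lindenstrauss lemma, approximately preserving pairwise distances with high probability. The function name and signature are my own for illustration.

```python
import numpy as np

def random_projection(X, k, seed=0):
    """Project the rows of X (shape n x d) down to k dimensions.

    Uses a Gaussian random matrix R with entries N(0, 1/k); the 1/sqrt(k)
    scaling makes E[||x R||^2] = ||x||^2, so norms and pairwise distances
    are preserved in expectation, with error shrinking as k grows.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)  # d x k projection matrix
    return X @ R  # no eigendecomposition needed, just one matrix product
```

Note the contrast with PCA: there is no $n \times n$ eigenproblem to solve, only a single matrix multiply, which is why random projection scales well to sparse, high-dimensional data.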

A short summary and comparison of different platforms, based on this blog and Zhang et al. (2017).

This post gives a nutshell description of the bias-variance decomposition.

Are we really stuck in local minima, or is something else going on?

This post introduces some normalization-related tricks used in neural networks.

This post helps beginners understand the math behind Gradient Descent (GD).

If you’re writing an article for this blog, please follow these guidelines.