
Practical Linear Algebra -- NMF & SVD

1. Motivation

We want to decompose a matrix into a sum of outer products of vector pairs, i.e., into a product of two smaller matrices

2. SVD Singular Value Decomposition

2.1. Motivation

We expect the topics (components) to be orthogonal, so that they can express the most information

2.2. Idea

(figure omitted) The idea: factor any $m \times n$ matrix $A$ as $A = U \Sigma V^T$, where $U$ and $V$ have orthonormal columns and $\Sigma$ is diagonal, holding the singular values in decreasing order.
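
A minimal NumPy sketch of this factorization and its best rank-$k$ truncation (the matrix and $k$ below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# Thin SVD: A = U @ diag(s) @ Vt, singular values sorted in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(A, U @ np.diag(s) @ Vt)

# Keep only the k largest singular values: the best rank-k approximation.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.linalg.norm(A - A_k))  # equals sqrt(s[2]**2 + s[3]**2)
```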

2.3. Application

  • Recommendation systems
  • Computing the pseudoinverse (see the sketch after this list)
  • Data compression
  • PCA
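
For the pseudoinverse item above, a small sketch assuming NumPy: given $A = U \Sigma V^T$, the pseudoinverse is $A^+ = V \Sigma^+ U^T$, where $\Sigma^+$ inverts only the nonzero singular values.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
s_inv = np.where(s > 1e-10, 1.0 / s, 0.0)  # invert only non-negligible singular values
A_pinv = Vt.T @ np.diag(s_inv) @ U.T

assert np.allclose(A_pinv, np.linalg.pinv(A))
```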

3. NMF Non-negative Matrix Factorization

3.1. Motivation

Face images are hard to interpret when decomposed with PCA, since the components contain negative values.

So this method avoids negative values.

Benefits: fast and easy to use.

3.2. Idea

Drop the orthogonality requirement; instead, constrain the factors to be non-negative.

Goal: Do decomposition
$$
V \approx W H
$$
Approach: add a penalty that punishes negative elements. This means the result is only an approximation, and it is not unique.

Using SGD (Stochastic Gradient Descent), add this penalty term to the reconstruction loss and minimize both together, as in the sketch below.
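
A minimal sketch of that idea in NumPy. For brevity it uses full-batch gradient descent rather than true SGD, and the learning rate `lr`, penalty weight `lam`, and step count are illustrative assumptions:

```python
import numpy as np

def nmf_penalized(V, r, steps=5000, lr=1e-3, lam=10.0, seed=0):
    """Approximate V ~ W @ H by minimizing
    ||V - W H||_F^2 + lam * (||min(W, 0)||_F^2 + ||min(H, 0)||_F^2),
    where the second term penalizes negative entries."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(steps):
        R = W @ H - V                                  # reconstruction residual
        gW = 2 * R @ H.T + 2 * lam * np.minimum(W, 0)  # penalty gradient is 0 where W >= 0
        gH = 2 * W.T @ R + 2 * lam * np.minimum(H, 0)
        W -= lr * gW
        H -= lr * gH
    return W, H

V = np.abs(np.random.default_rng(1).standard_normal((20, 10)))
W, H = nmf_penalized(V, r=3)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # relative reconstruction error
```

Because the start is random and the penalty only discourages (rather than forbids) negative entries, different runs give different approximate factors, matching the non-uniqueness noted above.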

3.3. Application

  • Face decomposition
  • Bioinformatics
  • Topic modeling

4. Faster SVD

4.1. Motivation

  • Normal SVD is pretty slow; we need to speed it up
  • Data are often missing or inaccurate
  • Take advantage of GPUs

We are often only interested in the largest singular values.

So we introduce randomized algorithms that

  • are more stable
  • allow the needed matrix-vector products to be done in parallel
  • come with performance guarantees

4.2. Idea

Use a smaller matrix

  1. Before: calculate the SVD of $A$ ($m \times n$).

  2. Basic math: the Johnson-Lindenstrauss lemma (illustrated in the sketch after this list):

a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved.

  3. After: calculate the SVD of $B = AQ$, where $B$ is $m \times r$ and $r \ll n$.
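
A quick numerical check of the lemma, assuming a plain Gaussian random projection (all sizes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, npts = 10_000, 500, 20
X = rng.standard_normal((npts, d))            # points in high dimension d
P = rng.standard_normal((d, k)) / np.sqrt(k)  # random projection down to dimension k
Y = X @ P

def pairwise_dists(Z):
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

iu = np.triu_indices(npts, 1)  # distinct point pairs only
ratio = pairwise_dists(Y)[iu] / pairwise_dists(X)[iu]
print(ratio.min(), ratio.max())  # typically within a few percent of 1.0
```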

4.3. Computation

  1. Find an approximation $Q$ to the range of $A$ (the column space, i.e., all linear combinations of $A$'s columns).

That means $Q$ has a similar column space to $A$ but fewer columns, so that
$$
A \approx Q Q^T A
$$

  2. Construct $B = Q^T A$, which is small.

  3. Compute the SVD of $B = S \Sigma V^{T}$.

  4. Since $A \approx Q Q^T A = QB = QS \Sigma V^{T}$,

if we set $U = QS$, we get
$$
A \approx U \Sigma V^T
$$

  5. Done!

But how do we get $Q$ in step 1?

  1. Take a bunch of random vectors $w_i$ and form a matrix $W$; then $AW$ samples the space that the columns of $A$ span.
  2. Compute the QR decomposition $AW = QR$, so that the columns of $Q$ form an orthonormal basis for the range of $AW$.
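
Putting the steps together, a minimal NumPy sketch of this randomized SVD (the oversampling amount and the test matrix below are illustrative assumptions):

```python
import numpy as np

def randomized_svd(A, r, oversample=10, seed=0):
    """Approximate the top-r SVD of A via a random range finder."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    W = rng.standard_normal((n, r + oversample))  # random test vectors w_i
    Q, _ = np.linalg.qr(A @ W)                    # orthonormal basis for range(AW)
    B = Q.T @ A                                   # small (r + oversample) x n matrix
    S, sigma, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ S                                     # A ~ Q B = Q S Sigma Vt = U Sigma Vt
    return U[:, :r], sigma[:r], Vt[:r, :]

rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 20)) @ rng.standard_normal((20, 300))  # rank-20 matrix
U, s, Vt = randomized_svd(A, r=20)
print(np.linalg.norm(A - U @ np.diag(s) @ Vt) / np.linalg.norm(A))  # ~0: rank recovered
```

The two matrix products $AW$ and $Q^T A$ dominate the cost and parallelize well (e.g., on GPUs), which is where the speedup over a full SVD comes from.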