Introduction to RPCA



The data we collect usually has an underlying low-rank structure, but this structure is obscured by noise introduced during collection. Even so, we can still decompose the corrupted matrix into a low-rank matrix plus a sparse error matrix.

The traditional approach to this problem is PCA (Principal Component Analysis). There are many interpretations of PCA; one related to rank is discarding the small singular values, since those components contribute little to the data and can therefore be regarded as noise. Thus we keep the $k$ largest singular values and drop the rest, which can be represented as the following formulation:

$$\min_{A} \; \|D - A\|_F \quad \text{s.t.} \quad \operatorname{rank}(A) \le k,$$

whose solution is the truncated SVD of the data matrix $D$ (Eckart–Young theorem).
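As a minimal NumPy sketch of this truncation (the function name `pca_low_rank_approx` is illustrative, not from the original text):

```python
import numpy as np

def pca_low_rank_approx(D, k):
    """Best rank-k approximation of D: keep the k largest singular
    values and drop the rest (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```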

PCA has a shortcoming: it is not robust to outliers. RPCA (Robust Principal Component Analysis) was introduced to address this; it can recover the matrix no matter how large the corruptions are in magnitude, as long as the sparsity assumption holds. The original form of RPCA can be written as:

$$\min_{A, E} \; \operatorname{rank}(A) + \lambda \|E\|_0 \quad \text{s.t.} \quad D = A + E.$$

The optimization formulation above is non-convex and hard to solve directly. Applying convex relaxation (replacing the rank with the nuclear norm and the $\ell_0$ pseudo-norm with the $\ell_1$ norm) turns it into the most widely used and most efficient form:

$$\min_{A, E} \; \|A\|_* + \lambda \|E\|_1 \quad \text{s.t.} \quad D = A + E.$$

Here $\| \cdot \|_*$ is the nuclear norm, the sum of all singular values: $\|A\|_* = \sum_{i} \sigma_i(A)$; the matrix $\ell_1$ norm $\| \cdot \|_1$ is the sum of the absolute values of all entries: $\|E\|_1 = \sum_{i} \sum_{j} |E_{ij}|$.
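As a quick illustrative check of the two definitions in NumPy:

```python
import numpy as np

X = np.random.default_rng(0).standard_normal((5, 5))
nuclear_norm = np.linalg.svd(X, compute_uv=False).sum()  # sum of singular values
l1_norm = np.abs(X).sum()                                # sum of |X_ij| over all entries
print(nuclear_norm, l1_norm)
```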


Algorithm of RPCA



Before introducing the algorithm, we first introduce two operators.

Singular Value Thresholding

The optimal solution to the optimization problem $\min_X \frac{1}{2} \| X - Y \|_F^2 + \tau \|X\|_*$ in the variable $X$ is obtained by thresholding the singular values of $Y$: if $Y = U \Sigma V^\top$, the minimizer is $\mathcal{D}_\tau(Y) = U \operatorname{diag}\big(\max(\sigma_i - \tau, 0)\big) V^\top$.
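A minimal NumPy sketch of this operator (the name `svt` is illustrative):

```python
import numpy as np

def svt(Y, tau):
    """Singular value thresholding: the proximal operator of the
    nuclear norm, solving min_X 0.5*||X - Y||_F^2 + tau*||X||_*."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```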

Soft Thresholding

Just as for the $\ell_1$ norm of a vector, the optimal solution to $\min_X \frac{1}{2} \| X - Y \|_F^2 + \tau \|X\|_1$ is obtained by thresholding the absolute value of every entry of $Y$: $\mathcal{S}_\tau(Y)_{ij} = \operatorname{sgn}(Y_{ij}) \max(|Y_{ij}| - \tau, 0)$.
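And a matching sketch of the entrywise operator (the name `soft_threshold` is illustrative):

```python
import numpy as np

def soft_threshold(Y, tau):
    """Entrywise soft thresholding: the proximal operator of the l1 norm,
    solving min_X 0.5*||X - Y||_F^2 + tau*||X||_1."""
    return np.sign(Y) * np.maximum(np.abs(Y) - tau, 0.0)
```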

There are various methods for solving the RPCA problem; one of the most successful is to solve the augmented Lagrangian function of the problem, which is called the ALM algorithm. The augmented Lagrangian function is:

$$L(A, E, Y, \mu) = \|A\|_* + \lambda \|E\|_1 + \langle Y, D - A - E \rangle + \frac{\mu}{2} \| D - A - E \|_F^2,$$

where $Y$ here denotes the Lagrange multiplier.

Usually, we use ADMM to solve the ALM problem, alternating the two thresholding operators above with a dual ascent step:

$$\begin{aligned} A^{k+1} &= \mathcal{D}_{1/\mu}\left(D - E^{k} + \mu^{-1} Y^{k}\right), \\ E^{k+1} &= \mathcal{S}_{\lambda/\mu}\left(D - A^{k+1} + \mu^{-1} Y^{k}\right), \\ Y^{k+1} &= Y^{k} + \mu \left(D - A^{k+1} - E^{k+1}\right). \end{aligned}$$
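Putting the pieces together, here is a minimal sketch of that iteration, reusing the `svt` and `soft_threshold` functions sketched above (the name `rpca_admm`, the stopping rule, and the default $\mu$ heuristic are illustrative assumptions; the default $\lambda = 1/\sqrt{\max(m, n)}$ is the value suggested by the RPCA recovery theory):

```python
import numpy as np

def rpca_admm(D, lam=None, mu=None, max_iter=500, tol=1e-7):
    """ADMM on the augmented Lagrangian of convex RPCA:
        min ||A||_* + lam*||E||_1   s.t.   D = A + E
    Requires the svt and soft_threshold operators defined above."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else (m * n) / (4.0 * np.abs(D).sum())  # heuristic
    Y = np.zeros((m, n))  # Lagrange multiplier
    E = np.zeros((m, n))  # sparse component
    for _ in range(max_iter):
        A = svt(D - E + Y / mu, 1.0 / mu)             # low-rank update
        E = soft_threshold(D - A + Y / mu, lam / mu)  # sparse update
        R = D - A - E                                 # primal residual
        Y = Y + mu * R                                # dual ascent
        if np.linalg.norm(R) <= tol * np.linalg.norm(D):
            break
    return A, E
```

On synthetic data such as a random low-rank matrix plus sparse spikes, this loop should separate the two components; increasing $\mu$ across iterations (as in the inexact ALM variant) typically speeds up convergence.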