Abstract
Robust principal component analysis (RPCA) has been widely used for recovering low-rank matrices in many data mining and machine learning problems. It separates a data matrix into a low-rank part and a sparse part. The convex approach has been well studied in the literature. However, state-of-The-Art algorithms for the convex approach usually liave relatively high complexity due to the need of solving (partial) singular value decompositions of large matrices. A non-convex approach, AltProj, has also been proposed with lighter complexity and better scalability. Given the true rank r of the underlying low rank matrix, AltProj has a complexity of 0(r2</n), where dxn is the size of data matrix. In this paper, we propose a novel factorisation-based model of RPCA, which has a complexity of O(kdn), where k is an upper bound of the true rank. Our method does not need the precise value of the true rank. From extensive experiments, we observe that AltProj cau work only when r is precisely known in advance; however, when the needed rank parameter r is specified to a value different from the true rank, AltProj cannot fully separate the two parts while our method succeeds. Even when both work, our method is about 4 times faster than AltProj. Our method can be used as a light-weight, scalable tool for RPCA in the absence of the precise value of the true rank.
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 16th IEEE International Conference on Data Mining, ICDM 2016 |
| Editors | Francesco Bonchi, Josep Domingo-Ferrer, Ricardo Baeza-Yates, Zhi-Hua Zhou, Xindong Wu |
| Pages | 1137-1142 |
| Number of pages | 6 |
| ISBN (Electronic) | 9781509054725 |
| DOIs | |
| State | Published - Jul 2 2016 |
| Event | 16th IEEE International Conference on Data Mining, ICDM 2016 - Barcelona, Catalonia, Spain Duration: Dec 12 2016 → Dec 15 2016 |
Publication series
| Name | Proceedings - IEEE International Conference on Data Mining, ICDM |
|---|---|
| Volume | 0 |
| ISSN (Print) | 1550-4786 |
Conference
| Conference | 16th IEEE International Conference on Data Mining, ICDM 2016 |
|---|---|
| Country/Territory | Spain |
| City | Barcelona, Catalonia |
| Period | 12/12/16 → 12/15/16 |
Bibliographical note
Publisher Copyright:© 2016 IEEE.
Funding
Qiang Cheng is the corresponding author. This work is supported by National Science Foundation under grant IIS- 1218712, National Natural Science Foundation of China, under grant 11241005, and Shanxi Scholarship Council of China 2015-093.
| Funders | Funder number |
|---|---|
| National Science Foundation (NSF) | IIS- 1218712 |
| National Natural Science Foundation of China (NSFC) | 11241005 |
| Shanxi Scholarship Council of China | 2015-093 |
Keywords
- Factorization
- Non-convex
- Robust principal component analysis
- Scalable
ASJC Scopus subject areas
- General Engineering
Fingerprint
Dive into the research topics of 'A fast factorization-based approach to robust PCA'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver