Publication Date:
2019
Description:
Abstract:
Optimization over low-rank matrices has broad applications in machine learning. For large-scale problems, an attractive heuristic is to factorize the low-rank matrix into a product of two much smaller matrices. In this paper, we study the nonconvex problem \(\min_{\mathbf{U}\in\mathbb{R}^{n\times r}} g(\mathbf{U}) = f(\mathbf{U}\mathbf{U}^T)\) under the assumptions that \(f(\mathbf{X})\) is restricted \(\mu\)-strongly convex and \(L\)-smooth on the set \(\{\mathbf{X}: \mathbf{X}\succeq 0, \operatorname{rank}(\mathbf{X})\le r\}\). We propose an accelerated gradient method with alternating constraint that operates directly on the \(\mathbf{U}\) factors, and we show that the method has a local linear convergence rate with the optimal dependence \(\sqrt{L/\mu}\) on the condition number. Globally, our method converges to a critical point with zero gradient from any initializer. Our method also applies to the problem with the asymmetric factorization \(\mathbf{X}=\widetilde{\mathbf{U}}\widetilde{\mathbf{V}}^T\), and the same convergence result holds. Extensive experimental results verify the advantage of our method.
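To make the factored formulation concrete, the following is a minimal sketch of plain (non-accelerated) gradient descent on \(g(\mathbf{U}) = f(\mathbf{U}\mathbf{U}^T)\) for the toy choice \(f(\mathbf{X}) = \tfrac{1}{2}\|\mathbf{X}-\mathbf{M}\|_F^2\), whose factored gradient is \(\nabla g(\mathbf{U}) = 2(\mathbf{U}\mathbf{U}^T-\mathbf{M})\mathbf{U}\). This toy objective, the step size, and the function names are illustrative assumptions; it is not the paper's accelerated method with alternating constraint.

```python
import numpy as np

def factored_gradient_descent(M, r, step=0.005, iters=5000, seed=0):
    """Gradient descent on g(U) = 0.5 * ||U U^T - M||_F^2.

    Illustrative sketch only: the paper's method additionally uses
    acceleration with an alternating constraint, which is omitted here.
    """
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    U = 0.1 * rng.standard_normal((n, r))  # small random initializer
    for _ in range(iters):
        grad = 2.0 * (U @ U.T - M) @ U     # grad g(U) = 2 (U U^T - M) U
        U -= step * grad
    return U

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, r = 8, 2
    A = rng.standard_normal((n, r))
    M = A @ A.T                            # rank-r PSD target
    U = factored_gradient_descent(M, r)
    err = np.linalg.norm(U @ U.T - M) / np.linalg.norm(M)
    print(f"relative error: {err:.2e}")
```

Working in the \(n\times r\) factor \(\mathbf{U}\) rather than the \(n\times n\) matrix \(\mathbf{X}\) keeps both memory and per-iteration cost low when \(r \ll n\), which is the large-scale setting the abstract targets.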
Print ISSN:
0885-6125
Electronic ISSN:
1573-0565
Topics:
Computer Science