Abstract
A classical approach to accurately estimating the covariance matrix Σ of a p-variate normal distribution is to draw a sample of size n > p and form a sample covariance matrix. However, many modern applications operate with much smaller sample sizes, which calls for estimation guarantees in the regime n ≪ p. We show that a sample of size n = O(m log⁶ p) is sufficient to accurately estimate in operator norm an arbitrary symmetric part of Σ consisting of m ≤ n nonzero entries per row. This follows from a general result on estimating Hadamard products M · Σ, where M is an arbitrary symmetric matrix.
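The kind of estimator the abstract describes can be illustrated with a small numerical sketch. It forms the sample covariance matrix from n ≪ p mean-zero Gaussian samples, applies a symmetric 0/1 mask M entrywise (the Hadamard product), and measures the operator-norm error on the masked part. The banded mask, the AR(1)-type true covariance, the parameter values, and all function names are assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance (1/n) X^T X, assuming the rows of X are mean-zero samples."""
    n = X.shape[0]
    return X.T @ X / n

def band_mask(p, k):
    """Symmetric 0/1 mask keeping entries with |i - j| <= k (at most 2k + 1 per row)."""
    idx = np.arange(p)
    return (np.abs(idx[:, None] - idx[None, :]) <= k).astype(float)

def operator_norm(A):
    """Operator (spectral) norm, i.e. the largest singular value."""
    return np.linalg.norm(A, 2)

# Illustrative setup (assumed values): dimension p, sample size n << p, bandwidth k.
p, n, k = 200, 50, 2
rng = np.random.default_rng(0)

# Hypothetical true covariance: Sigma_ij = 0.5^{|i-j|}, which is positive definite.
idx = np.arange(p)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])

# Draw n samples from N(0, Sigma) via a Cholesky factor of Sigma.
L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((n, p)) @ L.T

Sigma_n = sample_covariance(X)
M = band_mask(p, k)

# Hadamard (entrywise) products: in NumPy, `*` on arrays is entrywise.
masked_estimate = M * Sigma_n
masked_target = M * Sigma

err_masked = operator_norm(masked_estimate - masked_target)
err_full = operator_norm(Sigma_n - Sigma)

print(f"nonzeros per row of M: at most {int(M.sum(axis=1).max())}")
print(f"operator-norm error on the masked part: {err_masked:.3f}")
print(f"operator-norm error on the full matrix: {err_full:.3f}")
```

In a run like this one would typically expect the masked error to be noticeably smaller than the full-matrix error, since n < p makes the unmasked sample covariance rank deficient. The sketch only illustrates the setting and makes no attempt to reproduce the paper's constants or logarithmic factors.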
Additional information
Partially supported by NSF grants DMS 0805798 (E.L.) and FRG DMS 0918623, DMS 1001829 (R.V.).
Cite this article
Levina, E., Vershynin, R. Partial estimation of covariance matrices. Probab. Theory Relat. Fields 153, 405–419 (2012). https://doi.org/10.1007/s00440-011-0349-4
DOI: https://doi.org/10.1007/s00440-011-0349-4