Results 1 - 6 of 6
1.
Stat Sci; 36(2): 303-327, 2021 May.
Article in English | MEDLINE | ID: mdl-34321713

ABSTRACT

Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements arising frequently in applications ranging from genomics and neuroscience to economics and finance. As data are collected at an ever-growing scale, statistical machine learning faces new challenges: high dimensionality, strong dependence among observed variables, heavy-tailed variables, and heterogeneity. High-dimensional robust factor analysis serves as a powerful toolkit for conquering these challenges. This paper gives a selective overview of recent advances in high-dimensional factor models and their applications to statistics, including Factor-Adjusted Robust Model selection (FarmSelect) and Factor-Adjusted Robust Multiple testing (FarmTest). We show that classical methods, especially principal component analysis (PCA), can be tailored to many new problems and provide powerful tools for statistical estimation and inference. We highlight PCA and its connections to matrix perturbation theory, robust statistics, random projection, and false discovery rate control, and illustrate through several applications how insights from these fields yield solutions to modern challenges. We also present far-reaching connections between factor models and popular statistical learning problems, including network analysis and low-rank matrix recovery.
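
As a rough illustration of the factor-adjustment idea underlying FarmSelect and FarmTest, the following sketch simulates factor-driven data, estimates the loadings by PCA on the sample covariance, and removes the estimated common component. All names, dimensions, and parameter choices are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of PCA-based factor adjustment (illustrative only).
# Data follow the factor model x_i = B f_i + u_i.
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 500, 100, 2                  # samples, dimension, number of factors

B = rng.normal(size=(p, k))            # factor loadings
F = rng.normal(size=(n, k))            # latent factors
U = rng.standard_t(df=5, size=(n, p))  # idiosyncratic errors, heavier tails
X = F @ B.T + U                        # observed data, n x p

# Estimate loadings by PCA: top-k eigenvectors of the sample covariance,
# scaled by the square roots of the corresponding eigenvalues.
S = np.cov(X, rowvar=False)
vals, vecs = np.linalg.eigh(S)
idx = np.argsort(vals)[::-1][:k]
B_hat = vecs[:, idx] * np.sqrt(vals[idx])

# Factor-adjust: regress out the estimated factors from each observation.
F_hat = X @ B_hat @ np.linalg.inv(B_hat.T @ B_hat)
X_adj = X - F_hat @ B_hat.T

def mean_abs_offdiag_corr(M):
    C = np.corrcoef(M, rowvar=False)
    return np.abs(C - np.diag(np.diag(C))).mean()

print("mean |corr| before adjustment:", mean_abs_offdiag_corr(X))
print("mean |corr| after  adjustment:", mean_abs_offdiag_corr(X_adj))
```

After adjustment the residuals are far less correlated, which is what lets downstream model selection and multiple testing behave as though the measurements were nearly independent.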

2.
Stat Sci; 36(2): 264-290, 2021 May.
Article in English | MEDLINE | ID: mdl-34305305

ABSTRACT

Deep learning has achieved tremendous success in recent years. In simple terms, deep learning uses the composition of many nonlinear functions to model the complex dependency between input features and labels. While neural networks have a long history, recent advances have greatly improved their performance in computer vision, natural language processing, and other domains. From a statistical and scientific perspective, it is natural to ask: What is deep learning? What are the new characteristics of deep learning compared with classical methods? What are its theoretical foundations? To answer these questions, we introduce common neural network models (e.g., convolutional neural nets, recurrent neural nets, generative adversarial nets) and training techniques (e.g., stochastic gradient descent, dropout, batch normalization) from a statistical point of view. Along the way, we highlight new characteristics of deep learning (including depth and over-parametrization) and explain their practical and theoretical benefits. We also sample recent results on the theory of deep learning, many of which are only suggestive. While a complete understanding of deep learning remains elusive, we hope that our perspectives and discussions serve as a stimulus for new statistical research.
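
To make the training side concrete, here is a minimal sketch of stochastic gradient descent on a one-hidden-layer ReLU network, written in plain NumPy with manual backpropagation. The architecture, initialization, learning rate, and batch size are illustrative assumptions; real systems would add the dropout and batch normalization the abstract mentions, and many more layers.

```python
# Sketch: SGD on a one-hidden-layer ReLU network (illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 1000, 10, 32
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)   # nonlinear target

W1 = rng.normal(size=(d, h)) * np.sqrt(2 / d)    # He-style initialization
b1 = np.zeros(h)
w2 = rng.normal(size=h) * np.sqrt(2 / h)
b2 = 0.0
lr, batch = 0.05, 32

for step in range(2000):
    i = rng.integers(0, n, size=batch)           # random mini-batch
    Z = X[i] @ W1 + b1
    A = np.maximum(Z, 0)                         # ReLU activation
    pred = A @ w2 + b2
    g_pred = 2 * (pred - y[i]) / batch           # gradient of mini-batch MSE
    gw2 = A.T @ g_pred
    gb2 = g_pred.sum()
    gZ = np.outer(g_pred, w2) * (Z > 0)          # backprop through ReLU
    gW1 = X[i].T @ gZ
    gb1 = gZ.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1; w2 -= lr * gw2; b2 -= lr * gb2

print("final MSE:", np.mean((np.maximum(X @ W1 + b1, 0) @ w2 + b2 - y) ** 2))
```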

3.
J Soc Psychol; 161(5): 543-559, 2021 Sep 03.
Article in English | MEDLINE | ID: mdl-33252317

ABSTRACT

Racial disparities in conviction and incarceration have long been lamentable features of legal systems. Research has addressed how the attitudes and decisions of police, prosecutors, jurors, and judges contribute to these disparities, but very little attention has been paid to defendants' own team members, i.e., criminal defense attorneys. Researchers have specifically identified this as a "scholarly gap." To address this gap, we conducted an empirical study of criminal defense attorneys practicing in forty-three U.S. states (N = 327). The attorneys completed both an implicit measure designed to capture racial bias (a race Implicit Association Test) and an explicit measure designed to capture interpersonal regard for clients. The results provided support for longstanding, but previously speculative, assertions of bias in criminal defense.


Subject(s)
Criminals, Racism, Attitude, Humans, Lawyers
4.
Ann Stat; 48(3): 1452-1474, 2020 Jun.
Article in English | MEDLINE | ID: mdl-33859446

ABSTRACT

Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, arising in factor analysis, community detection, ranking, and matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose expectations are low-rank, which helps settle the conjecture in Abbe et al. (2014b) that the spectral algorithm achieves exact recovery in the stochastic block model without any trimming or cleaning steps. The key is a first-order approximation of eigenvectors under the $\ell_\infty$ norm: $u_k \approx A u_k^* / \lambda_k^*$, where $\{u_k\}$ and $\{u_k^*\}$ are eigenvectors of a random matrix $A$ and of its expectation $\mathbb{E}A$, respectively. The fact that the approximation is both tight and linear in $A$ facilitates sharp comparisons between $u_k$ and $u_k^*$. In particular, it allows for comparing the signs of $u_k$ and $u_k^*$ even when $\|u_k - u_k^*\|_\infty$ is large. The results are further extended to perturbations of eigenspaces, yielding new $\ell_\infty$-type bounds for synchronization (the $\mathbb{Z}_2$-spiked Wigner model) and noisy matrix completion.
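
The first-order approximation can be checked numerically. The sketch below uses a rank-one spiked Wigner matrix, a special case chosen for simplicity rather than the paper's general setting, and compares $\|u - u^*\|_\infty$ with $\|u - A u^*/\lambda^*\|_\infty$; all parameter values are illustrative.

```python
# Numerical check of the first-order approximation u ≈ A u* / λ* in ℓ∞ norm
# (sketch under a rank-one spiked Wigner model; illustrative parameters).
import numpy as np

rng = np.random.default_rng(1)
d = 2000
u_star = np.ones(d) / np.sqrt(d)          # incoherent population eigenvector
lam_star = 4 * np.sqrt(d)                 # strong signal eigenvalue
W = rng.normal(size=(d, d))
W = (W + W.T) / np.sqrt(2)                # symmetric Wigner noise
A = lam_star * np.outer(u_star, u_star) + W   # expectation of A is rank one

vals, vecs = np.linalg.eigh(A)
u = vecs[:, -1]                           # top empirical eigenvector
u *= np.sign(u @ u_star)                  # fix the sign ambiguity

approx = A @ u_star / lam_star            # first-order approximation
print("||u - u*||_inf      :", np.abs(u - u_star).max())
print("||u - A u*/l*||_inf :", np.abs(u - approx).max())
```

The second quantity comes out much smaller than the first, which is the sense in which the linear approximation is entrywise tight.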

5.
J Econom; 208(1): 5-22, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30546195

ABSTRACT

In this paper, we study robust covariance estimation under the approximate factor model with observed factors. We propose a novel framework that first estimates the initial joint covariance matrix of the observed data and the factors, and then uses it to recover the covariance matrix of the observed data. We prove that once the initial estimator attains the element-wise optimal rate, the whole procedure yields an estimated covariance matrix with the desired properties. For data with only a bounded fourth moment, we propose adaptive Huber loss minimization for the initial joint covariance estimation. This approach is applicable to a much wider class of distributions, beyond sub-Gaussian and elliptical distributions. We also present an asymptotic result for the adaptive Huber M-estimator with a diverging parameter. The conclusions are demonstrated by extensive simulations and real data analysis.
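
A minimal sketch of the adaptive Huber idea, applied entrywise to build an initial covariance estimate from heavy-tailed data: the iteratively reweighted solver and the choice of robustification parameter (growing with the sample size, so the truncation bias vanishes asymptotically) are illustrative assumptions, not the authors' exact procedure.

```python
# Sketch: adaptive Huber mean estimation, applied entrywise to covariance
# entries of heavy-tailed data (illustrative tuning, not the paper's).
import numpy as np

def huber_mean(x, tau):
    """M-estimator: minimize the sum of Huber losses by reweighting."""
    mu = np.median(x)
    for _ in range(50):
        r = x - mu
        w = np.minimum(1.0, tau / np.maximum(np.abs(r), 1e-12))
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < 1e-10:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(0)
n, p = 400, 5
X = rng.standard_t(df=4.5, size=(n, p))   # bounded fourth moment, heavy tails

# Diverging robustification parameter: more data, less truncation, less bias.
tau = np.sqrt(n / np.log(n))

mu_hat = np.array([huber_mean(X[:, j], tau) for j in range(p)])
Xc = X - mu_hat
Sigma_hat = np.empty((p, p))
for j in range(p):
    for k in range(p):
        Sigma_hat[j, k] = huber_mean(Xc[:, j] * Xc[:, k], tau)

# Entries should be near 1.8 on the diagonal (the t_{4.5} variance) and 0 off it.
print(np.round(Sigma_hat, 2))
```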

6.
J Mach Learn Res; 18, 2018 Apr.
Article in English | MEDLINE | ID: mdl-31749664

ABSTRACT

In statistics and machine learning, we are interested in the eigenvectors (or singular vectors) of certain matrices (e.g., covariance matrices and data matrices). However, those matrices are usually perturbed by noise or statistical errors, arising either from random sampling or from structural patterns. The Davis-Kahan $\sin\theta$ theorem is often used to bound the difference between the eigenvectors of a matrix $A$ and those of a perturbed matrix $\tilde{A} = A + E$ in terms of the $\ell_2$ norm. In this paper, we prove that when $A$ is a low-rank and incoherent matrix, the $\ell_\infty$ norm perturbation bound of singular vectors (or eigenvectors in the symmetric case) is smaller by a factor of $\sqrt{d_1}$ or $\sqrt{d_2}$ for left and right vectors, where $d_1$ and $d_2$ are the matrix dimensions. The power of this new perturbation result is shown in robust covariance estimation, particularly when random variables have heavy tails. There, we propose new robust covariance estimators and establish their asymptotic properties using the newly developed perturbation bound. Our theoretical results are verified through extensive numerical experiments.
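
The gap between the $\ell_2$ and $\ell_\infty$ errors can be seen in a small simulation. The sketch below perturbs a rank-one incoherent matrix with symmetric Gaussian noise and compares the two norms of the eigenvector error; the ratio comes out close to $1/\sqrt{d}$, consistent with the improvement factor described above. Parameters are illustrative.

```python
# Sketch: ℓ2 vs ℓ∞ eigenvector perturbation for a rank-one incoherent matrix
# under additive symmetric noise (illustrative parameters).
import numpy as np

rng = np.random.default_rng(2)
d = 1500
u = np.ones(d) / np.sqrt(d)               # perfectly incoherent eigenvector
A = 5 * np.sqrt(d) * np.outer(u, u)       # rank one, eigenvalue 5*sqrt(d)
E = rng.normal(size=(d, d)) / np.sqrt(2)
E = E + E.T                               # symmetric noise, ||E||_2 ~ 2*sqrt(d)

u_tilde = np.linalg.eigh(A + E)[1][:, -1]
u_tilde *= np.sign(u_tilde @ u)           # fix the sign ambiguity

l2 = np.linalg.norm(u_tilde - u)
linf = np.abs(u_tilde - u).max()
print("l2 error  :", l2)
print("linf error:", linf)
print("ratio linf/l2 vs 1/sqrt(d):", linf / l2, 1 / np.sqrt(d))
```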
