1.
Sci Rep ; 9(1): 1900, 2019 Feb 13.
Article in English | MEDLINE | ID: mdl-30760808

ABSTRACT

Resting state functional connectomes are massive and complex. It is an open question, however, whether connectomes differ across individuals in a correspondingly massive number of ways, or whether most differences take a small number of characteristic forms. We systematically investigated this question and found clear evidence of low-rank structure in which a modest number of connectomic components, around 50-150, account for a sizable portion of inter-individual connectomic variation. This number was convergently arrived at with multiple methods including estimation of intrinsic dimensionality and assessment of reconstruction of out-of-sample data. In addition, we show that these connectomic components enable prediction of a broad array of neurocognitive and clinical symptom variables at levels comparable to a leading method that is trained on the whole connectome. Qualitative observation reveals that these connectomic components exhibit extensive community structure reflecting interrelationships between intrinsic connectivity networks. We provide quantitative validation of this observation using novel stochastic block model-based methods. We propose that these connectivity components form an effective basis set for quantifying and interpreting inter-individual connectomic differences, and for predicting behavioral/clinical phenotypes.


Subjects
Brain/physiology ; Connectome ; Rest ; Brain/diagnostic imaging ; Humans ; Magnetic Resonance Imaging ; Models, Neurological ; Phenotype
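
The out-of-sample reconstruction check mentioned in the abstract of entry 1 can be written down compactly. The sketch below assumes the components come from a principal component analysis of vectorized training connectomes; the abstract does not name the decomposition, so PCA and the symbols $\mu$, $V_k$ and $R^2(k)$ are illustrative assumptions rather than the authors' stated method.

```latex
% Assumption: x_i \in \mathbb{R}^p is the vectorized connectome of held-out subject i,
% \mu is the training-set mean, and V_k \in \mathbb{R}^{p \times k} holds the top-k
% principal directions estimated from the training subjects only.
\[
  \hat{x}_i = \mu + V_k V_k^{\top} (x_i - \mu),
  \qquad
  R^2(k) = 1 - \frac{\sum_{i \in \text{test}} \lVert x_i - \hat{x}_i \rVert_2^2}
                    {\sum_{i \in \text{test}} \lVert x_i - \mu \rVert_2^2}.
\]
```

Under these assumptions, low-rank structure would show up as $R^2(k)$ rising quickly for small $k$ and flattening once $k$ reaches the range of components the abstract reports.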
2.
Ann Appl Stat ; 13(3): 1648-1677, 2019 Sep.
Article in English | MEDLINE | ID: mdl-33408802

ABSTRACT

While statistical analysis of a single network has received a lot of attention in recent years, with a focus on social networks, analysis of a sample of networks presents its own challenges which require a different set of analytic tools. Here we study the problem of classification of networks with labeled nodes, motivated by applications in neuroimaging. Brain networks are constructed from imaging data to represent functional connectivity between regions of the brain, and previous work has shown the potential of such networks to distinguish between various brain disorders, giving rise to a network classification problem. Existing approaches tend to either treat all edge weights as a long vector, ignoring the network structure, or focus on graph topology as represented by summary measures while ignoring the edge weights. Our goal is to design a classification method that uses both the individual edge information and the network structure of the data in a computationally efficient way, and that can produce a parsimonious and interpretable representation of differences in brain connectivity patterns between classes. We propose a graph classification method that uses edge weights as predictors but incorporates the network nature of the data via penalties that promote sparsity in the number of nodes, in addition to the usual sparsity penalties that encourage selection of edges. We implement the method via efficient convex optimization and provide a detailed analysis of data from two fMRI studies of schizophrenia.

3.
J Comput Graph Stat ; 24(1): 183-204, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-26120267

ABSTRACT

A graphical model for ordinal variables is considered, where it is assumed that the data are generated by discretizing the marginal distributions of a latent multivariate Gaussian distribution. The relationships between these ordinal variables are then described by the underlying Gaussian graphical model and can be inferred by estimating the corresponding concentration matrix. Direct estimation of the model is computationally expensive, but an approximate EM-like algorithm is developed to provide an accurate estimate of the parameters at a fraction of the computational cost. Numerical evidence based on simulation studies shows the strong performance of the algorithm, which is also illustrated on data sets on movie ratings and an educational survey.

4.
Ann Appl Stat ; 9(2): 821-848, 2015 Jun.
Article in English | MEDLINE | ID: mdl-27182289

ABSTRACT

We consider the problem of jointly estimating a collection of graphical models for discrete data, corresponding to several categories that share some common structure. An example for such a setting is voting records of legislators on different issues, such as defense, energy, and healthcare. We develop a Markov graphical model to characterize the heterogeneous dependence structures arising from such data. The model is fitted via a joint estimation method that preserves the underlying common graph structure, but also allows for differences between the networks. The method employs a group penalty that targets the common zero interaction effects across all the networks. We apply the method to describe the internal networks of the U.S. Senate on several important issues. Our analysis reveals individual structure for each issue, distinct from the underlying well-known bipartisan structure common to all categories which we are able to extract separately. We also establish consistency of the proposed method both for parameter estimation and model selection, and evaluate its numerical performance on a number of simulated examples.

5.
Biometrics ; 70(4): 943-53, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25099186

ABSTRACT

There has been a lot of work fitting Ising models to multivariate binary data in order to understand the conditional dependency relationships between the variables. However, additional covariates are frequently recorded together with the binary data, and may influence the dependence relationships. Motivated by such a dataset on genomic instability collected from tumor samples of several types, we propose a sparse covariate dependent Ising model to study both the conditional dependency within the binary data and its relationship with the additional covariates. This results in subject-specific Ising models, where the subject's covariates influence the strength of association between the genes. As in all exploratory data analysis, interpretability of results is important, and we use ℓ1 penalties to induce sparsity in the fitted graphs and in the number of selected covariates. Two algorithms to fit the model are proposed and compared on a set of simulated data, and asymptotic results are established. The results on the tumor dataset and their biological significance are discussed in detail.


Subjects
Breast Neoplasms/epidemiology ; Breast Neoplasms/genetics ; Genes, Tumor Suppressor ; Models, Statistical ; Neoplasm Proteins/genetics ; Computer Simulation ; Female ; Genetic Association Studies ; Genetic Predisposition to Disease/epidemiology ; Genetic Predisposition to Disease/genetics ; Humans ; Magnetics/methods ; Magnets ; Markov Chains ; Norway/epidemiology ; Prevalence ; Quantum Theory ; Risk Factors
6.
Proc Natl Acad Sci U S A ; 108(18): 7321-6, 2011 May 03.
Article in English | MEDLINE | ID: mdl-21502538

ABSTRACT

Analysis of networks and in particular discovering communities within networks has been a focus of recent work in several fields and has diverse applications. Most community detection methods focus on partitioning the entire network into communities, with the expectation of many ties within communities and few ties between. However, many networks contain nodes that do not fit in with any of the communities, and forcing every node into a community can distort results. Here we propose a new framework that extracts one community at a time, allowing for arbitrary structure in the remainder of the network, which can include weakly connected nodes. The main idea is that the strength of a community should depend on ties between its members and ties to the outside world, but not on ties between nonmembers. The proposed extraction criterion has a natural probabilistic interpretation in a wide class of models and performs well on simulated and real networks. For the case of the block model, we establish asymptotic consistency of estimated node labels and propose a hypothesis test for determining the number of communities.


Subjects
Algorithms ; Information Services ; Models, Theoretical ; Social Support ; Stochastic Processes
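
The main idea in the abstract of entry 6, that a community's strength should depend on ties among its members and ties to the outside but not on ties between nonmembers, can be illustrated with a small score function. This is a minimal sketch of one simple criterion with that property, not necessarily the exact form used in the article; the names `build_graph` and `extraction_score` and the normalization by set sizes are assumptions made for illustration.

```python
def build_graph(n, edges):
    """Return an undirected graph on nodes 0..n-1 as a dict: node -> set of neighbours."""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def extraction_score(adj, members):
    """Score a candidate community (illustrative criterion, assumed form).

    Density of ties inside `members` minus density of ties from `members`
    to the rest of the graph. Ties between two nonmembers are never read.
    """
    n = len(adj)
    k = len(members)
    if k == 0 or k == n:
        raise ValueError("members must be a proper, non-empty subset of the nodes")
    # Ordered pairs inside the set: each undirected edge is counted twice.
    inside = sum(1 for u in members for v in adj[u] if v in members)
    # Edges that leave the set, each counted once.
    boundary = sum(1 for u in members for v in adj[u] if v not in members)
    return inside / (k * k) - boundary / (k * (n - k))


# A 4-clique with one weak tie to node 4; nodes 5 and 6 belong to no community.
clique = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
graph = build_graph(7, clique + [(3, 4)])
print(extraction_score(graph, {0, 1, 2, 3}))
```

Adding an edge between nodes 5 and 6 leaves the score of the set {0, 1, 2, 3} unchanged, which is the property the abstract emphasizes. A full extraction procedure would also need a search over candidate sets, which is omitted here.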
7.
Biometrika ; 98(1): 1-15, 2011 Mar.
Article in English | MEDLINE | ID: mdl-23049124

ABSTRACT

Gaussian graphical models explore dependence relationships between random variables, through the estimation of the corresponding inverse covariance matrices. In this paper we develop an estimator for such models appropriate for data from several graphical models that share the same variables and some of the dependence structure. In this setting, estimating a single graphical model would mask the underlying heterogeneity, while estimating separate models for each category does not take advantage of the common structure. We propose a method that jointly estimates the graphical models corresponding to the different categories present in the data, aiming to preserve the common structure, while allowing for differences between the categories. This is achieved through a hierarchical penalty that targets the removal of common zeros in the inverse covariance matrices across categories. We establish the asymptotic consistency and sparsity of the proposed estimator in the high-dimensional case, and illustrate its performance on a number of simulated networks. An application to learning semantic connections between terms from webpages collected from computer science departments is included.

8.
Biometrics ; 66(3): 793-804, 2010 Sep.
Article in English | MEDLINE | ID: mdl-19912170

ABSTRACT

Variable selection for clustering is an important and challenging problem in high-dimensional data analysis. Existing variable selection methods for model-based clustering select informative variables in a "one-in-all-out" manner; that is, a variable is selected if at least one pair of clusters is separable by this variable and removed if it cannot separate any of the clusters. In many applications, however, it is of interest to further establish exactly which clusters are separable by each informative variable. To address this question, we propose a pairwise variable selection method for high-dimensional model-based clustering. The method is based on a new pairwise penalty. Results on simulated and real data show that the new method performs better than alternative approaches that use ℓ1 and ℓ∞ penalties and offers better interpretation.


Subjects
Cluster Analysis ; Data Interpretation, Statistical ; Computer Simulation ; Models, Statistical
9.
J Comput Graph Stat ; 19(4): 947-962, 2010.
Article in English | MEDLINE | ID: mdl-24963268

ABSTRACT

We propose a procedure for constructing a sparse estimator of a multivariate regression coefficient matrix that accounts for correlation of the response variables. This method, which we call multivariate regression with covariance estimation (MRCE), involves penalized likelihood with simultaneous estimation of the regression coefficients and the covariance structure. An efficient optimization algorithm and a fast approximation are developed for computing MRCE. Using simulation studies, we show that the proposed method outperforms relevant competitors when the responses are highly correlated. We also apply the new method to a finance example on predicting asset returns. An R-package containing this dataset and code for computing MRCE and its approximation is available online.

10.
J Comput Graph Stat ; 19(4): 930-946, 2010.
Article in English | MEDLINE | ID: mdl-25878487

ABSTRACT

In this article, we propose a new method for principal component analysis (PCA), whose main objective is to capture natural "blocking" structures in the variables. Further, the method, beyond selecting different variables for different components, also encourages the loadings of highly correlated variables to have the same magnitude. These two features often help in interpreting the principal components. To achieve these goals, a fusion penalty is introduced and the resulting optimization problem solved by an alternating block optimization algorithm. The method is applied to a number of simulated and real datasets and it is shown that it achieves the stated objectives. The supplemental materials for this article are available online.

11.
Phys Rev E Stat Nonlin Soft Matter Phys ; 77(4 Pt 2): 046119, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18517702

ABSTRACT

The discovery of community structure is a common challenge in the analysis of network data. Many methods have been proposed for finding community structure, but few have been proposed for determining whether the structure found is statistically significant or whether, conversely, it could have arisen purely as a result of chance. In this paper we show that the significance of community structure can be effectively quantified by measuring its robustness to small perturbations in network structure. We propose a suitable method for perturbing networks and a measure of the resulting change in community structure and use them to assess the significance of community structure in a variety of networks, both real and computer generated.
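
The abstract of entry 11 calls for a measure of how much a community assignment changes after a network is perturbed, but does not name the measure. As a hedged illustration, the sketch below uses the variation of information between two partitions, a standard distance between clusterings; choosing it here is an assumption, and the function name `variation_of_information` is hypothetical. The perturbation step and the community detection step are not shown.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """Variation of information (natural log) between two partitions.

    labels_a[i] and labels_b[i] are the community labels of node i before
    and after perturbation. The value is 0 when the partitions are identical
    up to relabelling and grows as they diverge.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        # H(A|B) + H(B|A), accumulated term by term over the joint table.
        vi -= p_ab * (math.log(p_ab / (count_a[a] / n)) + math.log(p_ab / (count_b[b] / n)))
    return vi


original = [0, 0, 1, 1]
merged = [0, 0, 0, 0]  # the two communities collapse into one after perturbation
print(variation_of_information(original, merged))
```

In the spirit of the abstract, a structure would count as robust if this distance stays small under small perturbations, compared with what a suitable random reference network produces.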

12.
Appl Opt ; 46(28): 6896-906, 2007 Oct 01.
Article in English | MEDLINE | ID: mdl-17906716

ABSTRACT

We present a method to construct the best linear estimate of optically active material concentration from ocean radiance spectra measured through an arbitrary atmosphere layer by a hyperspectral sensor. The algorithm accounts for sensor noise. Optical models of seawater and maritime atmosphere were used to obtain the joint distribution of spectra and concentrations required for the algorithm. The accuracy of phytoplankton retrieval is shown to be substantially lower than that of sediment and dissolved matter. In all cases, the sensor noise noticeably reduces the retrieval accuracy. Additional errors due to atmospheric interference are analyzed, and possible ways to increase the accuracy of retrieval are suggested, such as changing sensor parameters and including a priori information about observation conditions.
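
The best linear estimate described in the abstract of entry 12 has a standard closed form once a noise model is fixed. The sketch below assumes additive, zero-mean sensor noise that is uncorrelated with both the concentrations and the noise-free spectrum; the abstract says only that the algorithm accounts for sensor noise, so this noise model and all symbols are assumptions for illustration.

```latex
% Assumptions: c = concentration vector, s = noise-free radiance spectrum,
% r = s + \varepsilon is the measured spectrum, with E[\varepsilon] = 0,
% \operatorname{Cov}(\varepsilon) = \Sigma_\varepsilon, and \varepsilon uncorrelated with (c, s).
% \mu_c, \mu_s, \Sigma_{cc}, \Sigma_{cs}, \Sigma_{ss} come from the joint distribution of (c, s).
\[
  \hat{c} = \mu_c + \Sigma_{cs} \left( \Sigma_{ss} + \Sigma_\varepsilon \right)^{-1} (r - \mu_s),
\]
\[
  \operatorname{Cov}(c - \hat{c}) = \Sigma_{cc}
    - \Sigma_{cs} \left( \Sigma_{ss} + \Sigma_\varepsilon \right)^{-1} \Sigma_{sc}.
\]
```

Under these assumptions, a larger $\Sigma_\varepsilon$ shrinks the subtracted term and so enlarges the error covariance, which is consistent with the abstract's observation that sensor noise reduces retrieval accuracy.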
