1.
IEEE Trans Biomed Eng ; 71(4): 1378-1390, 2024 Apr.
Article En | MEDLINE | ID: mdl-37995175

OBJECTIVE: We address the problem of finding brain connectivities that are associated with a clinical outcome or phenotype. METHODS: The proposed framework regresses a (scalar) clinical outcome on matrix-variate predictors which arise in the form of brain connectivity matrices. For example, in a large cohort of subjects we estimate those regions of functional connectivities that are associated with neurocognitive scores. We approach this high-dimensional yet highly structured estimation problem by formulating a regularized estimation process that results in a low-rank coefficient matrix having a sparse set of nonzero entries which represent regions of biologically relevant connectivities. In contrast to the recent literature on estimating a sparse, low-rank matrix from a single noisy observation, our scalar-on-matrix regression framework produces a data-driven extraction of structures that are associated with a clinical response. The method, called Sparsity Inducing Nuclear-Norm Estimator (SpINNEr), simultaneously constrains the regression coefficient matrix in two ways: a nuclear-norm penalty encourages low-rank structure while an ℓ1-norm penalty encourages entry-wise sparsity. RESULTS: Our simulations show that SpINNEr outperforms other methods in estimation accuracy when the response-related entries (representing the brain's functional connectivity) are arranged in well-connected communities. SpINNEr is applied to investigate associations between HIV-related outcomes and functional connectivity in the human brain. CONCLUSION AND SIGNIFICANCE: Overall, this work demonstrates the potential of SpINNEr to recover sparse and low-rank estimates under the scalar-on-matrix regression framework.
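
As a rough sketch of the kind of estimator this abstract describes, the doubly penalized least-squares problem can be written as below. The notation is assumed here and does not come from the abstract: $y_i$ is the scalar outcome of subject $i$, $A_i$ that subject's connectivity matrix, $B$ the coefficient matrix, and $\lambda_N, \lambda_L \ge 0$ are tuning parameters. The published formulation may include further terms, such as additional covariates or entry-wise weights.

```latex
\hat{B} \;=\; \arg\min_{B}\;
  \frac{1}{2} \sum_{i=1}^{n} \bigl( y_i - \langle A_i, B \rangle \bigr)^2
  \;+\; \lambda_N \lVert B \rVert_{*}
  \;+\; \lambda_L \lVert B \rVert_{1},
\qquad
\langle A_i, B \rangle = \operatorname{tr}\bigl(A_i^{\top} B\bigr)
```

Here $\lVert B \rVert_{*}$ is the nuclear norm (the sum of the singular values of $B$), which favors low rank, and $\lVert B \rVert_{1} = \sum_{j,k} \lvert B_{jk} \rvert$ is the entry-wise ℓ1 norm, which favors sparsity.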


Algorithms, Brain, Humans, Brain/diagnostic imaging, Brain/physiology
2.
Front Neurosci ; 16: 957282, 2022.
Article En | MEDLINE | ID: mdl-36248659

Studying the association of the brain's structure and function with neurocognitive outcomes requires a comprehensive analysis that combines different sources of information from a number of brain-imaging modalities. Recently developed regularization methods provide a novel approach that uses information about brain structure to improve the estimation of coefficients in linear regression models. Our proposed method, which is a special case of Tikhonov regularization, incorporates structural connectivity derived with diffusion-weighted imaging and cortical distance information in the penalty term. In line with previously developed methods that inform the estimation of the regression coefficients, we incorporate additional information via a Laplacian matrix based on the proximity measure on the cortical surface. Our contribution consists of constructing a principled formulation of the penalty term and testing the performance of the proposed approach via extensive simulation studies and a brain-imaging application. The penalty term is constructed as a weighted combination of structural connectivity and proximity between cortical areas. The simulation studies mimic real brain-imaging settings. We apply our approach to data collected in the Human Connectome Project, where the cortical properties of the left hemisphere are found to be associated with vocabulary comprehension.

3.
Quant Finance ; 22(2): 349-366, 2022.
Article En | MEDLINE | ID: mdl-35465255

Index tracking and hedge fund replication aim at cloning the return time series properties of a given benchmark, either by using only a subset of its original constituents or by using a set of risk factors. In this paper, we propose a model that relies on the Sorted ℓ1 Penalized Estimator, called SLOPE, for index tracking and hedge fund replication. We show that SLOPE is capable not only of providing sparsity but also of forming groups among assets depending on their partial correlation with the index or the hedge fund return time series. The grouping structure can then be exploited to create individual investment strategies that allow building portfolios with a smaller number of active positions but still comparable tracking properties. Considering equity index data and hedge fund returns, we discuss the real-world properties of SLOPE-based approaches with respect to state-of-the-art approaches.
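
To make the penalty behind SLOPE concrete, here is a minimal sketch of the sorted ℓ1 norm: the largest coefficient magnitude is paired with the largest tuning parameter, the second largest with the second, and so on. The function name and the example numbers are illustrative assumptions and are not taken from the paper.

```python
def sorted_l1_norm(coefs, lambdas):
    """Sorted-L1 penalty: sum_i lambdas[i] * |coefs|_(i), largest magnitude first.

    `lambdas` must be non-negative and non-increasing (assumed convention).
    """
    if len(coefs) != len(lambdas):
        raise ValueError("coefs and lambdas must have the same length")
    if any(lam < 0 for lam in lambdas):
        raise ValueError("lambdas must be non-negative")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be non-increasing")
    # Sort absolute values in decreasing order before pairing with lambdas.
    magnitudes = sorted((abs(c) for c in coefs), reverse=True)
    return sum(lam * mag for lam, mag in zip(lambdas, magnitudes))


# Hypothetical portfolio weights and tuning sequence.
weights = [0.5, -2.0, 1.0]
print(sorted_l1_norm(weights, [3.0, 2.0, 1.0]))  # 3*2.0 + 2*1.0 + 1*0.5 = 8.5
print(sorted_l1_norm(weights, [1.0, 1.0, 1.0]))  # constant lambdas: plain L1 norm, 3.5
```

With a constant sequence the penalty reduces to the ordinary ℓ1 (lasso) norm. A strictly decreasing sequence is what lets coefficients of similar magnitude be pulled to a common value, which is the grouping behavior the abstract refers to.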

4.
Can J Stat ; 49(1): 203-227, 2021 Mar.
Article En | MEDLINE | ID: mdl-35002039

One of the challenging problems in neuroimaging is the principled incorporation of information from different imaging modalities. Data from each modality are frequently analyzed separately using, for instance, dimensionality reduction techniques, which result in a loss of mutual information. We propose a novel regularization method, generalized ridgified Partially Empirical Eigenvectors for Regression (griPEER), to estimate associations between the brain structure features and a scalar outcome within the generalized linear regression framework. griPEER improves the regression coefficient estimation by providing a principled approach to use external information from the structural brain connectivity. Specifically, we incorporate a penalty term, derived from the structural connectivity Laplacian matrix, in the penalized generalized linear regression. In this work, we address both theoretical and computational issues and demonstrate the robustness of our method despite incomplete information about the structural brain connectivity. We also provide a significance testing procedure for performing inference on the estimated coefficients. Finally, griPEER is evaluated both in extensive simulation studies and using clinical data to classify HIV+ and HIV- individuals.
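
One plausible way to write a Laplacian-penalized generalized linear model of this kind is sketched below. The exact form is an assumption based on the method's name and the description above, not a quotation from the paper: $\ell(\beta)$ denotes the log-likelihood of the generalized linear model, $Q$ the Laplacian matrix derived from structural connectivity, and $\lambda_Q, \lambda_R \ge 0$ tuning parameters, with the ridge term accounting for the "ridgified" part of the name.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \Bigl\{ -2\,\ell(\beta)
  \;+\; \lambda_Q\, \beta^{\top} Q\, \beta
  \;+\; \lambda_R\, \lVert \beta \rVert_2^2 \Bigr\}
```

Under this reading, the ridge term keeps the problem well conditioned when $Q$ is singular or the connectivity information is incomplete.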



5.
Stat Biosci ; 11(1): 47-90, 2019 Apr.
Article En | MEDLINE | ID: mdl-31217828

One of the challenging problems in brain imaging research is the principled incorporation of information from different imaging modalities. Frequently, each modality is analyzed separately using, for instance, dimensionality reduction techniques, which result in a loss of mutual information. We propose a novel regularization method to estimate the association between the brain structure features and a scalar outcome within the linear regression framework. Our regularization technique provides a principled approach to use external information from the structural brain connectivity and inform the estimation of the regression coefficients. Our proposal extends the classical Tikhonov regularization framework by defining a penalty term based on the structural connectivity-derived Laplacian matrix. Here, we address both theoretical and computational issues. The approach is first illustrated using simulated data and compared with other penalized regression methods. We then apply our regularization method to study the associations between alcoholism phenotypes and brain cortical thickness using a diffusion-imaging-derived measure of structural connectivity. Using the proposed methodology in 148 young male subjects at risk for alcoholism, we found negative associations between cortical thickness and drinks per drinking day in bilateral caudal anterior cingulate cortex, left lateral OFC and left precentral gyrus.
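
The idea of a Laplacian-based penalty can be illustrated with a small sketch. It builds the unnormalized graph Laplacian L = D - A from a symmetric connectivity (adjacency) matrix and evaluates the quadratic form β'Lβ, which equals the sum over connected pairs of a_ij (β_i - β_j)². The unnormalized form, the function names, and the toy numbers are assumptions for illustration; the paper may use a normalized Laplacian.

```python
def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of a symmetric weighted adjacency matrix."""
    n = len(adjacency)
    lap = [[-adjacency[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        # Diagonal entry: weighted degree of node i (self-loops ignored).
        lap[i][i] = sum(adjacency[i][j] for j in range(n) if j != i)
    return lap


def quadratic_penalty(beta, lap):
    """Tikhonov-type penalty beta' L beta."""
    n = len(beta)
    return sum(beta[i] * lap[i][j] * beta[j] for i in range(n) for j in range(n))


# Toy connectivity among three regions: 0-1 weakly connected, 1-2 strongly connected.
A = [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]]
L = graph_laplacian(A)

print(quadratic_penalty([1.0, 1.0, 3.0], L))  # 1*(1-1)^2 + 2*(1-3)^2 = 8.0
print(quadratic_penalty([2.0, 2.0, 2.0], L))  # equal coefficients are not penalized: 0.0
```

The penalty is small when strongly connected regions have similar coefficients, which is how connectivity information can inform the regression estimates.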

6.
J Am Stat Assoc ; 114(525): 419-433, 2019.
Article En | MEDLINE | ID: mdl-31217649

Sorted L-One Penalized Estimation (SLOPE, Bogdan et al., 2013, 2015) is a relatively new convex optimization procedure which allows for adaptive selection of regressors under sparse high-dimensional designs. Here we extend the idea of SLOPE to the situation in which one aims at selecting whole groups of explanatory variables instead of single regressors. Such groups can be formed by clustering strongly correlated predictors or groups of dummy variables corresponding to different levels of the same qualitative predictor. We formulate the respective convex optimization problem, gSLOPE (group SLOPE), and propose an efficient algorithm for its solution. We also define a notion of the group false discovery rate (gFDR) and provide a choice of the sequence of tuning parameters for gSLOPE so that gFDR is provably controlled at a prespecified level if the groups of variables are orthogonal to each other. Moreover, we prove that the resulting procedure adapts to unknown sparsity and is asymptotically minimax with respect to the estimation of the proportions of variance of the response variable explained by regressors from different groups. We also provide a method for the choice of the regularizing sequence when variables in different groups are not orthogonal but statistically independent and illustrate its good properties with computer simulations. Finally, we illustrate the advantages of gSLOPE in the context of genome-wide association studies. The R package grpSLOPE, with an implementation of our method, is available on CRAN.
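
A simplified sketch of a group-level sorted penalty follows: each group is summarized by the Euclidean norm of its coefficients, and the sorted ℓ1 penalty is then applied to those group norms. This is an illustrative assumption rather than the paper's exact definition, which may weight groups by size or use norms of the fitted group effects; all names below are hypothetical.

```python
import math


def group_sorted_l1_penalty(coefs, groups, lambdas):
    """Sorted-L1 penalty applied to per-group Euclidean norms.

    coefs   : list of regression coefficients
    groups  : list of index lists, one per group (assumed to partition coefs)
    lambdas : non-increasing, non-negative sequence, one value per group
    """
    if len(groups) != len(lambdas):
        raise ValueError("need exactly one lambda per group")
    norms = sorted(
        (math.sqrt(sum(coefs[i] ** 2 for i in idx)) for idx in groups),
        reverse=True,
    )
    # Largest group norm is paired with the largest lambda.
    return sum(lam * norm for lam, norm in zip(lambdas, norms))


# Three hypothetical groups: a strong pair, an inactive pair, and a single weak predictor.
beta = [3.0, 4.0, 0.0, 0.0, 1.0]
groups = [[0, 1], [2, 3], [4]]
print(group_sorted_l1_penalty(beta, groups, [2.0, 1.0, 0.5]))  # 2*5 + 1*1 + 0.5*0 = 11.0
```

Because the penalty acts on whole-group norms, a group is either shrunk to zero in its entirety or kept as a unit, which matches the goal of selecting groups rather than single regressors.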

7.
IEEE/ACM Trans Comput Biol Bioinform ; 15(4): 1066-1078, 2018.
Article En | MEDLINE | ID: mdl-29990279

The method of Sorted L-One Penalized Estimation, or SLOPE, is a sparse regression method recently introduced by Bogdan et al. [1]. It can be used to identify significant predictor variables in a linear model that may have more unknown parameters than observations. When the correlations between predictor variables are small, the SLOPE method is shown to successfully control the false discovery rate (the expected proportion of the irrelevant among all selected predictors) at a user-specified level. However, the requirement for nearly uncorrelated predictors is too restrictive for genomic data, as demonstrated in our recent study [2] by an application of SLOPE to realistic simulated DNA sequence data. A possible solution is to divide the predictor variables into nearly uncorrelated groups, and to modify the procedure to select entire groups with an overall significant group effect, rather than individual predictors. Following this motivation, we extend SLOPE in the spirit of Group LASSO to Group SLOPE, a method that can handle group structures between the predictor variables, which are ubiquitous in real genomic data. Our theoretical results show that Group SLOPE controls the group-wise false discovery rate (gFDR) when groups are orthogonal to each other. For use in non-orthogonal settings, we propose two types of Monte Carlo-based heuristics, which lead to gFDR control with Group SLOPE in simulations based on real SNP data. As an illustration of the merits of this method, an application of Group SLOPE to a dataset from the Framingham Heart Study results in the identification of some known DNA sequence regions associated with bone health, as well as some new candidate regions. The novel methods are implemented in the R package grpSLOPEMC, which is publicly available at https://github.com/agisga/grpSLOPEMC.
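
The quantity being controlled can be made concrete with a small sketch. For a single analysis, the group-wise false discovery proportion is the share of selected groups that are in fact irrelevant; the gFDR is the expectation of that proportion, which a simulation study would approximate by averaging over replications. The function name and the group labels below are hypothetical.

```python
def group_false_discovery_proportion(selected, truly_relevant):
    """Share of selected groups that are not truly relevant (0.0 if nothing is selected)."""
    selected = set(selected)
    if not selected:
        return 0.0
    false_discoveries = selected - set(truly_relevant)
    return len(false_discoveries) / len(selected)


# Hypothetical simulation replicate: four groups selected, two of them truly relevant.
selected_groups = ["g1", "g2", "g3", "g4"]
relevant_groups = ["g1", "g3", "g9"]
print(group_false_discovery_proportion(selected_groups, relevant_groups))  # 2 / 4 = 0.5
```

Defining the proportion as zero when nothing is selected follows the usual convention for false discovery rates.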


Computational Biology/methods, Regression Analysis, Algorithms, Databases, Factual, Humans, Machine Learning
8.
Genetics ; 205(1): 61-75, 2017 Jan.
Article En | MEDLINE | ID: mdl-27784720

With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated with multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the North Finland Birth Cohort 66 cohort study.
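
For orientation, the sketch below implements the classical Benjamini-Hochberg step-up procedure, a standard FDR-controlling strategy applied to a flat list of p-values. It is not the prescreening-based method proposed in this paper; it only shows the kind of baseline rule such a method adapts. The function name and the p-values are illustrative.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is under its threshold.
        if p_values[idx] <= q * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical p-values for eight markers.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p, q=0.05))  # [0, 1]
```

Each rejection here counts as one discovery. The abstract's point is that when several neighboring SNPs are later merged into a single locus, the discoveries being counted change, and the FDR guarantee of such a rule no longer applies to the merged units.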


Genetic Association Studies/methods, Genome-Wide Association Study/methods, Models, Genetic, Cohort Studies, False Positive Reactions, Genetic Predisposition to Disease, Genome, Human, Genomics/methods, Humans, Linear Models, Linkage Disequilibrium, Polymorphism, Single Nucleotide, Predictive Value of Tests
...