Results 1 - 20 of 28
1.
Front Genet ; 15: 1333855, 2024.
Article in English | MEDLINE | ID: mdl-38313677

ABSTRACT

Background: Cerebral aneurysms (CAs) are a significant cerebrovascular ailment with a multifaceted etiology influenced by various factors, including heredity and environment. This study aimed to explore the possible link between different types of immune cells and the occurrence of CAs. Methods: We analyzed the connection between 731 immune cell signatures and the risk of CAs using publicly available genetic data. The analysis included four immune features, specifically median brightness levels (MBL), proportionate cell (PC), definite cell (DC), and morphological attributes (MA). Mendelian randomization (MR) analysis was conducted using the instrumental variables (IVs) derived from the genetic variation linked to CAs. Results: After adjustment for multiple testing with the FDR method, the inverse variance weighted (IVW) method revealed that three immune cell phenotypes were linked to the risk of CAs. These included CD45 on HLA DR+ NK (odds ratio (OR), 1.116; 95% confidence interval (CI), 1.001-1.244; p = 0.0489) and CX3CR1 on CD14- CD16- (OR, 0.973; 95% CI, 0.948-0.999; p = 0.0447). In the reverse MR study, the immune cell phenotype CD16- CD56 on NK was found to have a significant association with the risk of CAs (OR, 0.950; 95% CI, 0.911-0.990; p = 0.0156). Conclusion: Our findings support a substantial genetic link between immune cells and CAs, suggesting possible implications for future clinical interventions.
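
The abstract above reports inverse variance weighted (IVW) estimates as odds ratios with 95% confidence intervals. As a rough illustration of what such an estimate involves, the sketch below computes a fixed-effect IVW estimate from per-variant summary statistics. It is not the authors' pipeline: the function names and the input numbers are hypothetical, and it assumes independent variants and ignores uncertainty in the variant-exposure effects.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from summary statistics.

    Assumes independent variants and treats the variant-exposure effects as
    known without error (a common first-order simplification).
    """
    # Weighted regression of outcome effects on exposure effects through the
    # origin, with weights 1 / se_outcome^2.
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate into an odds ratio and a 95% interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical summary statistics for three variants (illustration only).
beta, se = ivw_estimate([0.10, 0.20, 0.30], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01])
print(odds_ratio_with_ci(beta, se))
```

In a screen of hundreds of phenotypes, the resulting p-values would still need a multiple-testing correction such as the FDR adjustment the abstract mentions.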

2.
Nucleic Acids Res ; 51(22): e115, 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-37941153

ABSTRACT

In the analysis of both single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) data, classifying cells/spots into cell/domain types is an essential analytic step for many secondary analyses. Most of the existing annotation methods have been developed for scRNA-seq datasets without any consideration of spatial information. Here, we present SpatialAnno, an efficient and accurate annotation method for spatial transcriptomics datasets, with the capability to effectively leverage a large number of non-marker genes as well as 'qualitative' information about marker genes without using a reference dataset. Uniquely, SpatialAnno estimates low-dimensional embeddings for a large number of non-marker genes via a factor model while promoting spatial smoothness among neighboring spots via a Potts model. Using both simulated and four real spatial transcriptomics datasets from the 10x Visium, ST, Slide-seqV1/2, and seqFISH platforms, we showcase the method's improved spatial annotation accuracy, including its robustness to the inclusion of marker genes for irrelevant cell/domain types and to various degrees of marker gene misspecification. SpatialAnno is computationally scalable and applicable to SRT datasets from different platforms. Furthermore, the estimated embeddings for cellular biological effects facilitate many downstream analyses.


Subject(s)
Gene Expression Profiling; Single-Cell Analysis; Software; Gene Expression Profiling/methods; Single-Cell Analysis/methods; Transcriptome
4.
Inflammation ; 46(3): 1047-1060, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36801996

ABSTRACT

Primary Sjogren's syndrome (pSS) is a systemic autoimmune disease that causes dysfunction of secretory glands; its specific pathogenesis is still unknown. The CXCL9, 10, 11/CXCR3 axis and G protein-coupled receptor kinase 2 (GRK2) are involved in many inflammatory and immune processes. We used NOD/Ltj mice, a spontaneous SS animal model, to elucidate the pathological mechanism by which the CXCL9, 10, 11/CXCR3 axis promotes T lymphocyte migration by activating GRK2 in pSS. We found that CD4 + GRK2 and Th17 + CXCR3 were apparently increased and Treg + CXCR3 was significantly decreased in the spleen of 4W NOD mice without sicca symptoms compared to ICR mice (control group). The protein levels of IFN-γ and CXCL9, 10, 11 increased in submandibular gland (SG) tissue, accompanied by obvious lymphocytic infiltration, and Th17 cells overwhelmingly infiltrated relative to Treg cells once sicca symptoms occurred; in the spleen, the proportion of Th17 cells was increased, whereas that of Treg cells was decreased. In vitro, we used IFN-γ to stimulate human salivary gland epithelial cells (HSGECs) co-cultured with Jurkat cells, and the results showed that CXCL9, 10, 11 were increased by IFN-γ activating the JAK2/STAT1 signaling pathway and that Jurkat cell migration increased with the rise in cell membrane GRK2 expression. Treating HSGECs with tofacitinib or Jurkat cells with GRK2 siRNA reduced the migration of Jurkat cells. The results indicate that CXCL9, 10, 11 significantly increased in SG tissue through IFN-γ stimulation of HSGECs, and that the CXCL9, 10, 11/CXCR3 axis contributes to the progression of pSS by activating GRK2 to promote T lymphocyte migration.


Subject(s)
Sjogren's Syndrome; Mice; Animals; Humans; Sjogren's Syndrome/metabolism; Mice, Inbred ICR; Mice, Inbred NOD; T-Lymphocytes, Regulatory/metabolism; Cell Movement; Chemokine CXCL9; Receptors, CXCR3/metabolism
5.
Nat Commun ; 14(1): 296, 2023 01 18.
Article in English | MEDLINE | ID: mdl-36653349

ABSTRACT

Spatially resolved transcriptomics involves a set of emerging technologies that enable the transcriptomic profiling of tissues with the physical location of expressions. Although a variety of methods have been developed for data integration, most of them are for single-cell RNA-seq datasets without consideration of spatial information. Thus, methods that can integrate spatial transcriptomics data from multiple tissue slides, possibly from multiple individuals, are needed. Here, we present PRECAST, a data integration method for multiple spatial transcriptomics datasets with complex batch effects and/or biological effects between slides. PRECAST unifies spatial factor analysis simultaneously with spatial clustering and embedding alignment, while requiring only partially shared cell/domain clusters across datasets. Using both simulated and four real datasets, we show improved cell/domain detection with outstanding visualization, and the estimated aligned embeddings and cell/domain labels facilitate many downstream analyses. We demonstrate that PRECAST is computationally scalable and applicable to spatial transcriptomics datasets from different platforms.


Subject(s)
Gene Expression Profiling; Transcriptome; Humans; Transcriptome/genetics; Gene Expression Profiling/methods; Cluster Analysis; Spatial Analysis; Exome Sequencing; Single-Cell Analysis/methods
6.
Neuroreport ; 33(11): 463-469, 2022 08 03.
Article in English | MEDLINE | ID: mdl-35775323

ABSTRACT

Traumatic brain injury (TBI) is characterized by neuronal loss and subsequent brain damage and can be accompanied by transient or permanent neurological dysfunction. The recovery of cognitive function after TBI is a challenge. This study aimed to investigate whether treatment with resveratrol (RSV) could prevent cognitive dysfunction after TBI in mice. A TBI mouse model was established using the weight-drop impact method. Male mice aged 7 to 9 weeks were randomly divided into four groups: TBI group, TBI + vehicle group, TBI + RSV group, and sham-operated control group. The animals from the TBI + vehicle group and TBI + RSV group were intraperitoneally injected at 3 and 24 h post-TBI with placebo and RSV (3%, 5 ml/kg), respectively. Two days after TBI, the hippocampus of each mouse was extracted, and western blot analysis was performed for Sirtuin 1 (SIRT1), synaptophysin (SYP), p38 mitogen-activated protein kinase (MAPK), and P-p38 MAPK. Moreover, behavioral functions of TBI mice were evaluated with the Y maze to determine the efficacy of RSV in preventing cognitive impairment after TBI. RSV increased the expression of SIRT1 protein, which in turn activated the phosphorylation of p38 MAPK. Taken together, our findings suggest that RSV exerts a strong beneficial effect in improving neurological function after TBI.


Subject(s)
Brain Injuries, Traumatic; Cognitive Dysfunction; Animals; Brain Injuries, Traumatic/complications; Cognitive Dysfunction/drug therapy; Male; Mice; Phosphorylation; Resveratrol/pharmacology; Sirtuin 1; p38 Mitogen-Activated Protein Kinases/metabolism
7.
Nucleic Acids Res ; 50(12): e72, 2022 07 08.
Article in English | MEDLINE | ID: mdl-35349708

ABSTRACT

Dimension reduction and (spatial) clustering are usually performed sequentially; however, the low-dimensional embeddings estimated in the dimension-reduction step may not be relevant to the class labels inferred in the clustering step. We therefore developed a computational method, Dimension-Reduction Spatial-Clustering (DR-SC), that can simultaneously perform dimension reduction and (spatial) clustering within a unified framework. Joint analysis by DR-SC produces accurate (spatial) clustering results and ensures the effective extraction of biologically informative low-dimensional features. DR-SC is applicable to spatial clustering in spatial transcriptomics, which characterizes the spatial organization of the tissue by segregating it into multiple tissue structures. Here, DR-SC relies on a latent hidden Markov random field model to encourage the spatial smoothness of the detected spatial cluster boundaries. Underlying DR-SC is an efficient expectation-maximization algorithm based on an iterative conditional mode. As such, DR-SC is scalable to large sample sizes and can optimize the spatial smoothness parameter in a data-driven manner. With comprehensive simulations and real data applications, we show that DR-SC outperforms existing clustering and spatial clustering methods: it extracts more biologically relevant features than conventional dimension reduction methods, improves clustering performance, and offers improved visualization and downstream trajectory inference.
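
The abstract above mentions a hidden Markov random field that encourages spatial smoothness and an iterative conditional mode (ICM) update. The sketch below shows only a generic ICM label update under a Potts-style smoothness prior; it is not the DR-SC implementation. The function name, the precomputed per-spot log-likelihoods, and the fixed smoothness parameter `beta` are assumptions made for illustration (the abstract states that DR-SC chooses the smoothness parameter from the data).

```python
def icm_potts(loglik, neighbors, beta, n_sweeps=10):
    """Greedy ICM labeling under a Potts smoothness prior.

    loglik[i][k]  : log-likelihood of spot i under cluster k (assumed given)
    neighbors[i]  : indices of the spots adjacent to spot i
    beta          : smoothness strength; 0 disables the spatial prior
    """
    # Start from the per-spot maximum-likelihood label.
    labels = [max(range(len(row)), key=row.__getitem__) for row in loglik]
    for _ in range(n_sweeps):
        changed = False
        for i, row in enumerate(loglik):
            # Score = data fit + beta * number of neighbors sharing label k.
            def score(k):
                return row[k] + beta * sum(1 for j in neighbors[i] if labels[j] == k)
            best = max(range(len(row)), key=score)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:  # converged: no label moved during this sweep
            break
    return labels


# Hypothetical chain of five spots; the middle spot weakly prefers cluster 1.
loglik = [[0.0, -2.0], [0.0, -2.0], [-1.0, -0.5], [0.0, -2.0], [0.0, -2.0]]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
print(icm_potts(loglik, neighbors, beta=0.0))
print(icm_potts(loglik, neighbors, beta=1.0))
```

With `beta=0.0` each spot keeps its own best-fitting label, while a positive `beta` lets agreeing neighbors override a weak local preference, which is the smoothing behavior the abstract describes.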


Subject(s)
Algorithms; Transcriptome; Cluster Analysis; RNA-Seq; Single-Cell Analysis/methods; Transcriptome/genetics; Exome Sequencing
8.
Bioinformatics ; 38(2): 303-310, 2022 01 03.
Article in English | MEDLINE | ID: mdl-34499127

ABSTRACT

MOTIVATION: Mendelian randomization (MR) is a valuable tool to examine the causal relationships between health risk factors and outcomes from observational studies. Along with the proliferation of genome-wide association studies, a variety of two-sample MR methods for summary data have been developed to account for horizontal pleiotropy (HP), primarily based on the assumption that the effects of variants on exposure (γ) and HP (α) are independent. In practice, this assumption is too strict and can be easily violated because of the correlated HP. RESULTS: To account for this correlated HP, we propose a Bayesian approach, MR-Corr2, that uses the orthogonal projection to reparameterize the bivariate normal distribution for γ and α, and a spike-slab prior to mitigate the impact of correlated HP. We have also developed an efficient algorithm with paralleled Gibbs sampling. To demonstrate the advantages of MR-Corr2 over existing methods, we conducted comprehensive simulation studies to compare for both type-I error control and point estimates in various scenarios. By applying MR-Corr2 to study the relationships between exposure-outcome pairs in complex traits, we did not identify the contradictory causal relationship between HDL-c and CAD. Moreover, the results provide a new perspective of the causal network among complex traits. AVAILABILITY AND IMPLEMENTATION: The developed R package and code to reproduce all the results are available at https://github.com/QingCheng0218/MR.Corr2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study; Mendelian Randomization Analysis; Mendelian Randomization Analysis/methods; Bayes Theorem; Risk Factors; Computer Simulation
9.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34849574

ABSTRACT

Spatial transcriptomics has been emerging as a powerful technique for resolving gene expression profiles while retaining tissue spatial information. These spatially resolved transcriptomics make it feasible to examine the complex multicellular systems of different microenvironments. To answer scientific questions with spatial transcriptomics and expand our understanding of how cell types and states are regulated by microenvironment, the first step is to identify cell clusters by integrating the available spatial information. Here, we introduce SC-MEB, an empirical Bayes approach for spatial clustering analysis using a hidden Markov random field. We have also derived an efficient expectation-maximization algorithm based on an iterative conditional mode for SC-MEB. In contrast to BayesSpace, a recently developed method, SC-MEB is not only computationally efficient and scalable to large sample sizes but is also capable of choosing the smoothness parameter and the number of clusters. We performed comprehensive simulation studies to demonstrate the superiority of SC-MEB over some existing methods. We applied SC-MEB to analyze the spatial transcriptome of human dorsolateral prefrontal cortex tissues and mouse hypothalamic preoptic region. Our analysis results showed that SC-MEB can achieve a similar or better clustering performance to BayesSpace, which uses the true number of clusters and a fixed smoothness parameter. Moreover, SC-MEB is scalable to large 'sample sizes'. We then employed SC-MEB to analyze a colon dataset from a patient with colorectal cancer (CRC) and COVID-19, and further performed differential expression analysis to identify signature genes related to the clustering results. The heatmap of identified signature genes showed that the clusters identified using SC-MEB were more separable than those obtained with BayesSpace. 
Using pathway analysis, we identified three immune-related clusters, and in a further comparison, found the mean expression of COVID-19 signature genes was greater in immune than non-immune regions of colon tissue. SC-MEB provides a valuable computational tool for investigating the structural organizations of tissues from spatial transcriptomic data.


Subject(s)
Algorithms; COVID-19/metabolism; Computer Simulation; Gene Expression Profiling; SARS-CoV-2/metabolism; Animals; Colon/metabolism; Colorectal Neoplasms/metabolism; Dorsolateral Prefrontal Cortex/metabolism; Humans; Hypothalamus/metabolism; Markov Chains; Mice
10.
Methods Mol Biol ; 2212: 93-103, 2021.
Article in English | MEDLINE | ID: mdl-33733352

ABSTRACT

Transcriptome-wide association studies (TWASs) integrate expression quantitative trait loci (eQTLs) studies with genome-wide association studies (GWASs) to prioritize candidate target genes for complex traits. TWASs have become increasingly popular. They have been used to analyze many complex traits with expression profiles from different tissues, successfully enhancing the discovery of genetic risk loci for complex traits. Though conceptually straightforward, some steps are required to perform the TWAS properly. Here we provide a step-by-step guide to integrate eQTL data with both GWAS individual-level data and GWAS summary statistics from complex traits.


Subject(s)
Epistasis, Genetic; Genetic Testing/methods; Models, Genetic; Multifactorial Inheritance; Software; Transcriptome; Genome, Human; Genome-Wide Association Study; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Uncertainty
11.
Nucleic Acids Res ; 48(19): e109, 2020 11 04.
Article in English | MEDLINE | ID: mdl-32978944

ABSTRACT

Transcriptome-wide association studies (TWASs) integrate expression quantitative trait loci (eQTLs) studies with genome-wide association studies (GWASs) to prioritize candidate target genes for complex traits. Several statistical methods have been recently proposed to improve the performance of TWASs in gene prioritization by integrating the expression regulatory information imputed from multiple tissues, and made significant achievements in improving the ability to detect gene-trait associations. Unfortunately, most existing multi-tissue methods focus on prioritization of candidate genes, and cannot directly infer the specific functional effects of candidate genes across different tissues. Here, we propose a tissue-specific collaborative mixed model (TisCoMM) for TWASs, leveraging the co-regulation of genetic variations across different tissues explicitly via a unified probabilistic model. TisCoMM not only performs hypothesis testing to prioritize gene-trait associations, but also detects the tissue-specific role of candidate target genes in complex traits. To make full use of widely available GWASs summary statistics, we extend TisCoMM to use summary-level data, namely, TisCoMM-S2. Using extensive simulation studies, we show that type I error is controlled at the nominal level, the statistical power of identifying associated genes is greatly improved, and the false-positive rate (FPR) for non-causal tissues is well controlled at decent levels. We further illustrate the benefits of our methods in applications to summary-level GWASs data of 33 complex traits. Notably, apart from better identifying potential trait-associated genes, we can elucidate the tissue-specific role of candidate target genes. The follow-up pathway analysis from tissue-specific genes for asthma shows that the immune system plays an essential function for asthma development in both thyroid and lung tissues.


Subject(s)
Genome-Wide Association Study; Models, Statistical; Quantitative Trait Loci; Transcriptome; Asthma/genetics; Asthma/immunology; Genetic Predisposition to Disease; Humans; Lung/immunology; Multifactorial Inheritance/genetics; Organ Specificity; Thyroid Gland/immunology
12.
NAR Genom Bioinform ; 2(2): lqaa028, 2020 Jun.
Article in English | MEDLINE | ID: mdl-33575584

ABSTRACT

The proliferation of genome-wide association studies (GWAS) has prompted the use of two-sample Mendelian randomization (MR) with genetic variants as instrumental variables (IVs) for drawing reliable causal relationships between health risk factors and disease outcomes. However, the unique features of GWAS demand that MR methods account for both linkage disequilibrium (LD) and ubiquitously existing horizontal pleiotropy among complex traits, which is the phenomenon wherein a variant affects the outcome through mechanisms other than exclusively through the exposure. Therefore, statistical methods that fail to consider LD and horizontal pleiotropy can lead to biased estimates and false-positive causal relationships. To overcome these limitations, we proposed a probabilistic model for MR analysis in identifying the causal effects between risk factors and disease outcomes using GWAS summary statistics in the presence of LD and to properly account for horizontal pleiotropy among genetic variants (MR-LDP) and develop a computationally efficient algorithm to make the causal inference. We then conducted comprehensive simulation studies to demonstrate the advantages of MR-LDP over the existing methods. Moreover, we used two real exposure-outcome pairs to validate the results from MR-LDP compared with alternative methods, showing that our method is more efficient in using all-instrumental variants in LD. By further applying MR-LDP to lipid traits and body mass index (BMI) as risk factors for complex diseases, we identified multiple pairs of significant causal relationships, including a protective effect of high-density lipoprotein cholesterol on peripheral vascular disease and a positive causal effect of BMI on hemorrhoids.

13.
Stat Methods Med Res ; 29(1): 15-28, 2020 01.
Article in English | MEDLINE | ID: mdl-30600776

ABSTRACT

In survival analysis, when a subset of subjects has extremely long survival, the two-part cure rate model has been commonly adopted. In the two-part model, the first part is for a binary response and describes the probability of cure. The second part is for a survival response and describes the probability of survival. Despite their intuitive interconnections, most of the existing works estimate the two parts without any constraint. The existing works on proportionality promote similarity in magnitudes (i.e. quantitative similarity) and can be too restrictive. In this study, for the two-part cure rate model, we propose imposing a sign-based penalty to promote similarity in signs (i.e. qualitative similarity). The proposed strategy can be more informative than those that neglect the two-part interconnections and be less restrictive than the existing proportionality works. Penalty is also imposed to select relevant variables and accommodate high-dimensional data. Numerical studies, including simulation and two data analyses, demonstrate the advantageous performance of the proposed approach.
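
The two-part structure described above can be written as a mixture: overall survival is the cure probability plus the uncured fraction times the survival of uncured subjects. The sketch below is a minimal illustration of that structure only. The logistic form for the cure probability, the exponential proportional-hazards form for uncured subjects, and all names and numbers are assumptions for illustration; the sign-based penalty proposed in the abstract is not implemented here.

```python
import math


def cure_probability(x, alpha):
    """Part 1 (binary response): probability of cure, assumed logistic."""
    eta = sum(a * xi for a, xi in zip(alpha, x))
    return 1.0 / (1.0 + math.exp(-eta))


def uncured_survival(t, x, gamma, baseline_rate):
    """Part 2 (survival response): survival of uncured subjects.

    Assumes an exponential baseline hazard with proportional covariate effects.
    """
    lin = sum(g * xi for g, xi in zip(gamma, x))
    return math.exp(-baseline_rate * math.exp(lin) * t)


def population_survival(t, x, alpha, gamma, baseline_rate):
    """Mixture of cured subjects (who never fail) and uncured subjects."""
    p = cure_probability(x, alpha)
    return p + (1.0 - p) * uncured_survival(t, x, gamma, baseline_rate)


# Hypothetical single-covariate example: survival at time 10.
print(population_survival(10.0, [1.0], [0.0], [0.0], 0.1))
```

Because cured subjects never experience the event, the population survival curve levels off at the cure probability instead of decaying to zero, which is the long-survivor plateau that motivates the model.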


Subject(s)
Models, Statistical; Survival Analysis; Breast Neoplasms/mortality; Carcinogenesis/genetics; Carcinoma, Renal Cell/genetics; Carcinoma, Renal Cell/mortality; Computer Simulation; Female; Humans; Kidney Neoplasms/genetics; Kidney Neoplasms/mortality; Linear Models; Male; Probability
14.
Bioinformatics ; 36(7): 2009-2016, 2020 04 01.
Article in English | MEDLINE | ID: mdl-31755899

ABSTRACT

MOTIVATION: Although genome-wide association studies (GWAS) have deepened our understanding of the genetic architecture of complex traits, the mechanistic links that underlie how genetic variants cause complex traits remain elusive. To advance our understanding of the underlying mechanistic links, various consortia have collected a vast volume of genomic data that enable us to investigate the role that genetic variants play in gene expression regulation. Recently, a collaborative mixed model (CoMM) was proposed to jointly interrogate the genome on complex traits by integrating both the GWAS dataset and the expression quantitative trait loci (eQTL) dataset. Although CoMM is a powerful approach that leverages regulatory information while accounting for the uncertainty in using an eQTL dataset, it requires individual-level GWAS data and cannot fully make use of widely available GWAS summary statistics. Therefore, statistically efficient methods that leverage transcriptome information using only summary statistics from GWAS data are required. RESULTS: In this study, we propose a novel probabilistic model, CoMM-S2, to examine the mechanistic role that genetic variants play, by using only GWAS summary statistics instead of individual-level GWAS data. Similar to CoMM, which uses individual-level GWAS data, CoMM-S2 combines two models: the first model examines the relationship between gene expression and genotype, while the second model examines the relationship between the phenotype and the predicted gene expression from the first model. Distinct from CoMM, CoMM-S2 requires only GWAS summary statistics. Using both simulation studies and real data analysis, we demonstrate that even though CoMM-S2 utilizes GWAS summary statistics, its performance is comparable to that of CoMM, which uses individual-level GWAS data. AVAILABILITY AND IMPLEMENTATION: The implementation of CoMM-S2 is included in the CoMM package, which can be downloaded from https://github.com/gordonliu810822/CoMM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
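
The two linked models described in this abstract (expression on genotype, then phenotype on predicted expression) can be illustrated with a deliberately simplified two-stage sketch. This is not CoMM or CoMM-S2: those fit a joint probabilistic model and account for uncertainty in the eQTL step, whereas the code below uses plain least squares with a single variant and individual-level data. All names and numbers are hypothetical.

```python
def ols_slope_intercept(x, y):
    """Simple least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_stage_association(geno_eqtl, expr_eqtl, geno_gwas, pheno_gwas):
    """Naive two-stage analogue of the two linked models.

    Stage 1: learn the genotype -> expression weight in the eQTL sample.
    Stage 2: regress the phenotype on expression predicted in the GWAS sample.
    """
    weight, intercept = ols_slope_intercept(geno_eqtl, expr_eqtl)
    predicted_expr = [intercept + weight * g for g in geno_gwas]
    effect, _ = ols_slope_intercept(predicted_expr, pheno_gwas)
    return weight, effect


# Hypothetical allele counts (0/1/2) with noiseless toy relationships.
weight, effect = two_stage_association(
    [0, 1, 2, 0, 1, 2], [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    [0, 1, 2, 2, 1, 0], [2.0, 4.0, 6.0, 6.0, 4.0, 2.0],
)
print(weight, effect)
```

Plugging in point predictions like this ignores the estimation error from stage 1, which is the kind of uncertainty the abstract says the CoMM approach accounts for.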


Subject(s)
Genome-Wide Association Study; Transcriptome; Genotype; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci
15.
Sci Rep ; 9(1): 13430, 2019 09 17.
Article in English | MEDLINE | ID: mdl-31530853

ABSTRACT

In recent biomedical studies, omics profiling has been extensively conducted on various types of mental disorders. In most of the existing analyses, a single type of mental disorder and a single type of omics measurement are analyzed. In the study of other complex diseases, integrative analysis, both vertical and horizontal integration, has been conducted and shown to bring significantly new insights into disease etiology, progression, biomarkers, and treatment. In this article, we showcase the applicability of integrative analysis to mental disorders. In particular, the horizontal integration of bipolar disorder and schizophrenia and the vertical integration of gene expression and copy number variation data are conducted. The analysis is based on the sparse principal component analysis, penalization, and other advanced statistical techniques. In data analysis, integration leads to biologically sensible findings, including the disease-related gene expressions, copy number variations, and their associations, which differ from the "benchmark" analysis. Overall, this study suggests the potential of integrative analysis in mental disorder research.


Subject(s)
Computational Biology/methods; DNA Copy Number Variations; Gene Expression Regulation; Mental Disorders/genetics; Biomarkers; Computer Simulation; Humans; Software
16.
Bioinformatics ; 35(19): 3693-3700, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30851102

ABSTRACT

MOTIVATION: In genome-wide association studies (GWASs) where multiple correlated traits have been measured on participants, a joint analysis strategy, whereby the traits are analyzed jointly, can improve statistical power over a single-trait analysis strategy. There are two questions of interest to be addressed when conducting a joint GWAS analysis with multiple traits. The first question examines whether a genetic locus is significantly associated with any of the traits being tested. The second question focuses on identifying the specific trait(s) associated with the genetic locus. Since existing methods primarily focus on the first question, this article seeks to provide a complementary method that addresses the second question. RESULTS: We propose a novel method, Variational Inference for Multiple Correlated Outcomes (VIMCO), that focuses on identifying the specific trait that is associated with the genetic locus when performing a joint GWAS analysis of multiple traits, while accounting for correlation among the multiple traits. We performed extensive numerical studies and also applied VIMCO to analyze two datasets. The numerical studies and real data analysis demonstrate that VIMCO improves statistical power over single-trait analysis strategies when the multiple traits are correlated and has comparable performance when the traits are not correlated. AVAILABILITY AND IMPLEMENTATION: The VIMCO software can be downloaded from: https://github.com/XingjieShi/VIMCO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study; Software; Genetic Loci; Phenotype; Polymorphism, Single Nucleotide; Research Design
17.
Comput Stat Data Anal ; 124: 235-251, 2018 Aug.
Article in English | MEDLINE | ID: mdl-30319163

ABSTRACT

Penalization is a popular tool for multi- and high-dimensional data. Most of the existing computational algorithms have been developed for convex loss functions. Nonconvex loss functions can sometimes generate more robust results and have important applications. Motivated by the BLasso algorithm, this study develops the Forward and Backward Stagewise (Fabs) algorithm for nonconvex loss functions with the adaptive Lasso (aLasso) penalty. It is shown that each point along the Fabs paths is a δ-approximate solution to the aLasso problem and that the Fabs paths converge to the stationary points of the aLasso problem as δ goes to zero, given that the loss function has second-order derivatives bounded from above. This study exemplifies Fabs with an application to penalized smooth partial rank (SPR) estimation, for which an effective algorithm is still lacking. Extensive numerical studies are conducted to demonstrate the benefit of penalized SPR estimation using Fabs, especially under high-dimensional settings. An application to the smoothed 0-1 loss in binary classification is introduced to demonstrate its capability to work with other differentiable nonconvex loss functions.

18.
Genet Epidemiol ; 41(8): 779-789, 2017 12.
Article in English | MEDLINE | ID: mdl-28913902

ABSTRACT

Gene expression (GE) studies have been playing a critical role in cancer research. Despite tremendous effort, the analysis results are still often unsatisfactory because of weak signals and high data dimensionality. Analysis is often further challenged by the long-tailed distributions of the outcome variables. In recent multidimensional studies, data have been collected on GEs as well as their regulators (e.g., copy number alterations (CNAs), methylation, and microRNAs), which can provide additional information on the associations between GEs and cancer outcomes. In this study, we develop an ARMI (assisted robust marker identification) approach for analyzing cancer studies with measurements on GEs as well as regulators. The proposed approach borrows information from regulators and can be more effective than analyzing GE data alone. A robust objective function is adopted to accommodate long-tailed distributions. Marker identification is effectively realized using penalization. The proposed approach has an intuitive formulation and is computationally affordable. Simulation shows its satisfactory performance under a variety of settings. TCGA (The Cancer Genome Atlas) data on melanoma and lung cancer are analyzed, leading to biologically plausible marker identification and superior prediction.


Subject(s)
Biomarkers, Tumor/genetics; Models, Genetic; Neoplasms/genetics; Biomarkers, Tumor/metabolism; Gene Expression Regulation, Neoplastic; Genes, Neoplasm; Humans; Melanoma/genetics; Melanoma/metabolism; Melanoma/pathology; Neoplasms/metabolism; Neoplasms/pathology; Phenotype; Skin Neoplasms/genetics; Skin Neoplasms/metabolism; Skin Neoplasms/pathology
19.
Genet Epidemiol ; 40(5): 382-93, 2016 07.
Article in English | MEDLINE | ID: mdl-27247027

ABSTRACT

Genome-wide association studies (GWAS) have led to the identification of many genetic variants associated with complex diseases in the past 10 years. Penalization methods, with significant numerical and statistical advantages, have been extensively adopted in analyzing GWAS. This study has been partly motivated by the analysis of Genetic Analysis Workshop (GAW) 18 data, which have two notable characteristics. First, the subjects are from a small number of pedigrees and hence related. Second, for each subject, multiple correlated traits have been measured. Most of the existing penalization methods assume independence between subjects and traits and can be suboptimal. There are a few methods in the literature based on mixed modeling that can accommodate correlations. However, they cannot fully accommodate the two types of correlations while conducting effective marker selection. In this study, we develop a penalized multitrait mixed modeling approach. It accommodates the two different types of correlations and includes several existing methods as special cases. Effective penalization is adopted for marker selection. Simulation demonstrates its satisfactory performance. The GAW 18 data are analyzed using the proposed method.


Subject(s)
Genome-Wide Association Study; Models, Genetic; Pedigree; Quantitative Trait, Heritable; Area Under Curve; Computer Simulation; Humans; Phenotype; Polymorphism, Single Nucleotide/genetics; ROC Curve
20.
Genomics ; 107(6): 223-30, 2016 06.
Article in English | MEDLINE | ID: mdl-27141884

ABSTRACT

Multiple types of genetic, epigenetic, and genomic changes have been implicated in cutaneous melanoma prognosis. Many of the existing studies are limited to analyzing a single type of omics measurement and cannot comprehensively describe the biological processes underlying prognosis. As a result, the obtained prognostic models may be less satisfactory, and the identified prognostic markers may be less informative. The recently collected TCGA (The Cancer Genome Atlas) data are of high quality and include comprehensive omics measurements, making it possible to model prognosis more comprehensively and more accurately. In this study, we first describe the statistical approaches that can integrate multiple types of omics measurements with the assistance of variable selection and dimension reduction techniques. Data analysis suggests that, for cutaneous melanoma, integrating multiple types of measurements leads to prognostic models with improved prediction performance. Informative individual markers and pathways are identified, which can provide valuable insights into melanoma prognosis.


Subject(s)
Melanoma/genetics; Prognosis; Transcriptome/genetics; Biomarkers, Tumor/genetics; Genomics; Humans; Melanoma/diagnosis; Melanoma/pathology; Proteomics; Skin Neoplasms; Melanoma, Cutaneous Malignant