Results 1 - 20 of 287
1.
Nature ; 595(7866): 283-288, 2021 07.
Article in English | MEDLINE | ID: mdl-34010947

ABSTRACT

COVID-19 manifests with a wide spectrum of clinical phenotypes that are characterized by exaggerated and misdirected host immune responses [1-6]. Although pathological innate immune activation is well-documented in severe disease [1], the effect of autoantibodies on disease progression is less well-defined. Here we use a high-throughput autoantibody discovery technique known as rapid extracellular antigen profiling [7] to screen a cohort of 194 individuals infected with SARS-CoV-2, comprising 172 patients with COVID-19 and 22 healthcare workers with mild disease or asymptomatic infection, for autoantibodies against 2,770 extracellular and secreted proteins (members of the exoproteome). We found that patients with COVID-19 exhibit marked increases in autoantibody reactivities as compared to uninfected individuals, and show a high prevalence of autoantibodies against immunomodulatory proteins (including cytokines, chemokines, complement components and cell-surface proteins). We established that these autoantibodies perturb immune function and impair virological control by inhibiting immunoreceptor signalling and by altering peripheral immune cell composition, and found that mouse surrogates of these autoantibodies increase disease severity in a mouse model of SARS-CoV-2 infection. Our analysis of autoantibodies against tissue-associated antigens revealed associations with specific clinical characteristics. Our findings suggest a pathological role for exoproteome-directed autoantibodies in COVID-19, with diverse effects on immune functionality and associations with clinical outcomes.


Subject(s)
Autoantibodies/analysis , Autoantibodies/immunology , COVID-19/immunology , COVID-19/metabolism , Proteome/immunology , Proteome/metabolism , Animals , Surface Antigens/immunology , COVID-19/pathology , COVID-19/physiopathology , Case-Control Studies , Complement System Proteins/immunology , Cytokines/immunology , Animal Disease Models , Disease Progression , Female , Humans , Male , Mice , Organ Specificity/immunology
2.
PLoS Biol ; 20(5): e3001506, 2022 05.
Article in English | MEDLINE | ID: mdl-35609110

ABSTRACT

The impact of Coronavirus Disease 2019 (COVID-19) mRNA vaccination on pregnancy and fertility has become a major topic of public interest. We investigated 2 of the most widely propagated claims to determine (1) whether COVID-19 mRNA vaccination of mice during early pregnancy is associated with an increased incidence of birth defects or growth abnormalities; and (2) whether COVID-19 mRNA-vaccinated human volunteers exhibit elevated levels of antibodies to the human placental protein syncytin-1. Using a mouse model, we found that intramuscular COVID-19 mRNA vaccination during early pregnancy at gestational age E7.5 did not lead to differences in fetal size by crown-rump length or weight at term, nor did we observe any gross birth defects. In contrast, injection of the TLR3 agonist and double-stranded RNA mimic polyinosinic-polycytidylic acid, or poly(I:C), impacted growth in utero leading to reduced fetal size. No overt maternal illness following either vaccination or poly(I:C) exposure was observed. We also found that term fetuses from these murine pregnancies vaccinated prior to the formation of the definitive placenta exhibit high circulating levels of anti-spike and anti-receptor-binding domain (anti-RBD) antibodies to Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) consistent with maternal antibody status, indicating transplacental transfer in the later stages of pregnancy after early immunization. Finally, we did not detect increased levels of circulating anti-syncytin-1 antibodies in a cohort of COVID-19 vaccinated adults compared to unvaccinated adults by ELISA. Our findings contradict popular claims associating COVID-19 mRNA vaccination with infertility and adverse neonatal outcomes.


Subject(s)
COVID-19 , Animals , Antiviral Antibodies , COVID-19/prevention & control , Female , Fetus , env Gene Products , Humans , Mice , Placenta/metabolism , Pregnancy , Pregnancy Proteins , Messenger RNA/genetics , Messenger RNA/metabolism , SARS-CoV-2 , Vaccination
3.
Genet Epidemiol ; 47(3): 261-286, 2023 04.
Article in English | MEDLINE | ID: mdl-36807383

ABSTRACT

Gene-environment (G-E) interaction analysis plays an important role in studying complex diseases. Extensive methodological research has been conducted on G-E interaction analysis, and the existing methods are mostly based on regression techniques. In many fields including biomedicine and omics, it has been increasingly recognized that deep learning may outperform regression with its unique flexibility (e.g., in accommodating unspecified nonlinear effects) and superior prediction performance. However, there has been a lack of development in deep learning for G-E interaction analysis. In this article, we fill this important knowledge gap and develop a new analysis approach based on a deep neural network in conjunction with penalization. The proposed approach can simultaneously conduct model estimation and selection (of important main G effects and G-E interactions), while uniquely respecting the "main effects, interactions" variable selection hierarchy. Simulation shows that it has superior prediction and feature selection performance. The analysis of data on lung adenocarcinoma and skin cutaneous melanoma overall survival further establishes its practical utility. Overall, this study can advance G-E interaction analysis by delivering a powerful new analysis approach based on modern deep learning.


Subject(s)
Deep Learning , Melanoma , Skin Neoplasms , Humans , Gene-Environment Interaction , Genetic Models , Cutaneous Malignant Melanoma
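
The "main effects, interactions" selection hierarchy named in the abstract of record 3 can be illustrated with a minimal Python sketch. It assumes a strong hierarchy on the G side (a G-E interaction is kept only when the main effect of the same gene is also kept) and applies it as a simple post-hoc filter. The function name, variable names and data layout are hypothetical. The abstract says only that the hierarchy is respected by a penalized deep neural network, and that mechanism is not reproduced here.

```python
def enforce_hierarchy(main_g, interactions):
    """Keep a (gene, env) interaction only if the gene's main effect is selected.

    main_g: set of selected gene names (hypothetical layout).
    interactions: set of (gene, env) pairs flagged as candidate interactions.
    """
    return {(g, e) for (g, e) in interactions if g in main_g}


# Toy example: G2 has no selected main effect, so its interaction is dropped.
selected_main = {"G1", "G3"}
candidate_interactions = {("G1", "smoking"), ("G2", "smoking"), ("G3", "age")}
kept = enforce_hierarchy(selected_main, candidate_interactions)
```

In this toy input only the pairs involving G1 and G3 survive, which is the defining property of the hierarchy: no interaction is reported without its corresponding main effect.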
4.
Biostatistics ; 24(2): 425-442, 2023 04 14.
Article in English | MEDLINE | ID: mdl-37057611

ABSTRACT

Cancer is a heterogeneous disease. Finite mixture of regression (FMR), an important heterogeneity analysis technique when an outcome variable is present, has been extensively employed in cancer research, revealing important differences in the associations between a cancer outcome/phenotype and covariates. Cancer FMR analysis has been based on clinical, demographic, and omics variables. A relatively recent and alternative source of data comes from histopathological images. Histopathological images have long been used for cancer diagnosis and staging. Recently, it has been shown that high-dimensional histopathological image features, which are extracted using automated digital image processing pipelines, are effective for modeling cancer outcomes/phenotypes. Histopathological imaging-environment interaction analysis has been further developed to expand the scope of cancer modeling and histopathological imaging-based analysis. Motivated by the significance of cancer FMR analysis and a still strong demand for more effective methods, in this article, we take the natural next step and conduct cancer FMR analysis based on models that incorporate low-dimensional clinical/demographic/environmental variables, high-dimensional imaging features, as well as their interactions. Complementary to many of the existing studies, we develop a Bayesian approach for accommodating high dimensionality, screening out noise, identifying signals, and respecting the "main effects, interactions" variable selection hierarchy. An effective computational algorithm is developed, and simulation shows advantageous performance of the proposed approach. The analysis of The Cancer Genome Atlas data on lung squamous cell cancer leads to interesting findings different from the alternative approaches.


Subject(s)
Gene-Environment Interaction , Neoplasms , Humans , Bayes Theorem , Neoplasms/diagnostic imaging , Computer Simulation , Regression Analysis
5.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35039832

ABSTRACT

Cancer is an omics disease. The development in high-throughput profiling has fundamentally changed cancer research and clinical practice. Compared with clinical, demographic and environmental data, the analysis of omics data, which have higher dimensionality, weaker signals and more complex distributional properties, is much more challenging. Developments in the literature are often 'scattered', with individual studies focused on one or a few closely related methods. The goal of this review is to assist cancer researchers with limited statistical expertise in establishing the 'overall framework' of cancer omics data analysis. To facilitate understanding, we mainly focus on intuition, concepts and key steps, and refer readers to the original publications for mathematical details. This review broadly covers unsupervised and supervised analysis, as well as individual-gene-based, gene-set-based and gene-network-based analysis. We also briefly discuss 'special topics' including interaction analysis, multi-datasets analysis and multi-omics analysis.


Subject(s)
Genomics , Neoplasms , Data Analysis , Genomics/methods , Humans , Neoplasms/genetics
6.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35876281

ABSTRACT

In biomedical research, the replicability of findings across studies is highly desired. In this study, we focus on cancer omics data, for which the examination of replicability has been mostly focused on important omics variables identified in different studies. In published literature, although there has been extensive attention and ad hoc discussion, there is insufficient quantitative research looking into replicability measures and their properties. The goal of this study is to fill this important knowledge gap. In particular, we consider three sensible replicability measures, for which we examine distributional properties and develop a way of making inference. Applying them to three The Cancer Genome Atlas (TCGA) datasets reveals in general low replicability and significant across-data variations. To further comprehend such findings, we resort to simulation, which confirms the validity of the findings with the TCGA data and further informs the dependence of replicability on signal level (or equivalently sample size). Overall, this study can advance our understanding of replicability for cancer omics and other studies that have identification as a key goal.


Subject(s)
Biomedical Research , Neoplasms , Humans , Neoplasms/genetics , Sample Size
7.
Bioinformatics ; 39(12)2023 12 01.
Article in English | MEDLINE | ID: mdl-38060266

ABSTRACT

SUMMARY: Densely measured SNP data are routinely analyzed but pose challenges due to their high dimensionality, especially when gene-environment interactions are incorporated. In recent literature, a functional analysis strategy has been developed, which treats dense SNP measurements as a realization of a genetic function and can 'bypass' the dimensionality challenge. However, there is a lack of portable and friendly software, which hinders practical utilization of these functional methods. We fill this knowledge gap and develop the R package FunctanSNP. This comprehensive package encompasses estimation, identification, and visualization tools and has undergone extensive testing using both simulated and real data, confirming its reliability. FunctanSNP can serve as a convenient and reliable tool for analyzing SNP and other densely measured data. AVAILABILITY AND IMPLEMENTATION: The package is available at https://CRAN.R-project.org/package=FunctanSNP.


Subject(s)
Software , Reproducibility of Results
8.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37490475

ABSTRACT

MOTIVATION: Analyzing genetic data to identify markers and construct predictive models is of great interest in biomedical research. However, limited by cost and sample availability, genetic studies often suffer from the "small sample size, high dimensionality" problem. To tackle this problem, an integrative analysis that collectively analyzes multiple datasets with compatible designs is often conducted. For regularizing estimation and selecting relevant variables, penalization and other regularization techniques are routinely adopted. "Blindly" searching over a vast number of variables may not be efficient. RESULTS: We propose incorporating prior information to assist integrative analysis of multiple genetic datasets. To obtain accurate prior information, we adopt a convolutional neural network with an active learning strategy to label textual information from previous studies. Then the extracted prior information is incorporated using a group LASSO-based technique. We conducted a series of simulation studies that demonstrated the satisfactory performance of the proposed method. Finally, data on skin cutaneous melanoma are analyzed to establish practical utility. AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/ldz7/PAIA. The data that support the findings in this article are openly available in TCGA (The Cancer Genome Atlas) at https://portal.gdc.cancer.gov/.


Subject(s)
Melanoma , Skin Neoplasms , Humans , Melanoma/genetics , Computer Simulation , Genome , Cutaneous Malignant Melanoma
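
The abstract of record 8 refers to a group LASSO-based technique. As a hedged illustration of what group-wise shrinkage does, the sketch below implements the standard group soft-thresholding step (the proximal operator of the group lasso penalty) in plain Python. The function name is hypothetical, and the sketch does not show how the authors incorporate prior information. It only demonstrates that a group of coefficients is either shrunk together or set to zero together.

```python
import math


def group_soft_threshold(beta_group, lam):
    """Shrink one coefficient group toward zero by lam in Euclidean norm.

    Returns an all-zero group when the group's norm does not exceed lam,
    which is how group lasso removes whole groups of variables at once.
    """
    norm = math.sqrt(sum(b * b for b in beta_group))
    if norm <= lam:
        return [0.0] * len(beta_group)
    scale = 1.0 - lam / norm
    return [scale * b for b in beta_group]


# A group with norm 5 is halved when lam = 2.5 and removed when lam = 6.
shrunk = group_soft_threshold([3.0, 4.0], 2.5)
removed = group_soft_threshold([3.0, 4.0], 6.0)
```

One plausible way to use prior information with such a step, stated here purely as an assumption, is to apply a smaller `lam` to groups supported by earlier studies so that they are less likely to be removed.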
9.
Stat Med ; 43(11): 2280-2297, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38553996

ABSTRACT

Cancer heterogeneity analysis is essential for precision medicine. Most of the existing heterogeneity analyses only consider a single type of data and ignore the possible sparsity of important features. In cancer clinical practice, it has been suggested that two types of data, pathological imaging and omics data, are commonly collected and can produce hierarchical heterogeneous structures, in which the refined sub-subgroup structure determined by omics features can be nested in the rough subgroup structure determined by the imaging features. Moreover, sparsity pursuit has extraordinary significance and is more challenging for heterogeneity analysis, because the important features may not be the same in different subgroups, which is ignored by the existing heterogeneity analyses. Fortunately, rich information from previous literature (for example, those deposited in PubMed) can be used to assist feature selection in the present study. Advancing from the existing analyses, in this study, we propose a novel sparse hierarchical heterogeneity analysis framework, which can integrate two types of features and incorporate prior knowledge to improve feature selection. The proposed approach has satisfactory statistical properties and competitive numerical performance. A TCGA real data analysis demonstrates the practical value of our approach in analyzing data heterogeneity and sparsity.


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , Precision Medicine , Statistical Models , Computer Simulation , Genetic Heterogeneity
10.
Environ Health ; 23(1): 28, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38504322

ABSTRACT

BACKGROUND: The effects of organochlorine pesticide (OCP) exposure on the development of human papillary thyroid cancer (PTC) are not well understood. A nested case-control study was conducted with data from the U.S. Department of Defense Serum Repository (DoDSR) cohort between 2000 and 2013 to assess associations of individual OCPs serum concentrations with PTC risk. METHODS: This study included 742 histologically confirmed PTC cases (341 females, 401 males) and 742 individually-matched controls with pre-diagnostic serum samples selected from the DoDSR. Associations between categories of lipid-corrected serum concentrations of seven OCPs and PTC risk were evaluated for classical PTC and follicular PTC using conditional logistic regression, adjusted for body mass index category and military branch to compute odds ratios (OR) and 95% confidence intervals (CIs). Effect modification by sex, birth cohort, and race was examined. RESULTS: There was no evidence of associations between most of the OCPs and PTC, overall or stratified by histological subtype. Overall, there was no evidence of an association between hexachlorobenzene (HCB) and PTC, but stratified by histological subtype HCB was associated with significantly increased risk of classical PTC (third tertile above the limit of detection (LOD) vs.

Subject(s)
Hexachlorocyclohexane , Chlorinated Hydrocarbons , Military Personnel , Pesticides , Thyroid Neoplasms , Male , Humans , Female , Papillary Thyroid Cancer/epidemiology , Hexachlorobenzene , Case-Control Studies , Thyroid Neoplasms/chemically induced , Thyroid Neoplasms/epidemiology
11.
Article in English | MEDLINE | ID: mdl-38098875

ABSTRACT

With the development of data collection techniques, analysis with a survival response and high-dimensional covariates has become routine. Here we consider an interaction model, which includes a set of low-dimensional covariates, a set of high-dimensional covariates, and their interactions. This model has been motivated by gene-environment (G-E) interaction analysis, where the E variables have a low dimension, and the G variables have a high dimension. For such a model, there has been extensive research on estimation and variable selection. Comparatively, inference studies with a valid false discovery rate (FDR) control have been very limited. The existing high-dimensional inference tools cannot be directly applied to interaction models, as interactions and main effects are not "equal". In this article, for high-dimensional survival analysis with interactions, we model survival using the Accelerated Failure Time (AFT) model and adopt a "weighted least squares + debiased Lasso" approach for estimation and selection. A hierarchical FDR control approach is developed for inference and respect of the "main effects, interactions" hierarchy. The asymptotic distribution properties of the debiased Lasso estimators are rigorously established. Simulation demonstrates the satisfactory performance of the proposed approach, and the analysis of a breast cancer dataset further establishes its practical utility.
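
The abstract of record 11 mentions weighted least squares for the AFT model without saying which weights are used. A common choice in the censored-regression literature is Kaplan-Meier (Stute) weights, and the Python sketch below computes them under that assumption. The function name is hypothetical and nothing here is taken from the authors' implementation. Censored observations receive weight zero, and their mass is redistributed to later events.

```python
def kaplan_meier_weights(times, events):
    """Stute's Kaplan-Meier weights, returned in the original observation order.

    times: observed (possibly censored) times.
    events: 1 if the event was observed, 0 if the time is censored.
    """
    n = len(times)
    # Sort by observed time; for ties, events come before censored observations.
    order = sorted(range(n), key=lambda i: (times[i], -events[i]))
    weights = [0.0] * n
    carry = 1.0  # running product over earlier sorted observations
    for rank, idx in enumerate(order, start=1):
        weights[idx] = events[idx] / (n - rank + 1) * carry
        carry *= ((n - rank) / (n - rank + 1)) ** events[idx]
    return weights


# Without censoring every observation gets weight 1/n.
w_full = kaplan_meier_weights([1.0, 2.0, 3.0], [1, 1, 1])
# With the middle time censored, its mass moves to the later event.
w_cens = kaplan_meier_weights([3.0, 1.0, 2.0], [1, 1, 0])
```

Under this assumption, the AFT estimate would minimize the weighted sum of squared residuals of log survival time, with these weights multiplying each squared residual before any Lasso penalty or debiasing step is applied.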

12.
Entropy (Basel) ; 26(4)2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38667864

ABSTRACT

In classification tasks, label noise has a significant impact on model performance, primarily by disrupting prediction consistency and thereby reducing classification accuracy. This work introduces a novel prediction consistency regularization that mitigates the impact of label noise on neural networks by imposing constraints on the prediction consistency of similar samples. However, determining which samples should be similar is a primary challenge. We formalize similar sample identification as a clustering problem and employ twin contrastive clustering (TCC) to address this issue. To ensure similarity between samples within each cluster, we enhance TCC by adjusting the clustering prior distribution using label information. Based on the adjusted TCC's clustering results, we first construct the prototype for each cluster and then formulate a prototype-based regularization term to enhance prediction consistency for the prototype within each cluster and counteract the adverse effects of label noise. We conducted comprehensive experiments using benchmark datasets to evaluate the effectiveness of our method under various scenarios with different noise rates. The results explicitly demonstrate the enhancement in classification accuracy. Subsequent analytical experiments confirm that the proposed regularization term effectively mitigates noise and that the adjusted TCC enhances the quality of similar sample recognition.
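
The prototype-based regularization described in the abstract of record 12 can be sketched roughly as follows. The abstract does not give the exact form of the term, so this Python example assumes one plausible version: each cluster's prototype is the mean of the predicted probability vectors of its members, and the penalty is the mean squared distance between each prediction and its prototype. The function name and inputs are hypothetical, and the cluster labels are taken as given rather than produced by TCC.

```python
def prototype_consistency_penalty(probs, clusters):
    """Mean squared distance between each prediction and its cluster prototype.

    probs: list of predicted probability vectors, one per sample.
    clusters: list of cluster labels, one per sample (assumed precomputed).
    """
    groups = {}
    for p, c in zip(probs, clusters):
        groups.setdefault(c, []).append(p)
    # Prototype = coordinate-wise mean of the predictions in a cluster.
    prototypes = {
        c: [sum(col) / len(members) for col in zip(*members)]
        for c, members in groups.items()
    }
    total = 0.0
    for p, c in zip(probs, clusters):
        total += sum((pi - qi) ** 2 for pi, qi in zip(p, prototypes[c]))
    return total / len(probs)


# Two samples in one cluster that disagree completely give a penalty of 0.5.
penalty = prototype_consistency_penalty([[1.0, 0.0], [0.0, 1.0]], [0, 0])
```

The penalty is zero when all samples in a cluster receive identical predictions, so adding it to a classification loss pushes similar samples toward consistent outputs.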

13.
Genet Epidemiol ; 46(5-6): 317-340, 2022 07.
Article in English | MEDLINE | ID: mdl-35766061

ABSTRACT

Penalized variable selection for high-dimensional longitudinal data has received much attention as it can account for the correlation among repeated measurements while providing additional and essential information for improved identification and prediction performance. Despite the success, in longitudinal studies, the potential of penalization methods is far from fully understood for accommodating structured sparsity. In this article, we develop a sparse group penalization method to conduct the bi-level gene-environment (G × E) interaction study under the repeatedly measured phenotype. Within the quadratic inference function framework, the proposed method can achieve simultaneous identification of main and interaction effects on both the group and individual levels. Simulation studies have shown that the proposed method outperforms major competitors. In the case study of asthma data from the Childhood Asthma Management Program, we conduct G × E study by using high-dimensional single nucleotide polymorphism data as genetic factors and the longitudinal trait, forced expiratory volume in 1 s, as the phenotype. Our method leads to improved prediction and identification of main and interaction effects with important implications.


Subject(s)
Asthma , Gene-Environment Interaction , Asthma/genetics , Computer Simulation , Humans , Longitudinal Studies , Genetic Models
14.
Biostatistics ; 23(2): 574-590, 2022 04 13.
Article in English | MEDLINE | ID: mdl-33040145

ABSTRACT

In recent biomedical research, genome-wide association studies (GWAS) have demonstrated great success in investigating the genetic architecture of human diseases. For many complex diseases, multiple correlated traits have been collected. However, most of the existing GWAS are still limited because they analyze each trait separately without considering their correlations and suffer from a lack of sufficient information. Moreover, the high dimensionality of single nucleotide polymorphism (SNP) data still poses tremendous challenges to statistical methods, in both theoretical and practical aspects. In this article, we innovatively propose an integrative functional linear model for GWAS with multiple traits. This study is the first to approximate SNPs as functional objects in a joint model of multiple traits with penalization techniques. It effectively accommodates the high dimensionality of SNPs and correlations among multiple traits to facilitate information borrowing. Our extensive simulation studies demonstrate the satisfactory performance of the proposed method in the identification and estimation of disease-associated genetic variants, compared to four alternatives. The analysis of type 2 diabetes data leads to biologically meaningful findings with good prediction accuracy and selection stability.


Subject(s)
Type 2 Diabetes Mellitus , Genome-Wide Association Study , Type 2 Diabetes Mellitus/genetics , Genome-Wide Association Study/methods , Humans , Linear Models , Phenotype , Single Nucleotide Polymorphism
15.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32793970

ABSTRACT

Gene expression data have played an essential role in many biomedical studies. When the number of genes is large and sample size is limited, there is a 'lack of information' problem, leading to low-quality findings. To tackle this problem, both horizontal and vertical data integrations have been developed, where vertical integration methods collectively analyze data on gene expressions as well as their regulators (such as mutations, DNA methylation and miRNAs). In this article, we conduct a selective review of vertical data integration methods for gene expression data. The reviewed methods cover both marginal and joint analysis and supervised and unsupervised analysis. The main goal is to provide a sketch of the vertical data integration paradigm without digging into too many technical details. We also briefly discuss potential pitfalls, directions for future developments and application notes.


Subject(s)
Gene Expression , Cluster Analysis , Data Analysis , Humans , Unsupervised Machine Learning
16.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33313791

ABSTRACT

Structures of genetic regulatory networks are not fixed. Structural perturbations can cause changes to the reachability of systems' state spaces. As system structures are related to genotypes and state spaces are related to phenotypes, it is important to study the relationship between structures and state spaces. However, there is still no method that can quantitatively describe the reachability differences between two state spaces caused by structural perturbations. Therefore, Difference in Reachability between State Spaces (DReSS) is proposed. The DReSS index family can quantitatively describe differences in reachability and attractor sets between two state spaces and can help find the key structure in a system, which may influence the system's state space significantly. First, basic properties of DReSS including non-negativity, symmetry and subadditivity are proved. Then, typical examples are shown to explain the meaning of DReSS and the differences between DReSS and traditional graph distance. Finally, differences of DReSS distribution between real biological regulatory networks and random networks are compared. Results show that most structural perturbations in biological networks tend to affect reachability inside and between attractor basins rather than the attractor set itself when compared with random networks, which illustrates that most genotype differences tend to influence the proportion of different phenotypes and only a few can create new phenotypes. DReSS can provide researchers with a new insight to study the relation between genotypes and phenotypes.


Subject(s)
Algorithms , Gene Regulatory Networks , Genotype , Genetic Models
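
To make the idea of comparing reachability between two state spaces (record 16) concrete, here is a minimal Python sketch for small synchronous Boolean networks. It is not the DReSS index, whose definition is not given in the abstract. It simply counts, over all start states, how many states are reachable under one network but not the other. The update rules and function names are hypothetical toy choices.

```python
from itertools import product


def reachability(update, n_genes):
    """Map each state to the set of states reachable in one or more steps."""
    reach = {}
    for start in product((0, 1), repeat=n_genes):
        seen = set()
        state = update(start)
        while state not in seen:  # a deterministic trajectory must repeat
            seen.add(state)
            state = update(state)
        reach[start] = seen
    return reach


def reachability_difference(update_a, update_b, n_genes):
    """Count state pairs that are reachable in exactly one of the two networks."""
    reach_a = reachability(update_a, n_genes)
    reach_b = reachability(update_b, n_genes)
    return sum(len(reach_a[s] ^ reach_b[s]) for s in reach_a)


# Network A swaps the two genes; network B perturbs the rule for gene 2.
def update_a(s):
    return (s[1], s[0])


def update_b(s):
    return (s[1], s[0] & s[1])


difference = reachability_difference(update_a, update_b, 2)
```

This naive count is non-negative and symmetric in its two arguments, two of the properties the abstract lists for DReSS, but it should be read only as an illustration of the kind of quantity being compared.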
17.
Bioinformatics ; 38(11): 3139-3140, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35485739

ABSTRACT

SUMMARY: Gene-environment (G-E) interactions have important implications for many complex diseases. With higher dimensionality and weaker signals, G-E interaction analysis is more challenging than the analysis of main G (and E) effects. The accumulation of published literature makes it possible to borrow strength from prior information and improve analysis. In a recent study, a 'quasi-likelihood + penalization' approach was developed to effectively incorporate prior information. Here, we first extend it to linear, logistic and Poisson regressions. Such models are much more popular in practice. More importantly, we develop the R package GEInfo, which realizes this approach in a user-friendly manner. To facilitate direct comparison and routine data analysis, the package also includes functions for alternative methods and visualization. AVAILABILITY AND IMPLEMENTATION: The package is available at https://CRAN.R-project.org/package=GEInfo. SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.


Subject(s)
Gene-Environment Interaction , Software
18.
Bioinformatics ; 38(10): 2855-2862, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561185

ABSTRACT

MOTIVATION: Cancer genetic heterogeneity analysis has critical implications for tumour classification, response to therapy and choice of biomarkers to guide personalized cancer medicine. However, existing heterogeneity analysis based solely on molecular profiling data usually suffers from a lack of information and has limited effectiveness. Many biomedical and life sciences databases have accumulated a substantial volume of meaningful biological information. They can provide additional information beyond molecular profiling data, yet pose challenges arising from potential noise and uncertainty. RESULTS: In this study, we aim to develop a more effective heterogeneity analysis method with the help of prior information. A network-based penalization technique is proposed to innovatively incorporate a multi-view of prior information from multiple databases, which accommodates heterogeneity attributed to both differential genes and gene relationships. To account for the fact that the prior information might not be fully credible, we propose a weighted strategy, where the weight is determined dependent on the data and can ensure that the present model is not excessively disturbed by incorrect information. Simulation and analysis of The Cancer Genome Atlas glioblastoma multiforme data demonstrate the practical applicability of the proposed method. AVAILABILITY AND IMPLEMENTATION: R code implementing the proposed method is available at https://github.com/mengyunwu2020/PECM. The data that support the findings in this paper are openly available in TCGA (The Cancer Genome Atlas) at https://portal.gdc.cancer.gov/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Glioblastoma , Software , Computer Simulation , Genome , Glioblastoma/genetics , Humans , Precision Medicine
19.
Bioinformatics ; 38(11): 3134-3135, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35441661

ABSTRACT

SUMMARY: In the analysis of high-dimensional omics data, dimension reduction techniques, including principal component analysis (PCA), partial least squares (PLS) and canonical correlation analysis (CCA), have been extensively used. When there are multiple datasets generated by independent studies with compatible designs, integrative analysis has been developed and shown to outperform meta-analysis, other multidatasets analysis, and individual-data analysis. To facilitate integrative dimension reduction analysis in daily practice, we develop the R package iSFun, which can comprehensively conduct integrative sparse PCA, PLS and CCA, as well as meta-analysis and stacked analysis. The package can conduct analysis under the homogeneity and heterogeneity models and with the magnitude- and sign-based contrasted penalties. As a 'byproduct', this article is the first to develop integrative analysis built on the CCA technique, further expanding the scope of integrative analysis. AVAILABILITY AND IMPLEMENTATION: The package is available at https://CRAN.R-project.org/package=iSFun. SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.


Subject(s)
Software , Least-Squares Analysis , Principal Component Analysis
20.
Biometrics ; 79(3): 1761-1774, 2023 09.
Article in English | MEDLINE | ID: mdl-36524727

ABSTRACT

Genetic interactions play an important role in the progression of complex diseases, providing explanation of variations in disease phenotype missed by main genetic effects. Comparatively, there are fewer studies on survival time, given its challenging characteristics such as censoring. In recent biomedical research, two-level analysis of both genes and their involved pathways has received much attention and been demonstrated as more effective than single-level analysis. However, such analysis is usually limited to main effects. Pathways are not isolated, and their interactions have also been suggested to have important contributions to the prognosis of complex diseases. In this paper, we develop a novel two-level Bayesian interaction analysis approach for survival data. This approach is the first to conduct the analysis of lower-level gene-gene interactions and higher-level pathway-pathway interactions simultaneously. Significantly advancing from the existing Bayesian studies based on the Markov Chain Monte Carlo (MCMC) technique, we propose a variational inference framework based on the accelerated failure time model with effective priors to accommodate two-level selection as well as censoring. Its computational efficiency is highly desirable for high-dimensional interaction analysis. We examine the performance of the proposed approach using extensive simulation. The application to TCGA melanoma and lung adenocarcinoma data leads to biologically sensible findings with satisfactory prediction accuracy and selection stability.


Subject(s)
Bayes Theorem , Computer Simulation , Phenotype , Markov Chains , Monte Carlo Method