Results 1 - 20 of 61
1.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38888456

ABSTRACT

MOTIVATION: The advent of multimodal omics data has provided an unprecedented opportunity to systematically investigate underlying biological mechanisms from distinct yet complementary angles. However, the joint analysis of multi-omics data remains challenging because it requires modeling interactions between multiple sets of high-throughput variables. Furthermore, these interaction patterns may vary across different clinical groups, reflecting disease-related biological processes. RESULTS: We propose a novel approach called Differential Canonical Correlation Analysis (dCCA) to capture differential covariation patterns between two multivariate vectors across clinical groups. Unlike classical Canonical Correlation Analysis, which maximizes the correlation between two multivariate vectors, dCCA aims to maximally recover differentially expressed multivariate-to-multivariate covariation patterns between groups. We have developed computational algorithms and a toolkit to sparsely select paired subsets of variables from two sets of multivariate variables while maximizing the differential covariation. Extensive simulation analyses demonstrate the superior performance of dCCA in selecting variables of interest and recovering differential correlations. We applied dCCA to the Pan-Kidney cohort from the Cancer Genome Atlas Program database and identified differentially expressed covariations between noncoding RNAs and gene expressions. AVAILABILITY AND IMPLEMENTATION: The R package that implements dCCA is available at https://github.com/hwiyoungstat/dCCA.


Subject(s)
Algorithms , Humans , Computational Biology/methods , Genomics/methods , Gene Expression Profiling/methods , Multivariate Analysis
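
The abstract above contrasts dCCA with classical Canonical Correlation Analysis, which maximizes the correlation between linear combinations of two variable sets. As a point of reference, here is a minimal sketch of classical CCA only, not of dCCA or of the authors' R package. The function name `classical_cca`, the small ridge term `reg`, and the simulated data are assumptions made for illustration.

```python
import numpy as np

def classical_cca(X, Y, reg=1e-8):
    """Classical CCA for X (n x p) and Y (n x q).

    Returns canonical correlations (descending) and weight matrices A, B
    whose columns give the canonical directions for X and Y.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariance blocks; `reg` is a tiny ridge for numerical stability.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten both blocks with Cholesky factors: M = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    # Singular values of M are the canonical correlations.
    U, rho, Vt = np.linalg.svd(M, full_matrices=False)
    A = np.linalg.solve(Lx.T, U)      # weights for X
    B = np.linalg.solve(Ly.T, Vt.T)   # weights for Y
    return rho, A, B

# Simulated example: one shared latent signal z links X and Y.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
X = np.hstack([z + 0.5 * rng.normal(size=(200, 1)), rng.normal(size=(200, 2))])
Y = np.hstack([z + 0.5 * rng.normal(size=(200, 1)), rng.normal(size=(200, 1))])
rho, A, B = classical_cca(X, Y)
first_pair_corr = np.corrcoef(X @ A[:, 0], Y @ B[:, 0])[0, 1]
```

The first canonical correlation `rho[0]` should agree with the sample correlation of the first pair of canonical variates, `first_pair_corr`. A differential analysis across clinical groups, as described in the abstract, would require the authors' method rather than this sketch.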
2.
Proc Natl Acad Sci U S A ; 120(6): e2202584120, 2023 02 07.
Article in English | MEDLINE | ID: mdl-36730203

ABSTRACT

Model organisms are instrumental substitutes for human studies to expedite basic, translational, and clinical research. Despite their indispensable role in mechanistic investigation and drug development, molecular congruence of animal models to humans has long been questioned and debated. Little effort has been made for an objective quantification and mechanistic exploration of a model organism's resemblance to humans in terms of molecular response under disease or drug treatment. We hereby propose a framework, namely Congruence Analysis for Model Organisms (CAMO), for transcriptomic response analysis by developing threshold-free differential expression analysis, quantitative concordance/discordance scores incorporating data variabilities, pathway-centric downstream investigation, knowledge retrieval by text mining, and topological gene module detection for hypothesis generation. Instead of a genome-wide vague and dichotomous answer of "poorly" or "greatly" mimicking humans, CAMO assists researchers to numerically quantify congruence, to dissect true cross-species differences from unwanted biological or cohort variabilities, and to visually identify molecular mechanisms and pathway subnetworks that are best or least mimicked by model organisms, which altogether provides foundations for hypothesis generation and subsequent translational decisions.


Subject(s)
Gene Expression Profiling , Transcriptome , Animals , Humans , Genome , Proteomics , Models, Animal
3.
PLoS Comput Biol ; 20(1): e1011754, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38198519

ABSTRACT

Cancer models are instrumental as a substitute for human studies and to expedite basic, translational, and clinical cancer research. For a given cancer type, a wide selection of models, such as cell lines, patient-derived xenografts, organoids and genetically modified murine models, are often available to researchers. However, how to quantify their congruence to human tumors and to select the most appropriate cancer model is a largely unsolved issue. Here, we present Congruence Analysis and Selection of CAncer Models (CASCAM), a statistical and machine learning framework for authenticating and selecting the most representative cancer models in a pathway-specific manner using transcriptomic data. CASCAM provides harmonization between human tumor and cancer model omics data, systematic congruence quantification, and pathway-based topological visualization to determine the most appropriate cancer model selection. The systems approach is presented using invasive lobular breast carcinoma (ILC) subtype and suggesting CAMA1 followed by UACC3133 as the most representative cell lines for ILC research. Two additional case studies for triple negative breast cancer (TNBC) and patient-derived xenograft/organoid (PDX/PDO) are further investigated. CASCAM is generalizable to any cancer subtype and will authenticate cancer models for faithful non-human preclinical research towards precision medicine.


Subject(s)
Precision Medicine , Triple Negative Breast Neoplasms , Humans , Animals , Mice , Xenograft Model Antitumor Assays , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/pathology , Gene Expression Profiling , Systems Analysis
4.
Stat Med ; 43(6): 1256-1270, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38258898

ABSTRACT

Allocating patients to treatment arms during a trial based on the observed responses accumulated up to the decision point, and sequential adaptation of this allocation, could minimize the expected number of failures or maximize total benefits to patients. In this study, we developed a Bayesian response-adaptive randomization (RAR) design targeting the endpoint of organ support-free days (OSFD) for patients admitted to intensive care units. The OSFD is a mixture of mortality and morbidity assessed by the number of days free of organ support within a predetermined post-randomization time window. In the past, researchers treated OSFD as an ordinal outcome variable where the lowest category is death. We propose a novel RAR design for a composite endpoint of mortality and morbidity, for example, OSFD, by using a Bayesian mixture model with Markov chain Monte Carlo sampling to estimate the posterior probability distribution of OSFD and determine treatment allocation ratios at each interim. Simulations were conducted to compare the performance of our proposed design under various randomization rules and different alpha spending functions. The results show that our RAR design using Bayesian inference allocated more patients to the better performing arm(s) compared to other existing adaptive rules while assuring adequate power and type I error rate control across a range of plausible clinical scenarios.


Subject(s)
Research Design , Humans , Random Allocation , Bayes Theorem , Probability , Morbidity
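
The general idea of Bayesian response-adaptive randomization in the abstract above can be illustrated with a deliberately simplified sketch. It assumes a binary outcome per patient with a Beta(1, 1) prior on each arm, and sets allocation probabilities to the posterior probability that each arm is best, estimated by direct Monte Carlo draws. The authors' design instead models OSFD with a Bayesian mixture model fitted by MCMC, so this is not their method; the function name `allocation_probs` and the interim counts are hypothetical.

```python
import numpy as np

def allocation_probs(successes, failures, n_draws=10_000, seed=0):
    """Posterior probability that each arm has the highest success rate.

    Assumes a binary endpoint and independent Beta(1, 1) priors, so each
    arm's posterior is Beta(1 + successes, 1 + failures).
    """
    rng = np.random.default_rng(seed)
    s = np.asarray(successes, dtype=float)
    f = np.asarray(failures, dtype=float)
    # One row per Monte Carlo draw, one column per arm.
    draws = rng.beta(1.0 + s, 1.0 + f, size=(n_draws, len(s)))
    best = np.argmax(draws, axis=1)
    return np.bincount(best, minlength=len(s)) / n_draws

# Hypothetical interim data: arm 0 has 30/40 successes, arm 1 has 18/40.
probs = allocation_probs(successes=[30, 18], failures=[10, 22])
```

At an interim analysis, `probs` could serve as the randomization ratio for the next cohort. Practical designs usually temper or cap these ratios, and the abstract notes that power and type I error control were checked by simulation.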
5.
Stat Med ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38922949

ABSTRACT

The joint analysis of imaging-genetics data facilitates the systematic investigation of genetic effects on brain structures and functions with spatial specificity. We focus on voxel-wise genome-wide association analysis, which may involve trillions of single nucleotide polymorphism (SNP)-voxel pairs. We attempt to identify underlying organized association patterns of SNP-voxel pairs and understand the polygenic and pleiotropic networks on brain imaging traits. We propose a bi-clique graph structure (i.e., a set of SNPs highly correlated with a cluster of voxels) for the systematic association pattern. Next, we develop computational strategies to detect latent SNP-voxel bi-cliques and an inference model for statistical testing. We further provide theoretical results to guarantee the accuracy of our computational algorithms and statistical inference. We validate our method by extensive simulation studies, and then apply it to the whole-genome genetic and voxel-level white matter integrity data collected from 1052 participants of the Human Connectome Project. The results demonstrate multiple genetic loci influencing white matter integrity measures on the splenium and genu of the corpus callosum.

6.
Mol Cell Neurosci ; 127: 103895, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37634742

ABSTRACT

In the last two decades of genome-wide association studies (GWAS), nicotine-dependence-related genetic loci (e.g., nicotinic acetylcholine receptor - nAChR subunit genes) are among the most replicable genetic findings. Although GWAS results have reported tens of thousands of SNPs within these loci, further analysis (e.g., fine-mapping) is required to identify the causal variants. However, it is computationally challenging for existing fine-mapping methods to reliably identify causal variants from thousands of candidate SNPs based on the posterior inclusion probability. To address this challenge, we propose a new method to select SNPs by jointly modeling the SNP-wise inference results and the underlying structured network patterns of the linkage disequilibrium (LD) matrix. We use an adaptive dense subgraph extraction method to recognize the latent network patterns of the LD matrix and then apply group LASSO to select causal variant candidates. We applied this new method to the UK Biobank data to identify the causal variant candidates for nicotine addiction. Eighty-one nicotine addiction-related SNPs (i.e., -log(p) > 50) of nAChR were selected, which are highly correlated (average r^2 > 0.8) although they are physically distant (e.g., >200 kilobases away) and from various genes. These findings revealed that distant SNPs from different genes can show higher LD r^2 than their neighboring SNPs, and jointly contribute to a complex trait like nicotine addiction.


Subject(s)
Genome-Wide Association Study , Tobacco Use Disorder , Humans , Genome-Wide Association Study/methods , Nicotine , Tobacco Use Disorder/genetics , Chromosome Mapping , Linkage Disequilibrium , Polymorphism, Single Nucleotide
7.
Biostatistics ; 24(1): 68-84, 2022 12 12.
Article in English | MEDLINE | ID: mdl-34363675

ABSTRACT

Clustering with variable selection is a challenging yet critical task for modern small-n-large-p data. Existing methods based on sparse Gaussian mixture models or sparse $K$-means provide solutions to continuous data. With the prevalence of RNA-seq technology and lack of count data modeling for clustering, the current practice is to normalize count expression data into continuous measures and apply existing models with a Gaussian assumption. In this article, we develop a negative binomial mixture model with lasso or fused lasso gene regularization to cluster samples (small $n$) with high-dimensional gene features (large $p$). A modified EM algorithm and Bayesian information criterion are used for inference and determining tuning parameters. The method is compared with existing methods using extensive simulations and two real transcriptomic applications in rat brain and breast cancer studies. The result shows the superior performance of the proposed count data model in clustering accuracy, feature selection, and biological interpretation in pathways.


Subject(s)
Models, Statistical , Humans , RNA-Seq , Bayes Theorem , Cluster Analysis , Normal Distribution
8.
J Neurosci Res ; 101(9): 1471-1483, 2023 09.
Article in English | MEDLINE | ID: mdl-37330925

ABSTRACT

Elevated arterial blood pressure (BP) is a common risk factor for cerebrovascular and cardiovascular diseases, but no causal relationship has been established between BP and cerebral white matter (WM) integrity. In this study, we performed a two-sample Mendelian randomization (MR) analysis with individual-level data by defining two nonoverlapping sets of European ancestry individuals (genetics-exposure set: N = 203,111; mean age = 56.71 years, genetics-outcome set: N = 16,156; mean age = 54.61 years) from UK Biobank to evaluate the causal effects of BP on regional WM integrity, measured by fractional anisotropy (FA) of diffusion tensor imaging. Two BP traits, systolic and diastolic blood pressure, were used as exposures. Genetic variants were carefully selected as instrumental variables (IVs) under the MR analysis assumptions. We used existing large-scale genome-wide association study summary data for validation. The main method used was a generalized version of the inverse-variance weighted method, while other MR methods were also applied for consistent findings. Two additional MR analyses were performed to exclude the possibility of reverse causality. We found significantly negative causal effects (FDR-adjusted p < .05; every 10 mmHg increase in BP leads to a decrease in FA value by 0.4% to 2%) of BP traits on a union set of 17 WM tracts, including brain regions related to cognitive function and memory. Our study extended the previous findings of association to causation for regional WM integrity, providing insights into the pathological processes of elevated BP that might chronically alter the brain microstructure in different regions.


Subject(s)
White Matter , Humans , Middle Aged , Blood Pressure/genetics , White Matter/diagnostic imaging , Diffusion Tensor Imaging/methods , Mendelian Randomization Analysis , Genome-Wide Association Study , Polymorphism, Single Nucleotide
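
The abstract above names an inverse-variance weighted (IVW) method as its main Mendelian randomization estimator. The sketch below shows only the basic fixed-effect IVW estimator from per-variant summary statistics, not the generalized version the authors used. The function name `ivw_estimate` and the three example variants are made up for illustration.

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Basic fixed-effect IVW estimate of the causal effect.

    Each variant j gives a ratio estimate beta_outcome[j] / beta_exposure[j],
    weighted by beta_exposure[j]**2 / se_outcome[j]**2.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    se = np.asarray(se_outcome, dtype=float)
    weights = bx**2 / se**2
    ratios = by / bx
    estimate = np.sum(weights * ratios) / np.sum(weights)
    std_error = np.sqrt(1.0 / np.sum(weights))
    return estimate, std_error

# Hypothetical variants whose outcome effects are exactly half the exposure effects.
est, se_est = ivw_estimate(
    beta_exposure=[0.10, 0.20, 0.30],
    beta_outcome=[0.05, 0.10, 0.15],
    se_outcome=[0.01, 0.02, 0.01],
)
```

Because every ratio in this toy input equals 0.5, the weighted average `est` is 0.5 regardless of the weights. With real data the ratios differ, and variants with stronger exposure effects and more precise outcome effects dominate.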
9.
Bioinformatics ; 38(9): 2481-2487, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35218338

ABSTRACT

MOTIVATION: The collection of temporal or perturbed data is often a prerequisite for reconstructing dynamic networks in most cases. However, these types of data are seldom available for genomic studies in medicine, thus significantly limiting the use of dynamic networks to characterize the biological principles underlying human health and diseases. RESULTS: We proposed a statistical framework to recover disease risk-associated pseudo-dynamic networks (DRDNet) from steady-state data. We incorporated a varying coefficient model with multiple ordinary differential equations to learn a series of networks. We analyzed the publicly available Genotype-Tissue Expression data to construct networks associated with hypertension risk, and biological findings showed that key genes constituting these networks had pivotal and biologically relevant roles associated with the vascular system. We also provided the selection consistency of the proposed learning procedure and evaluated its utility through extensive simulations. AVAILABILITY AND IMPLEMENTATION: DRDNet is implemented in the R language, and the source codes are available at https://github.com/chencxxy28/DRDnet/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics , Software , Humans , Genome
10.
Bioinformatics ; 38(17): 4078-4087, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35856716

ABSTRACT

MOTIVATION: The advancement of high-throughput technology characterizes a wide variety of epigenetic modifications and noncoding RNAs across the genome involved in disease pathogenesis via regulating gene expression. The high dimensionality of both epigenetic/noncoding RNA and gene expression data makes it challenging to identify the important regulators of genes. Conducting a univariate test for each possible regulator-gene pair is subject to a serious multiple comparison burden, and direct application of regularization methods to select regulator-gene pairs is computationally infeasible. Applying fast screening to reduce dimension first before regularization is more efficient and stable than applying regularization methods alone. RESULTS: We propose a novel screening method based on robust partial correlation to detect epigenetic and noncoding RNA regulators of gene expression over the whole genome, a problem that includes both high-dimensional predictors and high-dimensional responses. Compared to existing screening methods, our method is conceptually innovative in that it reduces the dimension of both predictor and response, and screens at both node (regulators or genes) and edge (regulator-gene pairs) levels. We develop data-driven procedures to determine the conditional sets and the optimal screening threshold, and implement a fast iterative algorithm. Simulations and applications to long noncoding RNA and microRNA regulation in kidney cancer and DNA methylation regulation in glioblastoma multiforme illustrate the validity and advantage of our method. AVAILABILITY AND IMPLEMENTATION: The R package, related source codes and real datasets used in this article are provided at https://github.com/kehongjie/rPCor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome , RNA, Long Noncoding , Software , Epigenesis, Genetic , Gene Expression
11.
Clin Infect Dis ; 75(1): e241-e248, 2022 08 24.
Article in English | MEDLINE | ID: mdl-34519774

ABSTRACT

BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemiology implicates airborne transmission; aerosol infectiousness and impacts of masks and variants on aerosol shedding are not well understood. METHODS: We recruited coronavirus disease 2019 (COVID-19) cases to give blood, saliva, mid-turbinate and fomite (phone) swabs, and 30-minute breath samples while vocalizing into a Gesundheit-II, with and without masks at up to 2 visits 2 days apart. We quantified and sequenced viral RNA, cultured virus, and assayed serum samples for anti-spike and anti-receptor binding domain antibodies. RESULTS: We enrolled 49 seronegative cases (mean days post onset 3.8 ± 2.1), May 2020 through April 2021. We detected SARS-CoV-2 RNA in 36% of fine (≤5 µm), 26% of coarse (>5 µm) aerosols, and 52% of fomite samples overall and in all samples from 4 alpha variant cases. Masks reduced viral RNA by 48% (95% confidence interval [CI], 3 to 72%) in fine and by 77% (95% CI, 51 to 89%) in coarse aerosols; cloth and surgical masks were not significantly different. The alpha variant was associated with a 43-fold (95% CI, 6.6- to 280-fold) increase in fine aerosol viral RNA, compared with earlier viruses, that remained a significant 18-fold (95% CI, 3.4- to 92-fold) increase adjusting for viral RNA in saliva, swabs, and other potential confounders. Two fine aerosol samples, collected while participants wore masks, were culture-positive. CONCLUSIONS: SARS-CoV-2 is evolving toward more efficient aerosol generation and loose-fitting masks provide significant but only modest source control. Therefore, until vaccination rates are very high, continued layered controls and tight-fitting masks and respirators will be necessary.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/prevention & control , Humans , Masks , RNA, Viral , Respiratory Aerosols and Droplets
12.
Hum Brain Mapp ; 43(16): 4970-4983, 2022 11.
Article in English | MEDLINE | ID: mdl-36040723

ABSTRACT

Severe mental illnesses (SMI), including major depressive (MDD), bipolar (BD), and schizophrenia spectrum (SSD) disorders, have multifactorial risk factors, and capturing their complex etiopathophysiology in an individual remains challenging. The regional vulnerability index (RVI) was used to measure an individual's brain-wide similarity to the expected SMI patterns derived from meta-analytical studies. It is analogous to polygenic risk scores (PRS) that measure an individual's similarity to genome-wide patterns in SMI. We hypothesized that RVI is an intermediary phenotype between genome and symptoms and is sensitive to both genetic and environmental risks for SMI. A UK Biobank (UKBB) sample of N = 17,053/19,265 M/F (age = 64.8 ± 7.4 years) and an independent sample of SSD patients and controls (N = 115/111 M/F, age = 35.2 ± 13.4) were used to test this hypothesis. UKBB participants with MDD had significantly higher RVI-MDD (Cohen's d = 0.20, p = 1 × 10^-23) and PRS-MDD (d = 0.17, p = 1 × 10^-15) than nonpsychiatric controls. UKBB participants with BD and SSD showed significant elevation in the respective RVIs (d = 0.65 and 0.60; p = 3 × 10^-5 and .009, respectively) and PRS (d = 0.57 and 1.34; p = .002 and .002, respectively). Elevated RVI-SSD was replicated in an independent sample (d = 0.53, p = 5 × 10^-5). RVI-MDD and RVI-SSD but not RVI-BD were associated with childhood adversity (p < .01). In nonpsychiatric controls, elevation in RVI and PRS was associated with lower cognitive performance (p < 10^-5) in six out of seven domains and showed specificity with disorder-associated deficits. In summary, the RVI is a novel brain index for SMI and shows similar or better specificity for SMI than PRS, and together they may complement each other in the efforts to characterize the genomic to brain level risks for SMI.


Subject(s)
Depressive Disorder, Major , Mental Disorders , Humans , Multifactorial Inheritance , Depressive Disorder, Major/genetics , Genome-Wide Association Study , Mental Disorders/genetics , Brain/diagnostic imaging , Biomarkers , Genetic Predisposition to Disease
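
The group differences in the abstract above are reported as Cohen's d, a standardized mean difference. A minimal sketch of the usual pooled-standard-deviation form follows. The abstract does not say which variant of d was computed, so the pooled form is an assumption, and the function name `cohens_d` and the toy scores are hypothetical.

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    na, nb = len(a), len(b)
    # Pooled variance from the two unbiased sample variances.
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Toy scores: means 4 and 3, both sample variances 4, so d = 1 / 2.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

By this measure, the d = 0.20 reported for RVI-MDD corresponds to group means one fifth of a pooled standard deviation apart.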
13.
Bioinformatics ; 2021 Jan 28.
Article in English | MEDLINE | ID: mdl-33508087

ABSTRACT

MOTIVATION: The analysis of gene co-expression networks (GCN) is critical in examining gene-gene interactions and learning the underlying complex yet highly organized gene regulatory mechanisms. Numerous clustering methods have been developed to detect communities of co-expressed genes in a large network. The assumed independent community structure, however, can be oversimplified and may not adequately characterize the complex biological processes. RESULTS: We develop a new computational package to extract interconnected communities from a gene co-expression network. We consider a pair of communities to be interconnected if a subset of genes from one community is correlated with a subset of genes from another community. The interconnected community structure is more flexible and provides a better fit to the empirical co-expression matrix. To overcome the computational challenges, we develop efficient algorithms by leveraging an advanced graph norm shrinkage approach. We validate and show the advantage of our method by extensive simulation studies. We then apply our interconnected community detection method to RNA-seq data from The Cancer Genome Atlas (TCGA) Acute Myeloid Leukemia (AML) study and identify essential interacting biological pathways related to the immune evasion mechanism of tumor cells. AVAILABILITY: The software is available at GitHub: https://github.com/qwu1221/ICN and Figshare: https://figshare.com/articles/software/ICN-package/13229093. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

14.
Mol Psychiatry ; 26(7): 3646-3656, 2021 07.
Article in English | MEDLINE | ID: mdl-32632206

ABSTRACT

Psychiatric disorders are associated with accelerated aging and enhanced risk for neurodegenerative disorders. Brain aging is associated with molecular, cellular, and structural changes that are robust on the group level, yet show substantial inter-individual variability. Here we assessed deviations in gene expression from normal age-dependent trajectories, and tested their validity as predictors of risk for major mental illnesses and neurodegenerative disorders. We performed large-scale gene expression and genotype analyses in postmortem samples of two frontal cortical brain regions from 214 control subjects aged 20-90 years. Individual estimates of "molecular age" were derived from age-dependent genes, identified by robust regression analysis. Deviation from chronological age was defined as "delta age". Genetic variants associated with deviations from normal gene expression patterns were identified by expression quantitative trait loci (cis-eQTL) of age-dependent genes or genome-wide association study (GWAS) on delta age, combined into distinct polygenic risk scores (PRS_cis-eQTL and PRS_GWAS), and tested for predicting brain disorders or pathology in independent postmortem expression datasets and clinical cohorts. In these validation datasets, molecular ages, defined by 68 and 76 age-related genes for two brain regions respectively, were positively correlated with chronological ages (r = 0.88/0.91), elevated in bipolar disorder (BP) and schizophrenia (SCZ), and unchanged in major depressive disorder (MDD). Exploratory analyses in independent clinical datasets show that PRSs were associated with SCZ and MDD diagnostics, and with cognition in SCZ and pathology in Alzheimer's disease (AD). These results suggest that older molecular brain aging is a common feature of severe mental illnesses and neurodegeneration.


Subject(s)
Depressive Disorder, Major , Mental Disorders , Brain , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Mental Disorders/genetics
15.
AIDS Behav ; 26(6): 2055-2066, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35022939

ABSTRACT

Harmful alcohol consumption can significantly compromise adherence to antiretroviral therapy (ART). Prior research has identified aggregate relationships between alcohol use and ART non-adherence, largely relying on concurrent assessment of these domains. There is relatively limited evidence on more nuanced day-level associations between alcohol use and ART non-adherence, despite potentially important clinical implications. We recruited adults with HIV treatment adherence challenges and harmful alcohol use (n = 53) from HIV care in South Africa. We examined relationships between alcohol use and same and next day ART adherence, accounting for the role of weekends/holidays and participant demographics, including gender. Results demonstrated that ART adherence was significantly worse on weekend/holiday days. Next day adherence was significantly worse in the context of weekend alcohol use and among men. These results suggest the importance of tailoring intervention strategies to support ART adherence during weekend drinking and for men engaged in heavy episodic drinking.


Subject(s)
Alcoholism , Anti-HIV Agents , HIV Infections , Adult , Alcohol Drinking/epidemiology , Alcoholism/drug therapy , Anti-HIV Agents/therapeutic use , Anti-Retroviral Agents/therapeutic use , Female , HIV Infections/drug therapy , HIV Infections/epidemiology , Humans , Male , Medication Adherence , South Africa/epidemiology
16.
Neuroimage ; 245: 118700, 2021 12 15.
Article in English | MEDLINE | ID: mdl-34740793

ABSTRACT

Imaging genetics analyses use neuroimaging traits as intermediate phenotypes to infer the degree of genetic contribution to brain structure and function in health and/or illness. Coefficients of relatedness (CR) summarize the degree of genetic similarity among subjects and are used to estimate the heritability, the proportion of phenotypic variance explained by genetic factors. The CR can be inferred directly from genome-wide genotype data to explain the degree of shared variation in common genetic polymorphisms (SNP-heritability) among related or unrelated subjects. We developed a central processing and graphics processing unit (CPU and GPU) accelerated Fast and Powerful Heritability Inference (FPHI) approach that linearizes likelihood calculations to overcome the ~N^2 to N^3 computational effort dependency on sample size of classical likelihood approaches. We calculated heritability for 60 regional and 1.3 × 10^5 voxel-wise traits in N = 1,206 twin and sibling participants from the Human Connectome Project (HCP) (550 M/656 F, age = 28.8 ± 3.7 years) and N = 37,432 (17,531 M/19,901 F; age = 63.7 ± 7.5 years) participants from the UK Biobank (UKBB). The FPHI estimates were in excellent agreement with heritability values calculated using Genome-wide Complex Trait Analysis software (r = 0.96 and 0.98 in HCP and UKBB sample) while significantly reducing computational effort (10^2 to 10^4 times). The regional and voxel-wise trait heritability estimates for the HCP and UKBB were likewise in excellent agreement (r = 0.63-0.76, p < 10^-10). In summary, the hardware-accelerated FPHI made it practical to calculate heritability values for voxel-wise neuroimaging traits, even in very large samples such as the UKBB. The patterns of additive genetic variance in neuroimaging traits measured in a large sample of related and unrelated individuals showed excellent agreement regardless of the estimation method. The code and instructions to execute these analyses are available at www.solar-eclipse-genetics.org.


Subject(s)
Connectome/methods , Genetic Phenomena , Neuroimaging/methods , Adult , Algorithms , Biological Specimen Banks , Computational Biology , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Phenotype , Polymorphism, Single Nucleotide
17.
Stat Med ; 40(6): 1519-1534, 2021 03 15.
Article in English | MEDLINE | ID: mdl-33482688

ABSTRACT

Link prediction is a fundamental problem in network analysis. In a complex network, links can be unreported and/or under detection limits due to heterogeneous sources of noise and technical challenges during data collection. The incomplete network data can lead to an inaccurate inference of network based data analysis. We propose a parametric link prediction model and consider latent links as misclassified binary outcomes. We develop new algorithms to optimize model parameters and yield robust predictions of unobserved links. Theoretical properties of the predictive model are also discussed. We apply the new method to a partially observed social network data and incomplete brain network data. The results demonstrate that our method outperforms the existing latent-link prediction methods.


Subject(s)
Algorithms , Humans
18.
Bioinformatics ; 35(9): 1597-1599, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30304367

ABSTRACT

SUMMARY: The rapid advances of omics technologies have generated abundant genomic data in public repositories and effective analytical approaches are critical to fully decipher biological knowledge inside these data. Meta-analysis combines multiple studies of a related hypothesis to improve statistical power, accuracy and reproducibility beyond individual study analysis. To date, many transcriptomic meta-analysis methods have been developed, yet few thoughtful guidelines exist. Here, we introduce a comprehensive analytical pipeline and browser-based software suite, called MetaOmics, to meta-analyze multiple transcriptomic studies for various biological purposes, including quality control, differential expression analysis, pathway enrichment analysis, differential co-expression network analysis, prediction, clustering and dimension reduction. The pipeline includes many public as well as >10 in-house transcriptomic meta-analytic methods with data-driven and biological-aim-driven strategies, hands-on protocols, an intuitive user interface and step-by-step instructions. AVAILABILITY AND IMPLEMENTATION: MetaOmics is freely available at https://github.com/metaOmics/metaOmics. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Transcriptome , Gene Expression Profiling , Genomics , Reproducibility of Results
19.
Bioinformatics ; 34(22): 3801-3808, 2018 11 15.
Article in English | MEDLINE | ID: mdl-30184058

ABSTRACT

Motivation: Integrative analysis of multi-omics data from different high-throughput experimental platforms provides valuable insight into regulatory mechanisms associated with complex diseases, and gains statistical power to detect markers that are otherwise overlooked by single-platform omics analysis. In practice, a significant portion of samples may not be measured completely due to insufficient tissues or restricted budget (e.g. gene expression profiles are measured but not methylation). Current multi-omics integrative methods require complete data. A common practice is to ignore samples with any missing platform and perform complete case analysis, which leads to substantial loss of statistical power. Methods: In this article, inspired by the popular Integrative Bayesian Analysis of Genomics data (iBAG), we propose a full Bayesian model that allows incorporation of samples with missing omics data. Results: Simulation results show improvement of the new full Bayesian approach in terms of outcome prediction accuracy and feature selection performance when sample size is limited and the proportion of missingness is large. When sample size is large or the proportion of missingness is low, incorporating samples with missingness may introduce extra inference uncertainty and generate worse prediction and feature selection performance. To determine whether and how to incorporate samples with missingness, we propose a self-learning cross-validation (CV) decision scheme. Simulations and a real application on a child asthma dataset demonstrate superior performance of the CV decision scheme when various types of missing mechanisms are evaluated. Availability and implementation: Freely available on GitHub at https://github.com/CHPGenetics/FBM. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Bayes Theorem , Genomics , Transcriptome , Humans , Research Design , Sample Size
20.
Proc Natl Acad Sci U S A ; 113(1): 206-11, 2016 Jan 05.
Article in English | MEDLINE | ID: mdl-26699485

ABSTRACT

With aging, significant changes in circadian rhythms occur, including a shift in phase toward a "morning" chronotype and a loss of rhythmicity in circulating hormones. However, the effects of aging on molecular rhythms in the human brain have remained elusive. Here, we used a previously described time-of-death analysis to identify transcripts throughout the genome that have a significant circadian rhythm in expression in the human prefrontal cortex [Brodmann's area 11 (BA11) and BA47]. Expression levels were determined by microarray analysis in 146 individuals. Rhythmicity in expression was found in ∼ 10% of detected transcripts (P < 0.05). Using a metaanalysis across the two brain areas, we identified a core set of 235 genes (q < 0.05) with significant circadian rhythms of expression. These 235 genes showed 92% concordance in the phase of expression between the two areas. In addition to the canonical core circadian genes, a number of other genes were found to exhibit rhythmic expression in the brain. Notably, we identified more than 1,000 genes (1,186 in BA11; 1,591 in BA47) that exhibited age-dependent rhythmicity or alterations in rhythmicity patterns with aging. Interestingly, a set of transcripts gained rhythmicity in older individuals, which may represent a compensatory mechanism due to a loss of canonical clock function. Thus, we confirm that rhythmic gene expression can be reliably measured in human brain and identified for the first time (to our knowledge) significant changes in molecular rhythms with aging that may contribute to altered cognition, sleep, and mood in later life.


Subject(s)
Aging/genetics , Circadian Rhythm/genetics , Prefrontal Cortex/physiopathology , Transcription, Genetic , Adolescent , Adult , Aged , Aged, 80 and over , Gene Expression Regulation , Genome, Human , Humans , Male , Middle Aged , Oligonucleotide Array Sequence Analysis , Sleep/genetics , Young Adult
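
The abstract above detects rhythmic transcripts by relating expression levels to time of death but does not state the model that was fitted. One common way to test for a 24-hour rhythm is cosinor regression, sketched below as an assumption rather than as the authors' procedure. The function name `fit_cosinor` and the simulated transcript are hypothetical.

```python
import numpy as np

def fit_cosinor(time_h, expr, period=24.0):
    """Least-squares fit of expr = mesor + a*cos(w*t) + b*sin(w*t), w = 2*pi/period.

    Returns the mesor (rhythm-adjusted mean), the amplitude, and the peak
    time in hours within one period.
    """
    t = np.asarray(time_h, dtype=float)
    y = np.asarray(expr, dtype=float)
    omega = 2.0 * np.pi / period
    design = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    mesor, a, b = coef
    amplitude = np.hypot(a, b)
    # a*cos(wt) + b*sin(wt) = amplitude*cos(wt - phi) with phi = atan2(b, a).
    peak_time = (np.arctan2(b, a) / omega) % period
    return mesor, amplitude, peak_time

# Simulated transcript: mean 5, amplitude 2, peaking 6 h into the cycle,
# sampled at 100 irregular times of death with small measurement noise.
rng = np.random.default_rng(1)
tod = rng.uniform(0.0, 24.0, size=100)
expr = 5.0 + 2.0 * np.cos(2.0 * np.pi * (tod - 6.0) / 24.0) + rng.normal(0.0, 0.1, size=100)
mesor, amplitude, peak_time = fit_cosinor(tod, expr)
```

Each donor contributes one time point, so a rhythm is inferred across individuals rather than within one. In a screen of this kind, the fit would be repeated per transcript and per brain area, with significance assessed against a flat model.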