Results 1 - 20 of 29
1.
Cell ; 184(25): 6119-6137.e26, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34890551

ABSTRACT

Prognostically relevant RNA expression states exist in pancreatic ductal adenocarcinoma (PDAC), but our understanding of their drivers, stability, and relationship to therapeutic response is limited. To examine these attributes systematically, we profiled metastatic biopsies and matched organoid models at single-cell resolution. In vivo, we identify a new intermediate PDAC transcriptional cell state and uncover distinct site- and state-specific tumor microenvironments (TMEs). Benchmarking models against this reference map, we reveal strong culture-specific biases in cancer cell transcriptional state representation driven by altered TME signals. We restore expression state heterogeneity by adding back in vivo-relevant factors and show plasticity in culture models. Further, we prove that non-genetic modulation of cell state can strongly influence drug responses, uncovering state-specific vulnerabilities. This work provides a broadly applicable framework for aligning cell states across in vivo and ex vivo settings, identifying drivers of transcriptional plasticity and manipulating cell state to target associated vulnerabilities.


Subjects
Tumor Biomarkers/metabolism , Pancreatic Ductal Carcinoma/metabolism , Pancreatic Neoplasms/metabolism , Tumor Microenvironment , Adult , Aged , Tumor Cell Line , Female , Neoplastic Gene Expression Regulation , Humans , Male , Middle Aged , Single-Cell Analysis
2.
Am J Hum Genet ; 109(5): 871-884, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35349783

ABSTRACT

Since 2005, genome-wide association (GWA) datasets have been largely biased toward sampling European ancestry individuals, and recent studies have shown that GWA results estimated from self-identified European individuals are not transferable to non-European individuals because of various confounding challenges. Here, we demonstrate that enrichment analyses that aggregate SNP-level association statistics at multiple genomic scales (from genes to genomic regions and pathways) have been underutilized in the GWA era and can generate biologically interpretable hypotheses regarding the genetic basis of complex trait architecture. We illustrate examples of the robust associations generated by enrichment analyses while studying 25 continuous traits assayed in 566,786 individuals from seven diverse self-identified human ancestries in the UK Biobank and the Biobank Japan as well as 44,348 admixed individuals from the PAGE consortium including cohorts of African American, Hispanic and Latin American, Native Hawaiian, and American Indian/Alaska Native individuals. We identify 1,000 gene-level associations that are genome-wide significant in at least two ancestry cohorts across these 25 traits as well as highly conserved pathway associations with triglyceride levels in European, East Asian, and Native Hawaiian cohorts.


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Genome-Wide Association Study/methods , Humans , Multifactorial Inheritance , Phenotype , Single Nucleotide Polymorphism/genetics , Racial Groups
3.
PLoS Comput Biol ; 20(9): e1012469, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39288189

ABSTRACT

Significant variations have been observed in viral copies generated during SARS-CoV-2 infections. However, the factors that impact viral copies and infection dynamics are not fully understood, and may be inherently dependent upon different viral and host factors. Here, we conducted virus whole genome sequencing and measured viral copies using RT-qPCR from 9,902 SARS-CoV-2 infections over a 2-year period to examine the impact of virus genetic variation on changes in viral copies adjusted for host age and vaccination status. Using a genome-wide association study (GWAS) approach, we identified multiple single-nucleotide polymorphisms (SNPs) corresponding to amino acid changes in the SARS-CoV-2 genome associated with variations in viral copies. We further applied a marginal epistasis test to detect interactions among SNPs and identified multiple pairs of substitutions located in the spike gene that have non-linear effects on viral copies. We also analyzed the temporal patterns and found that SNPs associated with increased viral copies were predominantly observed in Delta and Omicron BA.2/BA.4/BA.5/XBB infections, whereas those associated with decreased viral copies were only observed in infections with Omicron BA.1 variants. Our work showcases how GWAS can be a useful tool for probing phenotypes related to SNPs in viral genomes that are worth further exploration. We argue that this approach can be used more broadly across pathogens to characterize emerging variants and monitor therapeutic interventions.

4.
BMC Bioinformatics ; 25(1): 249, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39080561

ABSTRACT

In this paper, we aim to build a platform that will help bridge the gap between high-dimensional computation and wet-lab experimentation by allowing users to interrogate genomic signatures at multiple molecular levels and identify best next actionable steps for downstream decision making. We introduce Multioviz: a publicly accessible R package and web application platform to easily perform in silico hypothesis testing of generated gene regulatory networks. We demonstrate the utility of Multioviz by conducting an end-to-end analysis in a statistical genetics application focused on measuring the effect of in silico perturbations of complex trait architecture. By using a real dataset from the Wellcome Trust Centre for Human Genetics, we both recapitulate previous findings and propose hypotheses about the genes involved in the percentage of immune CD8+ cells found in heterogeneous stocks of mice. Source code for the Multioviz R package is available at https://github.com/lcrawlab/multio-viz and an interactive version of the platform is available at https://multioviz.ccv.brown.edu/ .


Subjects
Gene Regulatory Networks , Software , Mice , Animals , Computer Simulation , Humans , Computational Biology/methods
5.
PLoS Comput Biol ; 19(5): e1011162, 2023 05.
Article in English | MEDLINE | ID: mdl-37220151

ABSTRACT

Natural products are chemical compounds that form the basis of many therapeutics used in the pharmaceutical industry. In microbes, natural products are synthesized by groups of colocalized genes called biosynthetic gene clusters (BGCs). With advances in high-throughput sequencing, there has been an increase of complete microbial isolate genomes and metagenomes, from which a vast number of BGCs are undiscovered. Here, we introduce a self-supervised learning approach designed to identify and characterize BGCs from such data. To do this, we represent BGCs as chains of functional protein domains and train a masked language model on these domains. We assess the ability of our approach to detect BGCs and characterize BGC properties in bacterial genomes. We also demonstrate that our model can learn meaningful representations of BGCs and their constituent domains, detect BGCs in microbial genomes, and predict BGC product classes. These results highlight self-supervised neural networks as a promising framework for improving BGC prediction and classification.


Subjects
Biological Products , Bacterial Genome , Metagenome , Multigene Family/genetics , Biological Products/metabolism , Supervised Machine Learning
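
The abstract of entry 5 describes representing each BGC as a chain of protein domains and training a masked language model on those chains. The masking step alone might look roughly like the sketch below. The function name `mask_domains`, the `[MASK]` token, the 30% masking rate, and the `domain_*` identifiers are all illustrative assumptions, not details taken from the paper.

```python
import random

MASK = "[MASK]"  # assumed placeholder token; the paper's vocabulary is not given here


def mask_domains(domains, mask_prob=0.15, rng=None):
    """Randomly hide domains in one BGC, given as an ordered list of domain IDs.

    Returns (masked, labels): labels[i] is the original domain wherever
    position i was masked, and None elsewhere.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for domain in domains:
        if rng.random() < mask_prob:
            masked.append(MASK)    # the model would have to predict this position
            labels.append(domain)  # ground truth used as the training target
        else:
            masked.append(domain)
            labels.append(None)    # unmasked positions carry no training target
    return masked, labels


# Hypothetical BGC written as a chain of made-up domain identifiers.
bgc = ["domain_A", "domain_B", "domain_C", "domain_D", "domain_E", "domain_F"]
masked, labels = mask_domains(bgc, mask_prob=0.3, rng=random.Random(0))
```

A model trained this way would be scored only on the masked positions, which is what makes the objective self-supervised: no BGC labels are needed to produce the targets.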
6.
PLoS Genet ; 17(8): e1009754, 2021 08.
Article in English | MEDLINE | ID: mdl-34411094

ABSTRACT

In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects, and the hidden layer models the aggregated effects among SNP-sets. We treat the weights and connections of the network as random variables with prior distributions that reflect how genetic effects manifest at different genomic scales. The BANNs software uses variational inference to provide posterior summaries which allow researchers to simultaneously perform (i) mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. Through simulations, we show that our method improves upon state-of-the-art association mapping and enrichment approaches across a wide range of genetic architectures. We then further illustrate the benefits of BANNs by analyzing real GWA data assayed in approximately 2,000 heterogeneous stock mice from the Wellcome Trust Centre for Human Genetics and approximately 7,000 individuals from the Framingham Heart Study. Lastly, using a random subset of individuals of European ancestry from the UK Biobank, we show that BANNs is able to replicate known associations in high- and low-density lipoprotein cholesterol content.


Subjects
Genome-Wide Association Study/methods , Molecular Sequence Annotation/methods , Animals , Genome/genetics , Genomics/methods , Genotype , Humans , Genetic Models , Multifactorial Inheritance/genetics , Computer Neural Networks , Phenotype , Single Nucleotide Polymorphism/genetics , Software
7.
PLoS Genet ; 17(3): e1008887, 2021 03.
Article in English | MEDLINE | ID: mdl-33735180

ABSTRACT

The winged insects of the order Diptera are colloquially named for their most recognizable phenotype: flight. These insects rely on flight for a number of important life history traits, such as dispersal, foraging, and courtship. Despite the importance of flight, relatively little is known about the genetic architecture of flight performance. Accordingly, we sought to uncover the genetic modifiers of flight using a measure of flies' reaction and response to an abrupt drop in a vertical flight column. We conducted a genome-wide association study (GWAS) using 197 of the Drosophila Genetic Reference Panel (DGRP) lines, and identified a combination of additive and marginal variants, epistatic interactions, whole genes, and enrichment across interaction networks. Egfr, a highly pleiotropic developmental gene, was among the most significant additive variants identified. We functionally validated 13 of the additive candidate genes (Adgf-A/Adgf-A2/CG32181, bru1, CadN, flapper (CG11073), CG15236, flippy (CG9766), CREG, Dscam4, form3, fry, Lasp/CG9692, Pde6, Snoo), and introduce a novel approach to whole gene significance screens: PEGASUS_flies. Additionally, we identified ppk23, an Acid Sensing Ion Channel (ASIC) homolog, as an important hub for epistatic interactions. We propose a model that suggests genetic modifiers of wing and muscle morphology, nervous system development and function, BMP signaling, sexually dimorphic neural wiring, and gene regulation are all important for the observed differences in flight performance in a natural population. Additionally, these results represent a snapshot of the genetic modifiers affecting drop-response flight performance in Drosophila, with implications for other insects.


Subjects
Drosophila melanogaster/genetics , Drosophila/genetics , Developmental Gene Expression Regulation , Genetic Variation , Neurogenesis/genetics , Animals , Drosophila/embryology , Drosophila melanogaster/metabolism , Genetic Epigenesis , Female , Animal Flight , Genetic Association Studies , Male , Phenotype , Single Nucleotide Polymorphism
8.
PLoS Comput Biol ; 18(5): e1010045, 2022 05.
Article in English | MEDLINE | ID: mdl-35500014

ABSTRACT

Identifying structural differences among proteins can be a non-trivial task. When contrasting ensembles of protein structures obtained from molecular dynamics simulations, biologically-relevant features can be easily overshadowed by spurious fluctuations. Here, we present SINATRA Pro, a computational pipeline designed to robustly identify topological differences between two sets of protein structures. Algorithmically, SINATRA Pro works by first taking in the 3D atomic coordinates for each protein snapshot and summarizing them according to their underlying topology. Statistically significant topological features are then projected back onto a user-selected representative protein structure, thus facilitating the visual identification of biophysical signatures of different protein ensembles. We assess the ability of SINATRA Pro to detect minute conformational changes in five independent protein systems of varying complexities. In all test cases, SINATRA Pro identifies known structural features that have been validated by previous experimental and computational studies, as well as novel features that are also likely to be biologically-relevant according to the literature. These results highlight SINATRA Pro as a promising method for facilitating the non-trivial task of pattern recognition in trajectories resulting from molecular dynamics simulations, with substantially increased resolution.


Subjects
Data Science , Molecular Dynamics Simulation , Biophysics , Protein Conformation , Proteins/chemistry
9.
PLoS Genet ; 16(6): e1008855, 2020 06.
Article in English | MEDLINE | ID: mdl-32542026

ABSTRACT

Traditional univariate genome-wide association studies generate false positives and negatives due to difficulties distinguishing associated variants from variants with spurious nonzero effects that do not directly influence the trait. Recent efforts have been directed at identifying genes or signaling pathways enriched for mutations in quantitative traits or case-control studies, but these can be computationally costly and hampered by strict model assumptions. Here, we present gene-ε, a new approach for identifying statistical associations between sets of variants and quantitative traits. Our key insight is that enrichment studies on the gene-level are improved when we reformulate the genome-wide SNP-level null hypothesis to identify spurious small-to-intermediate SNP effects and classify them as non-causal. gene-ε efficiently identifies enriched genes under a variety of simulated genetic architectures, achieving greater than a 90% true positive rate at 1% false positive rate for polygenic traits. Lastly, we apply gene-ε to summary statistics derived from six quantitative traits using European-ancestry individuals in the UK Biobank, and identify enriched genes that are in biologically relevant pathways.


Subjects
Genome-Wide Association Study/statistics & numerical data , Genetic Models , Multifactorial Inheritance/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci/genetics , Statistical Data Interpretation , Genetic Databases/statistics & numerical data , Humans , United Kingdom , White People/genetics
10.
PLoS Genet ; 15(2): e1007978, 2019 02.
Article in English | MEDLINE | ID: mdl-30735486

ABSTRACT

Linear mixed effect models are powerful tools used to account for population structure in genome-wide association studies (GWASs) and estimate the genetic architecture of complex traits. However, fully-specified models are computationally demanding and common simplifications often lead to reduced power or biased inference. We describe Grid-LMM (https://github.com/deruncie/GridLMM), an extendable algorithm for repeatedly fitting complex linear models that account for multiple sources of heterogeneity, such as additive and non-additive genetic variance, spatial heterogeneity, and genotype-environment interactions. Grid-LMM can compute approximate (yet highly accurate) frequentist test statistics or Bayesian posterior summaries at a genome-wide scale in a fraction of the time compared to existing general-purpose methods. We apply Grid-LMM to two types of quantitative genetic analyses. The first is focused on accounting for spatial variability and non-additive genetic variance while scanning for QTL; and the second aims to identify gene expression traits affected by non-additive genetic variation. In both cases, modeling multiple sources of heterogeneity leads to new discoveries.


Subjects
Algorithms , Linear Models , Genetic Models , Animals , Arabidopsis/genetics , Arabidopsis/growth & development , Bayes Theorem , Body Weight/genetics , Computer Simulation , Flowers/genetics , Flowers/growth & development , Gene-Environment Interaction , Genetic Markers , Genetic Variation , Genome-Wide Association Study/statistics & numerical data , Humans , Mice , Quantitative Trait Loci
11.
PLoS Genet ; 13(7): e1006869, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28746338

ABSTRACT

Epistasis, commonly defined as the interaction between multiple genes, is an important genetic component underlying phenotypic variation. Many statistical methods have been developed to model and identify epistatic interactions between genetic variants. However, because of the large combinatorial search space of interactions, most epistasis mapping methods face enormous computational challenges and often suffer from low statistical power due to multiple test correction. Here, we present a novel, alternative strategy for mapping epistasis: instead of directly identifying individual pairwise or higher-order interactions, we focus on mapping variants that have non-zero marginal epistatic effects, that is, the combined pairwise interaction effects between a given variant and all other variants. By testing marginal epistatic effects, we can identify candidate variants that are involved in epistasis without the need to identify the exact partners with which the variants interact, thus potentially alleviating much of the statistical and computational burden associated with standard epistatic mapping procedures. Our method is based on a variance component model, and relies on a recently developed variance component estimation method for efficient parameter inference and p-value computation. We refer to our method as the "MArginal ePIstasis Test", or MAPIT. With simulations, we show how MAPIT can be used to estimate and test marginal epistatic effects, produce calibrated test statistics under the null, and facilitate the detection of pairwise epistatic interactions. We further illustrate the benefits of MAPIT in a QTL mapping study by analyzing the gene expression data of over 400 individuals from the GEUVADIS consortium.


Subjects
Chromosome Mapping/statistics & numerical data , Genetic Epistasis , Genetic Models , Quantitative Trait Loci/genetics , Algorithms , Computer Simulation , Gene Expression Regulation , Genome-Wide Association Study/statistics & numerical data , Humans , Phenotype
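
The abstract of entry 11 defines a marginal epistatic effect as the combined pairwise interaction between one variant and all others, modeled through a variance component. One plausible way to build the covariance matrix for that component is sketched below. It assumes the form G_k = D_k K D_k, where D_k is the diagonal matrix of genotypes at variant k and K is the relatedness matrix over the remaining variants. This form, and the names `marginal_epistasis_cov` and `interaction_columns`, are assumptions made for illustration and are not quoted from the abstract or from the MAPIT software.

```python
def marginal_epistasis_cov(X, k):
    """Covariance of the combined interaction effects for variant k.

    X is an n x p genotype matrix (list of rows), assumed already centered
    and scaled. Returns the n x n matrix G_k = D_k K D_k, where K averages
    the products over all variants other than k.
    """
    n, p = len(X), len(X[0])
    others = [l for l in range(p) if l != k]
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Relatedness between individuals i and j, excluding variant k.
            k_ij = sum(X[i][l] * X[j][l] for l in others) / len(others)
            G[i][j] = X[i][k] * k_ij * X[j][k]
    return G


def interaction_columns(X, k):
    """Explicit element-wise products x_k * x_l for every l != k."""
    return [[row[k] * row[l] for l in range(len(row)) if l != k] for row in X]


# Tiny made-up genotype matrix: 3 individuals, 4 variants.
X = [
    [1.0, -1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, -1.0],
    [-1.0, 0.0, -1.0, 0.0],
]
G0 = marginal_epistasis_cov(X, k=0)
```

Under this assumed form, G_k equals the averaged outer product of the explicit interaction columns, so the p - 1 pairwise interaction terms never need to be tested one by one. That is the computational saving the abstract alludes to.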
12.
Am J Physiol Cell Physiol ; 317(2): C155-C166, 2019 08 01.
Article in English | MEDLINE | ID: mdl-30917031

ABSTRACT

Many different subpopulations of subcellular extracellular vesicles (EVs) have been described. EVs are released from all cell types and have been shown to regulate normal physiological homeostasis, as well as pathological states by influencing cell proliferation, differentiation, organ homing, injury and recovery, as well as disease progression. In this review, we focus on the bidirectional actions of vesicles from normal and diseased cells on normal or leukemic target cells, and on the leukemic microenvironment as a whole. EVs from human bone marrow mesenchymal stem cells (MSC) can have a healing effect, reversing the malignant phenotype in prostate and colorectal cancer, as well as mitigating radiation damage to marrow. The role of EVs in leukemia and their bimodal cross talk with the encompassing microenvironment remain to be fully characterized. This may provide insight for clinical advances via the application of EVs as potential therapy and the employment of statistical and machine learning models to capture the pleiotropic effects that EVs exert on a dynamic microenvironment, possibly allowing for precise therapeutic intervention.


Subjects
Tumor Biomarkers/metabolism , Extracellular Vesicles/metabolism , Leukemia/metabolism , Mesenchymal Stem Cells/metabolism , Neoplastic Stem Cells/metabolism , Tumor Microenvironment , Animals , Antineoplastic Agents/therapeutic use , Tumor Biomarkers/genetics , Cell Communication , Antineoplastic Drug Resistance , Extracellular Vesicles/drug effects , Extracellular Vesicles/genetics , Extracellular Vesicles/pathology , Humans , Leukemia/drug therapy , Leukemia/genetics , Leukemia/pathology , Machine Learning , Mesenchymal Stem Cells/drug effects , Mesenchymal Stem Cells/pathology , Neoplastic Stem Cells/drug effects , Neoplastic Stem Cells/pathology , Phenotype , Signal Transduction , Systems Biology/methods
13.
Elife ; 13, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38913556

ABSTRACT

LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in simulations, and by re-analyzing 25 well-studied quantitative phenotypes from 349,468 individuals in the UK Biobank and up to 159,095 individuals in BioBank Japan, we show that the inclusion of a cis-interaction score (i.e. interactions between a focal variant and proximal variants) recovers genetic variance that is not captured by LDSC. For each of the 25 traits analyzed in the UK Biobank and BioBank Japan, i-LDSC detects additional variation contributed by genetic interactions. The i-LDSC software and its application to these biobanks represent a step towards resolving further genetic contributions of sources of non-additive genetic effects to complex trait variation.


Subjects
Genome-Wide Association Study , Genome-Wide Association Study/methods , Humans , Japan , United Kingdom , Single Nucleotide Polymorphism/genetics , Genetic Models , Phenotype , Genetic Variation , Multifactorial Inheritance/genetics , Biological Specimen Banks
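
Entry 13 builds on standard LD score regression, which rests on the relation E[chi2_j] = 1 + N*a + (N*h2/M)*l_j between a SNP's association statistic and its LD score l_j. A minimal sketch of that baseline on synthetic data follows. It covers only the original LDSC slope and not the cis-interaction score that i-LDSC adds. The helper name `ldsc_heritability`, the unweighted least-squares fit, and all simulated values are assumptions for illustration.

```python
import random


def ldsc_heritability(chi2, ld_scores, n_samples, n_snps):
    """Unweighted least-squares fit of chi2_j = intercept + slope * l_j.

    Under the LDSC model the slope equals N * h2 / M, so h2 = slope * M / N.
    Returns (h2_estimate, intercept).
    """
    m = len(chi2)
    mean_l = sum(ld_scores) / m
    mean_c = sum(chi2) / m
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    return slope * n_snps / n_samples, intercept


# Synthetic summary statistics (no real data): each chi-square statistic is a
# scaled squared standard normal whose mean follows the LDSC relation.
rng = random.Random(1)
N, M, true_h2 = 50_000, 10_000, 0.4
ld = [rng.uniform(1.0, 200.0) for _ in range(M)]
chi2 = [(1.0 + N * true_h2 * l / M) * rng.gauss(0.0, 1.0) ** 2 for l in ld]

h2_hat, intercept_hat = ldsc_heritability(chi2, ld, n_samples=N, n_snps=M)
```

Practical LDSC implementations weight the regression to handle heteroscedasticity and correlated SNPs. The plain fit here is only meant to show why a slope on LD scores carries heritability information, and why adding a second score as a regressor can capture variance the first one misses.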
14.
bioRxiv ; 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38260340

ABSTRACT

Understanding morphological variation is an important task in many areas of computational biology. Recent studies have focused on developing computational tools for the task of sub-image selection, which aims to identify structural features that best describe the variation between classes of shapes. A major part in assessing the utility of these approaches is to demonstrate their performance on both simulated and real datasets. However, when creating a model for shape statistics, real data can be difficult to access and the sample sizes for these data are often small because they are expensive to collect. Meanwhile, the current landscape of generative models for shapes has been mostly limited to approaches that use black-box inference, making it difficult to systematically assess the power and calibration of sub-image models. In this paper, we introduce the α-shape sampler: a probabilistic framework for generating realistic 2D and 3D shapes based on probability distributions which can be learned from real data. We demonstrate our framework using proof-of-concept examples and in two real applications in biology where we generate (i) 2D images of healthy and septic neutrophils and (ii) 3D computed tomography (CT) scans of primate mandibular molars. The α-shape sampler R package is open-source and can be downloaded at https://github.com/lcrawlab/ashapesampler.

15.
bioRxiv ; 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38405697

ABSTRACT

Clustering is commonly used in single-cell RNA-sequencing (scRNA-seq) pipelines to characterize cellular heterogeneity. However, current methods face two main limitations. First, they require user-specified heuristics which add time and complexity to bioinformatic workflows; second, they rely on post-selective differential expression analyses to identify marker genes driving cluster differences, which has been shown to be subject to inflated false discovery rates. We address these challenges by introducing nonparametric clustering of single-cell populations (NCLUSION): an infinite mixture model that leverages Bayesian sparse priors to identify marker genes while simultaneously performing clustering on single-cell expression data. NCLUSION uses a scalable variational inference algorithm to perform these analyses on datasets with up to millions of cells. By analyzing publicly available scRNA-seq studies, we demonstrate that NCLUSION (i) matches the performance of other state-of-the-art clustering techniques with significantly reduced runtime and (ii) provides statistically robust and biologically relevant transcriptomic signatures for each of the clusters it identifies. Overall, NCLUSION represents a reliable hypothesis-generating tool for understanding patterns of expression variation present in single-cell populations.

16.
bioRxiv ; 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38915726

ABSTRACT

Efforts to cure BCR::ABL1 B cell acute lymphoblastic leukemia (Ph+ ALL) solely through inhibition of ABL1 kinase activity have thus far been insufficient despite the availability of tyrosine kinase inhibitors (TKIs) with broad activity against resistance mutants. The mechanisms that drive persistence within minimal residual disease (MRD) remain poorly understood and therefore untargeted. Utilizing 13 patient-derived xenograft (PDX) models and clinical trial specimens of Ph+ ALL, we examined how genetic and transcriptional features co-evolve to drive progression during prolonged TKI response. Our work reveals a landscape of cooperative mutational and transcriptional escape mechanisms that differ from those causing resistance to first generation TKIs. By analyzing MRD during remission, we show that the same resistance mutation can either increase or decrease cellular fitness depending on transcriptional state. We further demonstrate that directly targeting transcriptional state-associated vulnerabilities at MRD can overcome BCR::ABL1 independence, suggesting a new paradigm for rationally eradicating MRD prior to relapse. Finally, we illustrate how cell mass measurements of leukemia cells can be used to rapidly monitor dominant transcriptional features of Ph+ ALL to help rationally guide therapeutic selection from low-input samples.

17.
G3 (Bethesda) ; 13(8), 2023 08 09.
Article in English | MEDLINE | ID: mdl-37243672

ABSTRACT

Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping. In this study, we present the "multivariate MArginal ePIstasis Test" (mvMAPIT), a multioutcome generalization of a recently proposed epistatic detection method which seeks to detect marginal epistasis, or the combined pairwise interaction effects between a given variant and all other variants. By searching for marginal epistatic effects, one can identify genetic variants that are involved in epistasis without the need to identify the exact partners with which the variants interact, thus potentially alleviating much of the statistical and computational burden associated with conventional explicit search-based methods. Our proposed mvMAPIT builds upon this strategy by taking advantage of correlation structure between traits to improve the identification of variants involved in epistasis. We formulate mvMAPIT as a multivariate linear mixed model and develop a multitrait variance component estimation algorithm for efficient parameter inference and P-value computation. Together with reasonable model approximations, our proposed approach is scalable to moderately sized genome-wide association studies. With simulations, we illustrate the benefits of mvMAPIT over univariate (or single-trait) epistatic mapping strategies. We also apply the mvMAPIT framework to protein sequence data from two broadly neutralizing anti-influenza antibodies and approximately 2,000 heterogeneous stock mice from the Wellcome Trust Centre for Human Genetics. The mvMAPIT R package can be downloaded at https://github.com/lcrawlab/mvMAPIT.


Subjects
Genetic Epistasis , Genome-Wide Association Study , Humans , Animals , Mice , Phenotype , Quantitative Trait Loci , Algorithms
18.
iScience ; 25(7): 104553, 2022 Jul 15.
Article in English | MEDLINE | ID: mdl-35769876

ABSTRACT

In this paper, we propose a new approach for variable selection using a collection of Bayesian neural networks with a focus on quantifying uncertainty over which variables are selected. Motivated by fine-mapping applications in statistical genetics, we refer to our framework as an "ensemble of single-effect neural networks" (ESNN) which generalizes the "sum of single effects" regression framework by both accounting for nonlinear structure in genotypic data (e.g., dominance effects) and having the capability to model discrete phenotypes (e.g., case-control studies). Through extensive simulations, we demonstrate our method's ability to produce calibrated posterior summaries such as credible sets and posterior inclusion probabilities, particularly for traits with genetic architectures that have significant proportions of non-additive variation driven by correlated variants. Lastly, we use real data to demonstrate that the ESNN framework improves upon the state of the art for identifying true effect variables underlying various complex traits.

19.
Genome Biol ; 22(1): 213, 2021 07 23.
Article in English | MEDLINE | ID: mdl-34301310

ABSTRACT

Large-scale phenotype data can enhance the power of genomic prediction in plant and animal breeding, as well as human genetics. However, the statistical foundation of multi-trait genomic prediction is based on the multivariate linear mixed effect model, a tool notorious for its fragility when applied to more than a handful of traits. We present MegaLMM, a statistical framework and associated software package for mixed model analyses of a virtually unlimited number of traits. Using three examples with real plant data, we show that MegaLMM can leverage thousands of traits at once to significantly improve genetic value prediction accuracy.


Subjects
Arabidopsis/genetics , Plant Genome , Genetic Models , Heritable Quantitative Trait , Software , Triticum/genetics , Zea mays/genetics , Bayes Theorem , Gene-Environment Interaction , Genomics , Genotype , Humans , Phenotype , Plant Breeding
20.
Mol Cancer Ther ; 20(1): 183-190, 2021 01.
Article in English | MEDLINE | ID: mdl-33087512

ABSTRACT

Glycogen synthase kinase-3β (GSK-3β), a serine/threonine kinase, has been implicated in the pathogenesis of many cancers, with involvement in cell-cycle regulation, apoptosis, and immune response. Small-molecule GSK-3β inhibitors are currently undergoing clinical investigation. Tumor sequencing has revealed genomic alterations in GSK-3β, yet an assessment of the genomic landscape in malignancies is lacking. This study assessed >100,000 tumors from two databases to analyze GSK-3β alterations. GSK-3β expression and immune cell infiltrate data were analyzed across cancer types, and programmed death-ligand 1 (PD-L1) expression was compared between GSK-3β-mutated and wild-type tumors. GSK-3β was mutated at a rate of 1%. The majority of mutated residues were in the kinase domain, with frequent mutations occurring in a GSK-3β substrate binding pocket. Uterine endometrioid carcinoma was the most commonly mutated (4%) tumor, and copy-number variations were most commonly observed in squamous histologies. Significant differences across cancer types for GSK-3β-mutated tumors were observed for B cells (P = 0.018), monocytes (P = 0.002), dendritic cells (P = 0.005), neutrophils (P = 0.0003), and endothelial cells (P = 0.014). GSK-3β mRNA expression was highest in melanoma. The frequency of PD-L1 expression was higher among GSK-3β-mutated tumors compared with wild type in colorectal cancer (P = 0.03), endometrial cancer (P = 0.05), melanoma (P = 0.02), ovarian carcinoma (P = 0.0001), and uterine sarcoma (P = 0.002). Overall, GSK-3β molecular alterations were detected in approximately 1% of solid tumors, tumors with GSK-3β mutations displayed a microenvironment with increased infiltration of B cells, and GSK-3β mutations were associated with increased PD-L1 expression in selected histologies. These results advance the understanding of the GSK-3β complex signaling network interfacing with key pathways involved in carcinogenesis and immune response.


Subjects
Human Genome , Glycogen Synthase Kinase 3 beta/metabolism , Neoplasms/enzymology , Neoplasms/genetics , B7-H1 Antigen/metabolism , Cohort Studies , DNA Copy Number Variations/genetics , Glycogen Synthase Kinase 3 beta/genetics , Humans , Mutation/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , Tumor Microenvironment/genetics