Results 1 - 20 of 30
1.
Cell ; 184(25): 6119-6137.e26, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34890551

ABSTRACT

Prognostically relevant RNA expression states exist in pancreatic ductal adenocarcinoma (PDAC), but our understanding of their drivers, stability, and relationship to therapeutic response is limited. To examine these attributes systematically, we profiled metastatic biopsies and matched organoid models at single-cell resolution. In vivo, we identify a new intermediate PDAC transcriptional cell state and uncover distinct site- and state-specific tumor microenvironments (TMEs). Benchmarking models against this reference map, we reveal strong culture-specific biases in cancer cell transcriptional state representation driven by altered TME signals. We restore expression state heterogeneity by adding back in vivo-relevant factors and show plasticity in culture models. Further, we prove that non-genetic modulation of cell state can strongly influence drug responses, uncovering state-specific vulnerabilities. This work provides a broadly applicable framework for aligning cell states across in vivo and ex vivo settings, identifying drivers of transcriptional plasticity and manipulating cell state to target associated vulnerabilities.


Subject(s)
Biomarkers, Tumor/metabolism , Carcinoma, Pancreatic Ductal/metabolism , Pancreatic Neoplasms/metabolism , Tumor Microenvironment , Adult , Aged , Cell Line, Tumor , Female , Gene Expression Regulation, Neoplastic , Humans , Male , Middle Aged , Single-Cell Analysis
2.
Am J Hum Genet ; 109(5): 871-884, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35349783

ABSTRACT

Since 2005, genome-wide association (GWA) datasets have been largely biased toward sampling European ancestry individuals, and recent studies have shown that GWA results estimated from self-identified European individuals are not transferable to non-European individuals because of various confounding challenges. Here, we demonstrate that enrichment analyses that aggregate SNP-level association statistics at multiple genomic scales (from genes to genomic regions and pathways) have been underutilized in the GWA era and can generate biologically interpretable hypotheses regarding the genetic basis of complex trait architecture. We illustrate examples of the robust associations generated by enrichment analyses while studying 25 continuous traits assayed in 566,786 individuals from seven diverse self-identified human ancestries in the UK Biobank and the Biobank Japan as well as 44,348 admixed individuals from the PAGE consortium including cohorts of African American, Hispanic and Latin American, Native Hawaiian, and American Indian/Alaska Native individuals. We identify 1,000 gene-level associations that are genome-wide significant in at least two ancestry cohorts across these 25 traits as well as highly conserved pathway associations with triglyceride levels in European, East Asian, and Native Hawaiian cohorts.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Humans , Multifactorial Inheritance , Phenotype , Polymorphism, Single Nucleotide/genetics , Racial Groups
3.
PLoS Comput Biol ; 20(9): e1012469, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39288189

ABSTRACT

Significant variations have been observed in viral copies generated during SARS-CoV-2 infections. However, the factors that impact viral copies and infection dynamics are not fully understood, and may be inherently dependent upon different viral and host factors. Here, we conducted virus whole genome sequencing and measured viral copies using RT-qPCR from 9,902 SARS-CoV-2 infections over a 2-year period to examine the impact of virus genetic variation on changes in viral copies adjusted for host age and vaccination status. Using a genome-wide association study (GWAS) approach, we identified multiple single-nucleotide polymorphisms (SNPs) corresponding to amino acid changes in the SARS-CoV-2 genome associated with variations in viral copies. We further applied a marginal epistasis test to detect interactions among SNPs and identified multiple pairs of substitutions located in the spike gene that have non-linear effects on viral copies. We also analyzed the temporal patterns and found that SNPs associated with increased viral copies were predominantly observed in Delta and Omicron BA.2/BA.4/BA.5/XBB infections, whereas those associated with decreased viral copies were only observed in infections with Omicron BA.1 variants. Our work showcases how GWAS can be a useful tool for probing phenotypes related to SNPs in viral genomes that are worth further exploration. We argue that this approach can be used more broadly across pathogens to characterize emerging variants and monitor therapeutic interventions.


Subject(s)
COVID-19 , Genome, Viral , Genome-Wide Association Study , Polymorphism, Single Nucleotide , SARS-CoV-2 , Polymorphism, Single Nucleotide/genetics , Humans , SARS-CoV-2/genetics , Genome-Wide Association Study/methods , COVID-19/genetics , COVID-19/virology , Genome, Viral/genetics , Spike Glycoprotein, Coronavirus/genetics , Middle Aged , Adult , Male , Female , Viral Load/genetics , Aged , Whole Genome Sequencing/methods
4.
BMC Bioinformatics ; 25(1): 249, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39080561

ABSTRACT

In this paper, we aim to build a platform that will help bridge the gap between high-dimensional computation and wet-lab experimentation by allowing users to interrogate genomic signatures at multiple molecular levels and identify best next actionable steps for downstream decision making. We introduce Multioviz: a publicly accessible R package and web application platform to easily perform in silico hypothesis testing of generated gene regulatory networks. We demonstrate the utility of Multioviz by conducting an end-to-end analysis in a statistical genetics application focused on measuring the effect of in silico perturbations of complex trait architecture. By using a real dataset from the Wellcome Trust Centre for Human Genetics, we both recapitulate previous findings and propose hypotheses about the genes involved in the percentage of immune CD8+ cells found in heterogeneous stocks of mice. Source code for the Multioviz R package is available at https://github.com/lcrawlab/multio-viz and an interactive version of the platform is available at https://multioviz.ccv.brown.edu/ .


Subject(s)
Gene Regulatory Networks , Software , Mice , Animals , Computer Simulation , Humans , Computational Biology/methods
5.
PLoS Comput Biol ; 19(5): e1011162, 2023 05.
Article in English | MEDLINE | ID: mdl-37220151

ABSTRACT

Natural products are chemical compounds that form the basis of many therapeutics used in the pharmaceutical industry. In microbes, natural products are synthesized by groups of colocalized genes called biosynthetic gene clusters (BGCs). With advances in high-throughput sequencing, there has been an increase of complete microbial isolate genomes and metagenomes, from which a vast number of BGCs are undiscovered. Here, we introduce a self-supervised learning approach designed to identify and characterize BGCs from such data. To do this, we represent BGCs as chains of functional protein domains and train a masked language model on these domains. We assess the ability of our approach to detect BGCs and characterize BGC properties in bacterial genomes. We also demonstrate that our model can learn meaningful representations of BGCs and their constituent domains, detect BGCs in microbial genomes, and predict BGC product classes. These results highlight self-supervised neural networks as a promising framework for improving BGC prediction and classification.


Subject(s)
Biological Products , Genome, Bacterial , Metagenome , Multigene Family/genetics , Biological Products/metabolism , Supervised Machine Learning
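
The abstract above describes representing BGCs as chains of protein domains and training a masked language model on them. The data-preparation step might look roughly like the sketch below. It is not the authors' code: the function name `mask_domains`, the `[MASK]` token, the masking rate, and the example domain labels are all assumptions made for illustration.

```python
import random

MASK = "[MASK]"  # hypothetical mask token, not taken from the paper


def mask_domains(domains, mask_rate=0.15, seed=0):
    """Hide a random subset of domain tokens; return (masked chain, targets)."""
    rng = random.Random(seed)
    # Mask at least one position so every chain yields a training signal.
    n_mask = max(1, round(mask_rate * len(domains)))
    positions = sorted(rng.sample(range(len(domains)), n_mask))
    masked = list(domains)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]  # label a model would be trained to recover
        masked[pos] = MASK
    return masked, targets


# Illustrative BGC written as an ordered chain of domain labels (made up).
bgc = ["PKS_KS", "PKS_AT", "PKS_KR", "PP-binding", "Thioesterase", "ABC_tran"]
masked, targets = mask_domains(bgc, mask_rate=0.3, seed=1)
```

A model trained on such pairs sees only `masked` and is scored on how well it predicts the entries of `targets`, so no product-class labels are needed at this stage.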
6.
PLoS Genet ; 17(8): e1009754, 2021 08.
Article in English | MEDLINE | ID: mdl-34411094

ABSTRACT

In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects, and the hidden layer models the aggregated effects among SNP-sets. We treat the weights and connections of the network as random variables with prior distributions that reflect how genetic effects manifest at different genomic scales. The BANNs software uses variational inference to provide posterior summaries which allow researchers to simultaneously perform (i) mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. Through simulations, we show that our method improves upon state-of-the-art association mapping and enrichment approaches across a wide range of genetic architectures. We then further illustrate the benefits of BANNs by analyzing real GWA data assayed in approximately 2,000 heterogeneous stock mice from the Wellcome Trust Centre for Human Genetics and approximately 7,000 individuals from the Framingham Heart Study. Lastly, using a random subset of individuals of European ancestry from the UK Biobank, we show that BANNs is able to replicate known associations in high and low-density lipoprotein cholesterol content.


Subject(s)
Genome-Wide Association Study/methods , Molecular Sequence Annotation/methods , Animals , Genome/genetics , Genomics/methods , Genotype , Humans , Models, Genetic , Multifactorial Inheritance/genetics , Neural Networks, Computer , Phenotype , Polymorphism, Single Nucleotide/genetics , Software
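
To illustrate the partially connected architecture described in the abstract above, here is a minimal forward pass in which each hidden unit receives input only from the SNPs annotated to its SNP-set. This is a sketch under assumptions, not the BANNs software: the function name `forward`, the fixed point-estimate weights, and the ReLU nonlinearity are illustrative choices, whereas the abstract describes weights treated as random variables and fitted by variational inference.

```python
def forward(x, snp_sets, theta, w, bias_hidden, bias_out):
    """Forward pass of a partially connected one-hidden-layer network.

    x          : genotype values for one individual, one entry per SNP
    snp_sets   : list of SNP-index lists, one list per SNP-set (e.g. a gene)
    theta      : input-layer weights, one per SNP (SNP-level effects)
    w          : output weights, one per SNP-set (set-level effects)
    """
    hidden = []
    for g, idx in enumerate(snp_sets):
        # Partial connectivity: only SNPs annotated to set g feed hidden unit g.
        z = sum(x[j] * theta[j] for j in idx) + bias_hidden[g]
        hidden.append(max(0.0, z))  # ReLU is an assumption made for this sketch
    output = sum(h * wg for h, wg in zip(hidden, w)) + bias_out
    return hidden, output


# Three SNPs, two SNP-sets: SNPs 0 and 1 belong to set 0, SNP 2 to set 1.
hidden, y_hat = forward(
    x=[1.0, 1.0, 1.0],
    snp_sets=[[0, 1], [2]],
    theta=[1.0, 2.0, 3.0],
    w=[0.5, 1.0],
    bias_hidden=[0.0, 0.0],
    bias_out=0.0,
)
```

Because the connections follow the annotation, changing the genotype at SNP 2 cannot alter hidden unit 0, which is what makes each hidden unit readable as the aggregated effect of one SNP-set.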
7.
PLoS Genet ; 17(3): e1008887, 2021 03.
Article in English | MEDLINE | ID: mdl-33735180

ABSTRACT

The winged insects of the order Diptera are colloquially named for their most recognizable phenotype: flight. These insects rely on flight for a number of important life history traits, such as dispersal, foraging, and courtship. Despite the importance of flight, relatively little is known about the genetic architecture of flight performance. Accordingly, we sought to uncover the genetic modifiers of flight using a measure of flies' reaction and response to an abrupt drop in a vertical flight column. We conducted a genome-wide association study (GWAS) using 197 of the Drosophila Genetic Reference Panel (DGRP) lines, and identified a combination of additive and marginal variants, epistatic interactions, whole genes, and enrichment across interaction networks. Egfr, a highly pleiotropic developmental gene, was among the most significant additive variants identified. We functionally validated 13 of the additive candidate genes (Adgf-A/Adgf-A2/CG32181, bru1, CadN, flapper (CG11073), CG15236, flippy (CG9766), CREG, Dscam4, form3, fry, Lasp/CG9692, Pde6, Snoo), and introduce a novel approach to whole gene significance screens: PEGASUS_flies. Additionally, we identified ppk23, an Acid Sensing Ion Channel (ASIC) homolog, as an important hub for epistatic interactions. We propose a model that suggests genetic modifiers of wing and muscle morphology, nervous system development and function, BMP signaling, sexually dimorphic neural wiring, and gene regulation are all important for the observed differences in flight performance in a natural population. Additionally, these results represent a snapshot of the genetic modifiers affecting drop-response flight performance in Drosophila, with implications for other insects.


Subject(s)
Drosophila melanogaster/genetics , Drosophila/genetics , Gene Expression Regulation, Developmental , Genetic Variation , Neurogenesis/genetics , Animals , Drosophila/embryology , Drosophila melanogaster/metabolism , Epigenesis, Genetic , Female , Flight, Animal , Genetic Association Studies , Male , Phenotype , Polymorphism, Single Nucleotide
8.
PLoS Comput Biol ; 18(5): e1010045, 2022 05.
Article in English | MEDLINE | ID: mdl-35500014

ABSTRACT

Identifying structural differences among proteins can be a non-trivial task. When contrasting ensembles of protein structures obtained from molecular dynamics simulations, biologically-relevant features can be easily overshadowed by spurious fluctuations. Here, we present SINATRA Pro, a computational pipeline designed to robustly identify topological differences between two sets of protein structures. Algorithmically, SINATRA Pro works by first taking in the 3D atomic coordinates for each protein snapshot and summarizing them according to their underlying topology. Statistically significant topological features are then projected back onto a user-selected representative protein structure, thus facilitating the visual identification of biophysical signatures of different protein ensembles. We assess the ability of SINATRA Pro to detect minute conformational changes in five independent protein systems of varying complexities. In all test cases, SINATRA Pro identifies known structural features that have been validated by previous experimental and computational studies, as well as novel features that are also likely to be biologically-relevant according to the literature. These results highlight SINATRA Pro as a promising method for facilitating the non-trivial task of pattern recognition in trajectories resulting from molecular dynamics simulations, with substantially increased resolution.


Subject(s)
Data Science , Molecular Dynamics Simulation , Biophysics , Protein Conformation , Proteins/chemistry
9.
PLoS Genet ; 16(6): e1008855, 2020 06.
Article in English | MEDLINE | ID: mdl-32542026

ABSTRACT

Traditional univariate genome-wide association studies generate false positives and negatives due to difficulties distinguishing associated variants from variants with spurious nonzero effects that do not directly influence the trait. Recent efforts have been directed at identifying genes or signaling pathways enriched for mutations in quantitative traits or case-control studies, but these can be computationally costly and hampered by strict model assumptions. Here, we present gene-ε, a new approach for identifying statistical associations between sets of variants and quantitative traits. Our key insight is that enrichment studies on the gene-level are improved when we reformulate the genome-wide SNP-level null hypothesis to identify spurious small-to-intermediate SNP effects and classify them as non-causal. gene-ε efficiently identifies enriched genes under a variety of simulated genetic architectures, achieving greater than a 90% true positive rate at 1% false positive rate for polygenic traits. Lastly, we apply gene-ε to summary statistics derived from six quantitative traits using European-ancestry individuals in the UK Biobank, and identify enriched genes that are in biologically relevant pathways.


Subject(s)
Genome-Wide Association Study/statistics & numerical data , Models, Genetic , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics , Data Interpretation, Statistical , Databases, Genetic/statistics & numerical data , Humans , United Kingdom , White People/genetics
10.
PLoS Genet ; 15(2): e1007978, 2019 02.
Article in English | MEDLINE | ID: mdl-30735486

ABSTRACT

Linear mixed effect models are powerful tools used to account for population structure in genome-wide association studies (GWASs) and estimate the genetic architecture of complex traits. However, fully-specified models are computationally demanding and common simplifications often lead to reduced power or biased inference. We describe Grid-LMM (https://github.com/deruncie/GridLMM), an extendable algorithm for repeatedly fitting complex linear models that account for multiple sources of heterogeneity, such as additive and non-additive genetic variance, spatial heterogeneity, and genotype-environment interactions. Grid-LMM can compute approximate (yet highly accurate) frequentist test statistics or Bayesian posterior summaries at a genome-wide scale in a fraction of the time compared to existing general-purpose methods. We apply Grid-LMM to two types of quantitative genetic analyses. The first is focused on accounting for spatial variability and non-additive genetic variance while scanning for QTL; and the second aims to identify gene expression traits affected by non-additive genetic variation. In both cases, modeling multiple sources of heterogeneity leads to new discoveries.


Subject(s)
Algorithms , Linear Models , Models, Genetic , Animals , Arabidopsis/genetics , Arabidopsis/growth & development , Bayes Theorem , Body Weight/genetics , Computer Simulation , Flowers/genetics , Flowers/growth & development , Gene-Environment Interaction , Genetic Markers , Genetic Variation , Genome-Wide Association Study/statistics & numerical data , Humans , Mice , Quantitative Trait Loci
11.
PLoS Genet ; 13(7): e1006869, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28746338

ABSTRACT

Epistasis, commonly defined as the interaction between multiple genes, is an important genetic component underlying phenotypic variation. Many statistical methods have been developed to model and identify epistatic interactions between genetic variants. However, because of the large combinatorial search space of interactions, most epistasis mapping methods face enormous computational challenges and often suffer from low statistical power due to multiple test correction. Here, we present a novel, alternative strategy for mapping epistasis: instead of directly identifying individual pairwise or higher-order interactions, we focus on mapping variants that have non-zero marginal epistatic effects, defined as the combined pairwise interaction effects between a given variant and all other variants. By testing marginal epistatic effects, we can identify candidate variants that are involved in epistasis without the need to identify the exact partners with which the variants interact, thus potentially alleviating much of the statistical and computational burden associated with standard epistatic mapping procedures. Our method is based on a variance component model, and relies on a recently developed variance component estimation method for efficient parameter inference and p-value computation. We refer to our method as the "MArginal ePIstasis Test", or MAPIT. With simulations, we show how MAPIT can be used to estimate and test marginal epistatic effects, produce calibrated test statistics under the null, and facilitate the detection of pairwise epistatic interactions. We further illustrate the benefits of MAPIT in a QTL mapping study by analyzing the gene expression data of over 400 individuals from the GEUVADIS consortium.


Subject(s)
Chromosome Mapping/statistics & numerical data , Epistasis, Genetic , Models, Genetic , Quantitative Trait Loci/genetics , Algorithms , Computer Simulation , Gene Expression Regulation , Genome-Wide Association Study/statistics & numerical data , Humans , Phenotype
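
The abstract above says the method tests the combined pairwise interaction effects between a focal variant and all other variants through a variance component model. One common way to express that idea is with two covariance matrices, sketched below. The construction `G_k = D_k K_k D_k` with `D_k = diag(x_k)` is stated here from memory of this style of model, so treat it, along with the function name and the toy genotype matrix, as an assumption rather than a reproduction of the MAPIT software.

```python
def marginal_epistasis_kernels(X, k):
    """Build the two covariance matrices for focal variant k.

    X : n-by-p genotype matrix given as a list of rows
        (in practice columns would be centered and scaled; omitted here)
    k : column index of the focal variant
    """
    n, p = len(X), len(X[0])
    others = [j for j in range(p) if j != k]
    # K_k: relatedness computed from every variant except the focal one.
    K = [[sum(X[a][j] * X[b][j] for j in others) / (p - 1) for b in range(n)]
         for a in range(n)]
    # G_k = D_k K_k D_k: covariance implied by summing all pairwise
    # interactions between variant k and each of the other variants.
    G = [[X[a][k] * K[a][b] * X[b][k] for b in range(n)] for a in range(n)]
    return K, G


# Toy data: 3 individuals, 3 variants; variant 0 is the focal variant.
X = [[1, 0, 2],
     [0, 1, 1],
     [2, 1, 0]]
K, G = marginal_epistasis_kernels(X, k=0)
```

Testing whether the variance component attached to `G` is zero then asks about all interactions involving variant `k` at once, which is why the interaction partners never need to be enumerated.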
12.
Am J Physiol Cell Physiol ; 317(2): C155-C166, 2019 08 01.
Article in English | MEDLINE | ID: mdl-30917031

ABSTRACT

Many different subpopulations of subcellular extracellular vesicles (EVs) have been described. EVs are released from all cell types and have been shown to regulate normal physiological homeostasis, as well as pathological states by influencing cell proliferation, differentiation, organ homing, injury and recovery, as well as disease progression. In this review, we focus on the bidirectional actions of vesicles from normal and diseased cells on normal or leukemic target cells; and on the leukemic microenvironment as a whole. EVs from human bone marrow mesenchymal stem cells (MSC) can have a healing effect, reversing the malignant phenotype in prostate and colorectal cancer, as well as mitigating radiation damage to marrow. The role of EVs in leukemia and their bimodal cross talk with the encompassing microenvironment remains to be fully characterized. This may provide insight for clinical advances via the application of EVs as potential therapy and the employment of statistical and machine learning models to capture the pleiotropic effects EVs endow to a dynamic microenvironment, possibly allowing for precise therapeutic intervention.


Subject(s)
Biomarkers, Tumor/metabolism , Extracellular Vesicles/metabolism , Leukemia/metabolism , Mesenchymal Stem Cells/metabolism , Neoplastic Stem Cells/metabolism , Tumor Microenvironment , Animals , Antineoplastic Agents/therapeutic use , Biomarkers, Tumor/genetics , Cell Communication , Drug Resistance, Neoplasm , Extracellular Vesicles/drug effects , Extracellular Vesicles/genetics , Extracellular Vesicles/pathology , Humans , Leukemia/drug therapy , Leukemia/genetics , Leukemia/pathology , Machine Learning , Mesenchymal Stem Cells/drug effects , Mesenchymal Stem Cells/pathology , Neoplastic Stem Cells/drug effects , Neoplastic Stem Cells/pathology , Phenotype , Signal Transduction , Systems Biology/methods
13.
Elife ; 13, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38913556

ABSTRACT

LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in simulations, and by re-analyzing 25 well-studied quantitative phenotypes from 349,468 individuals in the UK Biobank and up to 159,095 individuals in BioBank Japan, we show that the inclusion of a cis-interaction score (i.e. interactions between a focal variant and proximal variants) recovers genetic variance that is not captured by LDSC. For each of the 25 traits analyzed in the UK Biobank and BioBank Japan, i-LDSC detects additional variation contributed by genetic interactions. The i-LDSC software and its application to these biobanks represent a step towards resolving further genetic contributions of sources of non-additive genetic effects to complex trait variation.


Subject(s)
Genome-Wide Association Study , Genome-Wide Association Study/methods , Humans , Japan , United Kingdom , Polymorphism, Single Nucleotide/genetics , Models, Genetic , Phenotype , Genetic Variation , Multifactorial Inheritance/genetics , Biological Specimen Banks
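
For readers unfamiliar with the LDSC idea referenced in the abstract above, the core step is a regression of per-variant chi-square statistics on LD scores, with heritability read off the slope. The sketch below uses plain ordinary least squares and the standard relation slope = N h² / M. It is an illustration only: the function name `ldsc_h2` and the toy inputs are assumptions, real LDSC implementations use weighted regression with jackknife standard errors, and the cis-interaction score that i-LDSC adds as a second regressor is not implemented here.

```python
def ldsc_h2(chi2, ld_scores, n_samples):
    """Estimate heritability from the slope of chi-square on LD score.

    chi2      : GWAS chi-square statistic for each variant
    ld_scores : LD score for each variant
    n_samples : GWAS sample size N
    """
    m = len(chi2)  # number of variants M
    mean_l = sum(ld_scores) / m
    mean_c = sum(chi2) / m
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    # Under the LDSC model the slope equals N * h2 / M, so h2 = slope * M / N.
    return slope * m / n_samples, intercept


# Toy, noise-free inputs built to follow the model with a slope of 80.
ld = [1.0, 2.0, 3.0, 4.0, 5.0]
chi2 = [80.0 * l + 1.0 for l in ld]
h2, intercept = ldsc_h2(chi2, ld, n_samples=1000)
```

In an interaction-aware extension of this kind, a second score per variant would enter the regression alongside the LD score, and its coefficient would capture variance that the additive slope alone misses.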
14.
bioRxiv ; 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38260340

ABSTRACT

Understanding morphological variation is an important task in many areas of computational biology. Recent studies have focused on developing computational tools for the task of sub-image selection, which aims at identifying structural features that best describe the variation between classes of shapes. A major part in assessing the utility of these approaches is to demonstrate their performance on both simulated and real datasets. However, when creating a model for shape statistics, real data can be difficult to access and the sample sizes for these data are often small due to them being expensive to collect. Meanwhile, the current landscape of generative models for shapes has been mostly limited to approaches that use black-box inference, making it difficult to systematically assess the power and calibration of sub-image models. In this paper, we introduce the α-shape sampler: a probabilistic framework for generating realistic 2D and 3D shapes based on probability distributions which can be learned from real data. We demonstrate our framework using proof-of-concept examples and in two real applications in biology where we generate (i) 2D images of healthy and septic neutrophils and (ii) 3D computed tomography (CT) scans of primate mandibular molars. The α-shape sampler R package is open-source and can be downloaded at https://github.com/lcrawlab/ashapesampler.

15.
bioRxiv ; 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38405697

ABSTRACT

Clustering is commonly used in single-cell RNA-sequencing (scRNA-seq) pipelines to characterize cellular heterogeneity. However, current methods face two main limitations. First, they require user-specified heuristics which add time and complexity to bioinformatic workflows; second, they rely on post-selective differential expression analyses to identify marker genes driving cluster differences, which has been shown to be subject to inflated false discovery rates. We address these challenges by introducing nonparametric clustering of single-cell populations (NCLUSION): an infinite mixture model that leverages Bayesian sparse priors to identify marker genes while simultaneously performing clustering on single-cell expression data. NCLUSION uses a scalable variational inference algorithm to perform these analyses on datasets with up to millions of cells. By analyzing publicly available scRNA-seq studies, we demonstrate that NCLUSION (i) matches the performance of other state-of-the-art clustering techniques with significantly reduced runtime and (ii) provides statistically robust and biologically relevant transcriptomic signatures for each of the clusters it identifies. Overall, NCLUSION represents a reliable hypothesis-generating tool for understanding patterns of expression variation present in single-cell populations.

16.
Nat Biotechnol ; 2024 Oct 07.
Article in English | MEDLINE | ID: mdl-39375446

ABSTRACT

High-throughput phenotypic screens using biochemical perturbations and high-content readouts are constrained by limitations of scale. To address this, we establish a method of pooling exogenous perturbations followed by computational deconvolution to reduce required sample size, labor and cost. We demonstrate the increased efficiency of compressed experimental designs compared to conventional approaches through benchmarking with a bioactive small-molecule library and a high-content imaging readout. We then apply compressed screening in two biological discovery campaigns. In the first, we use early-passage pancreatic cancer organoids to map transcriptional responses to a library of recombinant tumor microenvironment protein ligands, uncovering reproducible phenotypic shifts induced by specific ligands distinct from canonical reference signatures and correlated with clinical outcome. In the second, we identify the pleiotropic modulatory effects of a chemical compound library with known mechanisms of action on primary human peripheral blood mononuclear cell immune responses. In sum, our approach empowers phenotypic screens with information-rich readouts to advance drug discovery efforts and basic biological inquiry.

17.
bioRxiv ; 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38915726

ABSTRACT

Efforts to cure BCR::ABL1 B cell acute lymphoblastic leukemia (Ph+ ALL) solely through inhibition of ABL1 kinase activity have thus far been insufficient despite the availability of tyrosine kinase inhibitors (TKIs) with broad activity against resistance mutants. The mechanisms that drive persistence within minimal residual disease (MRD) remain poorly understood and therefore untargeted. Utilizing 13 patient-derived xenograft (PDX) models and clinical trial specimens of Ph+ ALL, we examined how genetic and transcriptional features co-evolve to drive progression during prolonged TKI response. Our work reveals a landscape of cooperative mutational and transcriptional escape mechanisms that differ from those causing resistance to first generation TKIs. By analyzing MRD during remission, we show that the same resistance mutation can either increase or decrease cellular fitness depending on transcriptional state. We further demonstrate that directly targeting transcriptional state-associated vulnerabilities at MRD can overcome BCR::ABL1 independence, suggesting a new paradigm for rationally eradicating MRD prior to relapse. Finally, we illustrate how cell mass measurements of leukemia cells can be used to rapidly monitor dominant transcriptional features of Ph+ ALL to help rationally guide therapeutic selection from low-input samples.

18.
G3 (Bethesda) ; 13(8), 2023 08 09.
Article in English | MEDLINE | ID: mdl-37243672

ABSTRACT

Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping. In this study, we present the "multivariate MArginal ePIstasis Test" (mvMAPIT), a multi-outcome generalization of a recently proposed epistatic detection method which seeks to detect marginal epistasis, or the combined pairwise interaction effects between a given variant and all other variants. By searching for marginal epistatic effects, one can identify genetic variants that are involved in epistasis without the need to identify the exact partners with which the variants interact, thus potentially alleviating much of the statistical and computational burden associated with conventional explicit search-based methods. Our proposed mvMAPIT builds upon this strategy by taking advantage of correlation structure between traits to improve the identification of variants involved in epistasis. We formulate mvMAPIT as a multivariate linear mixed model and develop a multitrait variance component estimation algorithm for efficient parameter inference and P-value computation. Together with reasonable model approximations, our proposed approach is scalable to moderately sized genome-wide association studies. With simulations, we illustrate the benefits of mvMAPIT over univariate (or single-trait) epistatic mapping strategies. We also apply the mvMAPIT framework to protein sequence data from two broadly neutralizing anti-influenza antibodies and approximately 2,000 heterogeneous stock mice from the Wellcome Trust Centre for Human Genetics. The mvMAPIT R package can be downloaded at https://github.com/lcrawlab/mvMAPIT.


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Humans , Animals , Mice , Phenotype , Quantitative Trait Loci , Algorithms
19.
iScience ; 25(7): 104553, 2022 Jul 15.
Article in English | MEDLINE | ID: mdl-35769876

ABSTRACT

In this paper, we propose a new approach for variable selection using a collection of Bayesian neural networks with a focus on quantifying uncertainty over which variables are selected. Motivated by fine-mapping applications in statistical genetics, we refer to our framework as an "ensemble of single-effect neural networks" (ESNN) which generalizes the "sum of single effects" regression framework by both accounting for nonlinear structure in genotypic data (e.g., dominance effects) and having the capability to model discrete phenotypes (e.g., case-control studies). Through extensive simulations, we demonstrate our method's ability to produce calibrated posterior summaries such as credible sets and posterior inclusion probabilities, particularly for traits with genetic architectures that have significant proportions of non-additive variation driven by correlated variants. Lastly, we use real data to demonstrate that the ESNN framework improves upon the state of the art for identifying true effect variables underlying various complex traits.

20.
Genome Biol ; 22(1): 213, 2021 07 23.
Article in English | MEDLINE | ID: mdl-34301310

ABSTRACT

Large-scale phenotype data can enhance the power of genomic prediction in plant and animal breeding, as well as human genetics. However, the statistical foundation of multi-trait genomic prediction is based on the multivariate linear mixed effect model, a tool notorious for its fragility when applied to more than a handful of traits. We present MegaLMM, a statistical framework and associated software package for mixed model analyses of a virtually unlimited number of traits. Using three examples with real plant data, we show that MegaLMM can leverage thousands of traits at once to significantly improve genetic value prediction accuracy.


Subject(s)
Arabidopsis/genetics , Genome, Plant , Models, Genetic , Quantitative Trait, Heritable , Software , Triticum/genetics , Zea mays/genetics , Bayes Theorem , Gene-Environment Interaction , Genomics , Genotype , Humans , Phenotype , Plant Breeding