Search | BVS Integralidade em Saúde

1.

Transcriptome variation in human tissues revealed by long-read sequencing.

Glinos, Dafni A; Garborcauskas, Garrett; Hoffman, Paul; Ehsan, Nava; Jiang, Lihua; Gokden, Alper; Dai, Xiaoguang; Aguet, François; Brown, Kathleen L; Garimella, Kiran; Bowers, Tera; Costello, Maura; Ardlie, Kristin; Jian, Ruiqi; Tucker, Nathan R; Ellinor, Patrick T; Harrington, Eoghan D; Tang, Hua; Snyder, Michael; Juul, Sissel; Mohammadi, Pejman; MacArthur, Daniel G; Lappalainen, Tuuli; Cummings, Beryl B.

Nature ; 608(7922): 353-359, 2022 08.

Article in English | MEDLINE | ID: mdl-35922509

ABSTRACT

Regulation of transcript structure generates transcript diversity and plays an important role in human disease. The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure. In this Article, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from Genotype-Tissue Expression (GTEx) tissues and cell lines, complementing the GTEx resource. We identified just over 70,000 novel transcripts for annotated genes, and validated the protein expression of 10% of novel transcripts. We developed a new computational package, LORALS, to analyse the genetic effects of rare and common variants on the transcriptome by allele-specific analysis of long reads. We characterized allele-specific expression and transcript structure events, providing new insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb the transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we used this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.

Subjects

Alleles, Gene Expression Profiling, Organ Specificity, RNA-Seq, Transcriptome, Alternative Splicing/genetics, Cell Line, Datasets as Topic, Genotype, Heterogeneous-Nuclear Ribonucleoproteins/deficiency, Heterogeneous-Nuclear Ribonucleoproteins/genetics, Humans, Organ Specificity/genetics, Polypyrimidine Tract-Binding Protein/deficiency, Polypyrimidine Tract-Binding Protein/genetics, Reproducibility of Results, Transcriptome/genetics

2.

Transcription factor regulation of eQTL activity across individuals and tissues.

Flynn, Elise D; Tsu, Athena L; Kasela, Silva; Kim-Hellmuth, Sarah; Aguet, Francois; Ardlie, Kristin G; Bussemaker, Harmen J; Mohammadi, Pejman; Lappalainen, Tuuli.

PLoS Genet ; 18(1): e1009719, 2022 01.

Article in English | MEDLINE | ID: mdl-35100260

ABSTRACT

Tens of thousands of genetic variants associated with gene expression (cis-eQTLs) have been discovered in the human population. These eQTLs are active in various tissues and contexts, but the molecular mechanisms of eQTL variability are poorly understood, hindering our understanding of genetic regulation across biological contexts. Since many eQTLs are believed to act by altering transcription factor (TF) binding affinity, we hypothesized that analyzing eQTL effect size as a function of TF level may allow discovery of mechanisms of eQTL variability. Using GTEx Consortium eQTL data from 49 tissues, we analyzed the interaction between eQTL effect size and TF level across tissues and across individuals within specific tissues and generated a list of 10,098 TF-eQTL interactions across 2,136 genes that are supported by at least two lines of evidence. These TF-eQTLs were enriched for various TF binding measures, supporting with orthogonal evidence that these eQTLs are regulated by the implicated TFs. We also found that our TF-eQTLs tend to overlap genes with gene-by-environment regulatory effects and to colocalize with GWAS loci, implying that our approach can help to elucidate mechanisms of context-specificity and trait associations. Finally, we highlight an interesting example of IKZF1 TF regulation of an APBB1IP gene eQTL that colocalizes with a GWAS signal for blood cell traits. Together, our findings provide candidate TF mechanisms for a large number of eQTLs and offer a generalizable approach for researchers to discover TF regulators of genetic variant effects in additional QTL datasets.

Subjects

Quantitative Trait Loci, Transcription Factors/physiology, Alleles, Binding Sites, Gene Knockdown Techniques, Gene-Environment Interaction, Genome-Wide Association Study, Humans, Interferon Regulatory Factor-1/genetics, Genetic Models, Phenotype, Transcription Factors/metabolism

3.

The regulatory landscape of multiple brain regions in outbred heterogeneous stock rats.

Munro, Daniel; Wang, Tengfei; Chitre, Apurva S; Polesskaya, Oksana; Ehsan, Nava; Gao, Jianjun; Gusev, Alexander; Woods, Leah C Solberg; Saba, Laura M; Chen, Hao; Palmer, Abraham A; Mohammadi, Pejman.

Nucleic Acids Res ; 50(19): 10882-10895, 2022 10 28.

Article in English | MEDLINE | ID: mdl-36263809

ABSTRACT

Heterogeneous Stock (HS) rats are a genetically diverse outbred rat population that is widely used for studying genetics of behavioral and physiological traits. Mapping Quantitative Trait Loci (QTL) associated with transcriptional changes would help to identify mechanisms underlying these traits. We generated genotype and transcriptome data for five brain regions from 88 HS rats. We identified 21 392 cis-QTLs associated with expression and splicing changes across all five brain regions and validated their effects using allele specific expression data. We identified 80 cases where eQTLs were colocalized with genome-wide association study (GWAS) results from nine physiological traits. Comparing our dataset to human data from the Genotype-Tissue Expression (GTEx) project, we found that the HS rat data yields twice as many significant eQTLs as a similarly sized human dataset. We also identified a modest but highly significant correlation between genetic regulatory variation among orthologous genes. Surprisingly, we found less genetic variation in gene regulation in HS rats relative to humans, though we still found eQTLs for the orthologs of many human genes for which eQTLs had not been found. These data are available from the RatGTEx data portal (RatGTEx.org) and will enable new discoveries of the genetic influences of complex traits.

Subjects

Genome-Wide Association Study, Quantitative Trait Loci, Animals, Rats, Humans, Quantitative Trait Loci/genetics, Transcriptome, Genotype, Brain, Single Nucleotide Polymorphism

4.

Genetic variation near CXCL12 is associated with susceptibility to HIV-related non-Hodgkin lymphoma.

Thorball, Christian W; Oudot-Mellakh, Tiphaine; Ehsan, Nava; Hammer, Christian; Santoni, Federico A; Niay, Jonathan; Costagliola, Dominique; Goujard, Cécile; Meyer, Laurence; Wang, Sophia S; Hussain, Shehnaz K; Theodorou, Ioannis; Cavassini, Matthias; Rauch, Andri; Battegay, Manuel; Hoffmann, Matthias; Schmid, Patrick; Bernasconi, Enos; Günthard, Huldrych F; Mohammadi, Pejman; McLaren, Paul J; Rabkin, Charles S; Besson, Caroline; Fellay, Jacques.

Haematologica ; 106(8): 2233-2241, 2021 08 01.

Article in English | MEDLINE | ID: mdl-32675224

ABSTRACT

Human immunodeficiency virus (HIV) infection is associated with an increased risk of non-Hodgkin lymphoma (NHL). Even in the era of suppressive antiretroviral treatment, HIV-infected individuals remain at higher risk of developing NHL compared to the general population. To identify potential genetic risk loci, we performed case-control genome-wide association studies and a meta-analysis across three cohorts of HIV+ patients of European ancestry, including a total of 278 cases and 1924 matched controls. We observed a significant association with NHL susceptibility in the C-X-C motif chemokine ligand 12 (CXCL12) region on chromosome 10. A fine mapping analysis identified rs7919208 as the most likely causal variant (P = 4.77e-11), with the G>A polymorphism creating a new transcription factor binding site for BATF and JUND. These results suggest a modulatory role of CXCL12 regulation in the increased susceptibility to NHL observed in the HIV-infected population.

Subjects

HIV Infections, AIDS-Related Lymphoma, Non-Hodgkin Lymphoma, Anti-Retroviral Agents/therapeutic use, Case-Control Studies, Chemokine CXCL12, Genome-Wide Association Study, HIV Infections/complications, HIV Infections/drug therapy, HIV Infections/genetics, Humans, AIDS-Related Lymphoma/drug therapy, Non-Hodgkin Lymphoma/drug therapy, Non-Hodgkin Lymphoma/genetics, Genetic Polymorphism
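
The abstract for entry 4 mentions a meta-analysis across three cohorts but does not say which method was used. As an illustration only, the sketch below shows a fixed-effect inverse-variance-weighted combination, a common choice for GWAS meta-analysis. The function name `ivw_meta` and the example inputs are hypothetical and are not taken from the paper.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis (illustrative sketch).

    betas: per-cohort effect estimates (e.g. log odds ratios)
    ses:   per-cohort standard errors
    Returns (pooled beta, pooled standard error, z-score, two-sided p-value).
    """
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical per-cohort estimates, for illustration only.
pooled_beta, pooled_se, z, p = ivw_meta([0.60, 0.45, 0.70], [0.15, 0.20, 0.25])
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why a large cohort can drive a signal that smaller cohorts only weakly support.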

5.

Quantifying the regulatory effect size of cis-acting genetic variation using allelic fold change.

Mohammadi, Pejman; Castel, Stephane E; Brown, Andrew A; Lappalainen, Tuuli.

Genome Res ; 27(11): 1872-1884, 2017 11.

Article in English | MEDLINE | ID: mdl-29021289

ABSTRACT

Mapping cis-acting expression quantitative trait loci (cis-eQTL) has become a popular approach for characterizing proximal genetic regulatory variants. In this paper, we describe and characterize log allelic fold change (aFC), the magnitude of expression change associated with a given genetic variant, as a biologically interpretable unit for quantifying the effect size of cis-eQTLs and a mathematically convenient approach for systematic modeling of cis-regulation. This measure is mathematically independent from expression level and allele frequency, additive, applicable to multiallelic variants, and generalizable to multiple independent variants. We provide efficient tools and guidelines for estimating aFC from both eQTL and allelic expression data sets and apply it to Genotype Tissue Expression (GTEx) data. We show that aFC estimates independently derived from eQTL and allelic expression data are highly consistent, and identify technical and biological correlates of eQTL effect size. We generalize aFC to analyze genes with two eQTLs in GTEx and show that in nearly all cases the two eQTLs act independently in regulating gene expression. In summary, aFC is a solid measure of cis-regulatory effect size that allows quantitative interpretation of cellular regulatory events from population data, and it is a valuable approach for investigating novel aspects of eQTL data sets.

Subjects

Gene Expression Profiling/methods, Gene Regulatory Networks, Quantitative Trait Loci, Alleles, Genetic Databases, Gene Expression, Genetic Variation, Humans, Theoretical Models
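
To make the idea in entry 5 concrete, here is a minimal sketch of a log2 allelic fold change computed from allele-specific read counts, together with the total expression expected for each genotype. It assumes that a gene's expression is the sum of its two haplotypes. The function names, the pseudocount, and that additive-haplotype reading are assumptions made for illustration; they do not describe the authors' released tools.

```python
import math

def log_afc_from_counts(ref_count, alt_count, pseudocount=1.0):
    """log2 ratio of alternative-allele to reference-allele expression.

    The pseudocount is an assumption of this sketch; it avoids log(0)
    when one allele has no reads.
    """
    return math.log2((alt_count + pseudocount) / (ref_count + pseudocount))

def expected_expression(ref_haplotype_expr, log_afc, alt_allele_count):
    """Total expression for a genotype carrying 0, 1, or 2 alternative alleles.

    Assumes the two haplotypes contribute additively and that a haplotype
    carrying the alternative allele is expressed at 2**log_afc times the
    reference haplotype.
    """
    alt_haplotype_expr = ref_haplotype_expr * 2.0 ** log_afc
    return ((2 - alt_allele_count) * ref_haplotype_expr
            + alt_allele_count * alt_haplotype_expr)

s = log_afc_from_counts(ref_count=99, alt_count=199)  # (199+1)/(99+1) = 2
by_genotype = [expected_expression(10.0, s, g) for g in (0, 1, 2)]
```

Because the effect is expressed as a ratio between haplotypes, the same value of `s` applies whether the gene is lowly or highly expressed, which is consistent with the abstract's point about independence from expression level.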

6.

BMix: probabilistic modeling of occurring substitutions in PAR-CLIP data.

Golumbeanu, Monica; Mohammadi, Pejman; Beerenwinkel, Niko.

Bioinformatics ; 32(7): 976-83, 2016 04 01.

Article in English | MEDLINE | ID: mdl-26342229

ABSTRACT

MOTIVATION: Photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP) is an experimental method based on next-generation sequencing for identifying the RNA interaction sites of a given protein. The method deliberately inserts T-to-C substitutions at the RNA-protein interaction sites, which provides a second layer of evidence compared with other CLIP methods. However, the experiment includes several sources of noise which cause both low-frequency errors and spurious high-frequency alterations. Therefore, rigorous statistical analysis is required in order to separate true T-to-C base changes, following cross-linking, from noise. So far, most of the existing PAR-CLIP data analysis methods focus on discarding the low-frequency errors and rely on high-frequency substitutions to report binding sites, not taking into account the possibility of high-frequency false positive substitutions. RESULTS: Here, we introduce BMix, a new probabilistic method which explicitly accounts for the sources of noise in PAR-CLIP data and distinguishes cross-link induced T-to-C substitutions from low and high-frequency erroneous alterations. We demonstrate the superior speed and accuracy of our method compared with existing approaches on both simulated and real, publicly available human datasets. AVAILABILITY AND IMPLEMENTATION: The model is freely accessible within the BMix toolbox at www.cbg.bsse.ethz.ch/software/BMix, available for Matlab and R. SUPPLEMENTARY INFORMATION: Supplementary data is available at Bioinformatics online. CONTACT: niko.beerenwinkel@bsse.ethz.ch.

Subjects

High-Throughput Nucleotide Sequencing, Statistical Models, Binding Sites, Humans, Immunoprecipitation, RNA, RNA Sequence Analysis

7.

TiMEx: a waiting time model for mutually exclusive cancer alterations.

Constantinescu, Simona; Szczurek, Ewa; Mohammadi, Pejman; Rahnenführer, Jörg; Beerenwinkel, Niko.

Bioinformatics ; 32(7): 968-75, 2016 04 01.

Article in English | MEDLINE | ID: mdl-26163509

ABSTRACT

MOTIVATION: Despite recent technological advances in genomic sciences, our understanding of cancer progression and its driving genetic alterations remains incomplete. RESULTS: We introduce TiMEx, a generative probabilistic model for detecting patterns of various degrees of mutual exclusivity across genetic alterations, which can indicate pathways involved in cancer progression. TiMEx explicitly accounts for the temporal interplay between the waiting times to alterations and the observation time. In simulation studies, we show that our model outperforms previous methods for detecting mutual exclusivity. On large-scale biological datasets, TiMEx identifies gene groups with strong functional biological relevance, while also proposing new candidates for biological validation. TiMEx possesses several advantages over previous methods, including a novel generative probabilistic model of tumorigenesis, direct estimation of the probability of mutual exclusivity interaction, computational efficiency and high sensitivity in detecting gene groups involving low-frequency alterations. AVAILABILITY AND IMPLEMENTATION: TiMEx is available as a Bioconductor R package at www.bsse.ethz.ch/cbg/software/TiMEx CONTACT: niko.beerenwinkel@bsse.ethz.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genomics, Theoretical Models, Neoplasms/genetics, Algorithms, Humans, Mutation, Software

8.

Correction: 24 Hours in the Life of HIV-1 in a T Cell Line.

Mohammadi, Pejman; Desfarges, Sébastien; Bartha, István; Joos, Beda; Zangger, Nadine; Muñoz, Miguel; Günthard, Huldrych F; Beerenwinkel, Niko; Telenti, Amalio; Ciuffi, Angela.

PLoS Pathog ; 11(6): e1005006, 2015 Jun.

Article in English | MEDLINE | ID: mdl-26076473

ABSTRACT

[This corrects the article DOI: 10.1371/journal.ppat.1003161.].

9.

Dynamics of HIV latency and reactivation in a primary CD4+ T cell model.

Mohammadi, Pejman; di Iulio, Julia; Muñoz, Miguel; Martinez, Raquel; Bartha, István; Cavassini, Matthias; Thorball, Christian; Fellay, Jacques; Beerenwinkel, Niko; Ciuffi, Angela; Telenti, Amalio.

PLoS Pathog ; 10(5): e1004156, 2014 May.

Article in English | MEDLINE | ID: mdl-24875931

ABSTRACT

HIV latency is a major obstacle to curing infection. Current strategies to eradicate HIV aim at increasing transcription of the latent provirus. In the present study we observed that latently infected CD4+ T cells from HIV-infected individuals failed to produce viral particles upon ex vivo exposure to SAHA (vorinostat), despite effective inhibition of histone deacetylases. To identify steps that were not susceptible to the action of SAHA or other latency reverting agents, we used a primary CD4+ T cell model, joint host and viral RNA sequencing, and a viral-encoded reporter. This model served to investigate the characteristics of latently infected cells, the dynamics of HIV latency, and the process of reactivation induced by various stimuli. During latency, we observed persistence of viral transcripts but only limited viral translation. Similarly, the reactivating agents SAHA and disulfiram successfully increased viral transcription, but failed to effectively enhance viral translation, mirroring the ex vivo data. This study highlights the importance of post-transcriptional blocks as one mechanism leading to HIV latency that needs to be relieved in order to purge the viral reservoir.

Subjects

CD4-Positive T-Lymphocytes/virology, HIV Infections/virology, HIV-1, Virus Latency/immunology, Virus Replication, CD4-Positive T-Lymphocytes/immunology, Cultured Cells, Humans, Immunological Models, Viral RNA/genetics, Virus Integration/genetics, Virus Latency/genetics

10.

The Characteristics of Heterozygous Protein Truncating Variants in the Human Genome.

Bartha, István; Rausell, Antonio; McLaren, Paul J; Mohammadi, Pejman; Tardaguila, Manuel; Chaturvedi, Nimisha; Fellay, Jacques; Telenti, Amalio.

PLoS Comput Biol ; 11(12): e1004647, 2015 Dec.

Article in English | MEDLINE | ID: mdl-26642228

ABSTRACT

Sequencing projects have identified large numbers of rare stop-gain and frameshift variants in the human genome. As most of these are observed in the heterozygous state, they test a gene's tolerance to haploinsufficiency and dominant loss of function. We analyzed the distribution of truncating variants across 16,260 autosomal protein coding genes in 11,546 individuals. We observed 39,893 truncating variants affecting 12,062 genes, which significantly differed from an expectation of 12,916 genes under a model of neutral de novo mutation (p < 10⁻⁴). Extrapolating this to increasing numbers of sequenced individuals, we estimate that 10.8% of human genes do not tolerate heterozygous truncating variants. An additional 10 to 15% of truncated genes may be rescued by incomplete penetrance or compensatory mutations, or because the truncating variants are of limited functional impact. The study of protein truncating variants delineates the essential genome and, more generally, identifies rare heterozygous variants as an unexplored source of diversity of phenotypic traits and diseases.

Subjects

Chromosome Mapping/methods, Nonsense Codon/genetics, Genetic Variation/genetics, Human Genome/genetics, Proteins/genetics, Base Sequence, Humans, Molecular Sequence Data

11.

24 hours in the life of HIV-1 in a T cell line.

Mohammadi, Pejman; Desfarges, Sébastien; Bartha, István; Joos, Beda; Zangger, Nadine; Muñoz, Miguel; Günthard, Huldrych F; Beerenwinkel, Niko; Telenti, Amalio; Ciuffi, Angela.

PLoS Pathog ; 9(1): e1003161, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23382686

ABSTRACT

HIV-1 infects CD4+ T cells and completes its replication cycle in approximately 24 hours. We employed repeated measurements in a standardized cell system and rigorous mathematical modeling to characterize the emergence of the viral replication intermediates and their impact on the cellular transcriptional response with high temporal resolution. We observed 7,991 (73%) of the 10,958 expressed genes to be modulated in concordance with key steps of viral replication. Fifty-two percent of the overall variability in the host transcriptome was explained by linear regression on the viral life cycle. This profound perturbation of cellular physiology was investigated in the light of several regulatory mechanisms, including transcription factors, miRNAs, host-pathogen interaction, and proviral integration. Key features were validated in primary CD4+ T cells, and with viral constructs using alternative entry strategies. We propose a model of early massive cellular shutdown and progressive upregulation of the cellular machinery to complete the viral life cycle.

Subjects

CD4-Positive T-Lymphocytes/physiology, Viral Gene Expression Regulation, HIV-1/physiology, Virus Replication/genetics, CD4-Positive T-Lymphocytes/virology, HEK293 Cells, Host-Pathogen Interactions, Humans, Statistical Models, Time Factors, Transcriptome, Up-Regulation

12.

Bioinformatics and HIV latency.

Ciuffi, Angela; Mohammadi, Pejman; Golumbeanu, Monica; di Iulio, Julia; Telenti, Amalio.

Curr HIV/AIDS Rep ; 12(1): 97-106, 2015 Mar.

Article in English | MEDLINE | ID: mdl-25586146

ABSTRACT

Despite effective treatment, HIV is not completely eliminated from the infected organism because of the existence of viral reservoirs. A major reservoir consists of infected resting CD4+ T cells, mostly of memory type, that persist over time due to the stable proviral insertion and a long cellular lifespan. Resting cells do not produce viral particles and are protected from viral-induced cytotoxicity or immune killing. However, these latently infected cells can be reactivated by stochastic events or by external stimuli. The present review focuses on novel genome-wide technologies applied to the study of integration, transcriptome, and proteome characteristics and their recent contribution to the understanding of HIV latency.

Subjects

Computational Biology/methods, HIV/physiology, Virus Latency/physiology, Humans

13.

Analysis of stop-gain and frameshift variants in human innate immunity genes.

Rausell, Antonio; Mohammadi, Pejman; McLaren, Paul J; Bartha, Istvan; Xenarios, Ioannis; Fellay, Jacques; Telenti, Amalio.

PLoS Comput Biol ; 10(7): e1003757, 2014 Jul.

Article in English | MEDLINE | ID: mdl-25058640

ABSTRACT

Loss-of-function variants in innate immunity genes are associated with Mendelian disorders in the form of primary immunodeficiencies. Recent resequencing projects report that stop-gains and frameshifts are collectively prevalent in humans and could be responsible for some of the inter-individual variability in innate immune response. Current computational approaches evaluating loss-of-function in genes carrying these variants rely on gene-level characteristics such as evolutionary conservation and functional redundancy across the genome. However, innate immunity genes represent a particular case because they are more likely to be under positive selection and duplicated. To create a ranking of severity that would be applicable to innate immunity genes we evaluated 17,764 stop-gain and 13,915 frameshift variants from the NHLBI Exome Sequencing Project and 1,000 Genomes Project. Sequence-based features such as loss of functional domains, isoform-specific truncation and nonsense-mediated decay were found to correlate with variant allele frequency and validated with gene expression data. We integrated these features in a Bayesian classification scheme and benchmarked its use in predicting pathogenic variants against Online Mendelian Inheritance in Man (OMIM) disease stop-gains and frameshifts. The classification scheme was applied in the assessment of 335 stop-gains and 236 frameshifts affecting 227 interferon-stimulated genes. The sequence-based score ranks variants in innate immunity genes according to their potential to cause disease, and complements existing gene-based pathogenicity scores. Specifically, the sequence-based score improves measurement of functional gene impairment, discriminates across different variants in a given gene and appears particularly useful for analysis of less conserved genes.

Subjects

Computational Biology/methods, Genetic Databases, Innate Immunity/genetics, Mutation/genetics, Bayes Theorem, Humans, Interferons/genetics, Interferons/metabolism, Virus Diseases/immunology

14.

Recalibrating differential gene expression by genetic dosage variance prioritizes functionally relevant genes.

Rentzsch, Philipp; Kollotzek, Aaron; Mohammadi, Pejman; Lappalainen, Tuuli.

bioRxiv ; 2024 Apr 10.

Article in English | MEDLINE | ID: mdl-38645217

ABSTRACT

Differential expression (DE) analysis is a widely used method for identifying genes that are functionally relevant for an observed phenotype or biological response. However, typical DE analysis includes selection of genes based on a threshold of fold change in expression under the implicit assumption that all genes are equally sensitive to dosage changes of their transcripts. This tends to favor highly variable genes over more constrained genes where even small changes in expression may be biologically relevant. To address this limitation, we have developed a method to recalibrate each gene's differential expression fold change based on genetic expression variance observed in the human population. The newly established metric ranks statistically differentially expressed genes not by nominal change of expression, but by relative change in comparison to natural dosage variation for each gene. We apply our method to RNA sequencing datasets from rare disease and in-vitro stimulus response experiments. Compared to the standard approach, our method adjusts the bias in discovery towards highly variable genes, and enriches for pathways and biological processes related to metabolic and regulatory activity, indicating a prioritization of functionally relevant driver genes. With that, our method provides a novel view on DE and contributes towards bridging the existing gap between statistical and biological significance. We believe that this approach will simplify the identification of disease causing genes and enhance the discovery of therapeutic targets.
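
The abstract for entry 14 describes ranking genes by their fold change relative to natural dosage variation, but it does not give the formula. One plausible reading, shown here purely as an assumption, is to divide each gene's log2 fold change by the standard deviation of its genetically driven expression in the population. The function names and the example values are hypothetical.

```python
import math

def recalibrated_scores(log2_fold_changes, genetic_variances):
    """Scale each gene's log2 fold change by its population dosage variation.

    log2_fold_changes: dict of gene -> observed log2 fold change
    genetic_variances: dict of gene -> variance of genetically driven
                       expression (log2 scale); an assumed input here
    Genes without a usable variance estimate are skipped.
    """
    scores = {}
    for gene, lfc in log2_fold_changes.items():
        variance = genetic_variances.get(gene)
        if variance is None or variance <= 0:
            continue
        scores[gene] = lfc / math.sqrt(variance)
    return scores

def rank_genes(scores):
    """Order genes by the magnitude of the recalibrated score, largest first."""
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)

# Hypothetical values: GENE_B changes less in absolute terms but is far
# more constrained in the population, so it ranks first.
lfc = {"GENE_A": 2.0, "GENE_B": 0.5}
var = {"GENE_A": 1.0, "GENE_B": 0.01}
ranking = rank_genes(recalibrated_scores(lfc, var))
```

Under this reading, a modest change in a tightly constrained gene can outrank a large change in a gene whose expression varies widely between healthy individuals.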

15.

Multimodal analysis of RNA sequencing data powers discovery of complex trait genetics.

Munro, Daniel; Ehsan, Nava; Esmaeili-Fard, Seyed Mehdi; Gusev, Alexander; Palmer, Abraham A; Mohammadi, Pejman.

bioRxiv ; 2024 May 15.

Article in English | MEDLINE | ID: mdl-38798366

ABSTRACT

Transcriptome data is commonly used to understand genome function via quantitative trait loci (QTL) mapping and to identify the molecular mechanisms driving genome wide association study (GWAS) signals through colocalization analysis and transcriptome-wide association studies (TWAS). While RNA sequencing (RNA-seq) has the potential to reveal many modalities of transcriptional regulation, such as various splicing phenotypes, such studies are often limited to gene expression due to the complexity of extracting and analyzing multiple RNA phenotypes. Here, we present Pantry (Pan-transcriptomic phenotyping), a framework to efficiently generate diverse RNA phenotypes from RNA-seq data and perform downstream integrative analyses with genetic data. Pantry currently generates phenotypes from six modalities of transcriptional regulation (gene expression, isoform ratios, splice junction usage, alternative TSS/polyA usage, and RNA stability) and integrates them with genetic data via QTL mapping, TWAS, and colocalization testing. We applied Pantry to Geuvadis and GTEx data, and found that 4,768 of the genes with no identified expression QTL in Geuvadis had QTLs in at least one other transcriptional modality, resulting in a 66% increase in genes over expression QTL mapping. We further found that QTLs exhibit modality-specific functional properties that are further reinforced by joint analysis of different RNA modalities. We also show that generalizing TWAS to multiple RNA modalities (xTWAS) approximately doubles the discovery of unique gene-trait associations, and enhances identification of regulatory mechanisms underlying GWAS signal in 42% of previously associated gene-trait pairs. We provide the Pantry code, RNA phenotypes from all Geuvadis and GTEx samples, and xQTL and xTWAS results on the web.

16.

Haplotype-aware modeling of cis-regulatory effects highlights the gaps remaining in eQTL data.

Ehsan, Nava; Kotis, Bence M; Castel, Stephane E; Song, Eric J; Mancuso, Nicholas; Mohammadi, Pejman.

Nat Commun ; 15(1): 522, 2024 Jan 15.

Article in English | MEDLINE | ID: mdl-38225224

ABSTRACT

Expression Quantitative Trait Loci (eQTLs) are critical to understanding the mechanisms underlying disease-associated genomic loci. Nearly all protein-coding genes in the human genome have been associated with one or more eQTLs. Here we introduce a multi-variant generalization of allelic Fold Change (aFC), aFC-n, to enable quantification of the cis-regulatory effects in multi-eQTL genes under the assumption that all eQTLs are known and conditionally independent. Applying aFC-n to 458,465 eQTLs in the Genotype-Tissue Expression (GTEx) project data, we demonstrate significant improvements in accuracy over the original model in estimating the eQTL effect sizes and in predicting genetically regulated gene expression over the current tools. We characterize some of the empirical properties of the eQTL data and use this framework to assess the current state of eQTL data in terms of characterizing cis-regulatory landscape in individual genomes. Notably, we show that 77.4% of the genes with an allelic imbalance in a sample show 0.5 log2 fold or more of residual imbalance after accounting for the eQTL data underlining the remaining gap in characterizing regulatory landscape in individual genomes. We further contrast this gap across tissue types, and ancestry backgrounds to identify its correlates and guide future studies.

Subjects

Genomics, Quantitative Trait Loci, Humans, Haplotypes, Quantitative Trait Loci/genetics, Alleles, Genome-Wide Association Study, Single Nucleotide Polymorphism, Gene Expression Profiling
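
As a rough illustration of the multi-variant setting in entry 16, the sketch below predicts haplotype-level expression when several eQTLs act independently: each alternative allele on a haplotype adds its log2 effect. It then compares predicted with observed allelic imbalance to obtain a residual. This additive-in-log-space form is an assumption based on the abstract's "conditionally independent" wording; the function names and numbers are hypothetical.

```python
import math

def haplotype_expression(baseline, log_afcs, alleles):
    """Expression of one haplotype given the alleles it carries.

    baseline: expression of a haplotype with all reference alleles
    log_afcs: per-eQTL log2 effect sizes
    alleles:  0/1 per eQTL (1 = alternative allele on this haplotype)
    Effects are assumed to add on the log2 scale.
    """
    total_log_effect = sum(s for s, a in zip(log_afcs, alleles) if a == 1)
    return baseline * 2.0 ** total_log_effect

def predicted_imbalance(baseline, log_afcs, hap1, hap2):
    """Predicted log2 ratio of haplotype 1 to haplotype 2 expression."""
    h1 = haplotype_expression(baseline, log_afcs, hap1)
    h2 = haplotype_expression(baseline, log_afcs, hap2)
    return math.log2(h1 / h2)

def residual_imbalance(observed_log2_ratio, predicted_log2_ratio):
    """Magnitude of allelic imbalance left unexplained by the known eQTLs."""
    return abs(observed_log2_ratio - predicted_log2_ratio)

# Hypothetical gene with two eQTLs; each haplotype carries one alternative allele.
pred = predicted_imbalance(10.0, [1.0, -1.0], hap1=[1, 0], hap2=[0, 1])
unexplained = residual_imbalance(observed_log2_ratio=2.8, predicted_log2_ratio=pred)
```

A residual at or above 0.5 on the log2 scale corresponds to the threshold quoted in the abstract for imbalance that remains after accounting for known eQTLs.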

17.

Improved multi-ancestry fine-mapping identifies cis-regulatory variants underlying molecular traits and disease risk.

Lu, Zeyun; Wang, Xinran; Carr, Matthew; Kim, Artem; Gazal, Steven; Mohammadi, Pejman; Wu, Lang; Gusev, Alexander; Pirruccello, James; Kachuri, Linda; Mancuso, Nicholas.

medRxiv ; 2024 Apr 16.

Article in English | MEDLINE | ID: mdl-38699369

ABSTRACT

Multi-ancestry statistical fine-mapping of cis-molecular quantitative trait loci (cis-molQTL) aims to improve the precision of distinguishing causal cis-molQTLs from tagging variants. However, existing approaches fail to reflect shared genetic architectures. To solve this limitation, we present the Sum of Shared Single Effects (SuShiE) model, which leverages LD heterogeneity to improve fine-mapping precision, infer cross-ancestry effect size correlations, and estimate ancestry-specific expression prediction weights. We apply SuShiE to mRNA expression measured in PBMCs (n=956) and LCLs (n=814) together with plasma protein levels (n=854) from individuals of diverse ancestries in the TOPMed MESA and GENOA studies. We find SuShiE fine-maps cis-molQTLs for 16% more genes compared with baselines while prioritizing fewer variants with greater functional enrichment. SuShiE infers highly consistent cis-molQTL architectures across ancestries on average; however, we also find evidence of heterogeneity at genes with predicted loss-of-function intolerance, suggesting that environmental interactions may partially explain differences in cis-molQTL effect sizes across ancestries. Lastly, we leverage estimated cis-molQTL effect-sizes to perform individual-level TWAS and PWAS on six white blood cell-related traits in AOU Biobank individuals (n=86k), and identify 44 more genes compared with baselines, further highlighting its benefits in identifying genes relevant for complex disease risk. Overall, SuShiE provides new insights into the cis-genetic architecture of molecular traits.

18.

A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats.

de Jong, Tristan V; Pan, Yanchao; Rastas, Pasi; Munro, Daniel; Tutaj, Monika; Akil, Huda; Benner, Chris; Chen, Denghui; Chitre, Apurva S; Chow, William; Colonna, Vincenza; Dalgard, Clifton L; Demos, Wendy M; Doris, Peter A; Garrison, Erik; Geurts, Aron M; Gunturkun, Hakan M; Guryev, Victor; Hourlier, Thibaut; Howe, Kerstin; Huang, Jun; Kalbfleisch, Ted; Kim, Panjun; Li, Ling; Mahaffey, Spencer; Martin, Fergal J; Mohammadi, Pejman; Ozel, Ayse Bilge; Polesskaya, Oksana; Pravenec, Michal; Prins, Pjotr; Sebat, Jonathan; Smith, Jennifer R; Solberg Woods, Leah C; Tabakoff, Boris; Tracey, Alan; Uliano-Silva, Marcela; Villani, Flavia; Wang, Hongyang; Sharp, Burt M; Telese, Francesca; Jiang, Zhihua; Saba, Laura; Wang, Xusheng; Murphy, Terence D; Palmer, Abraham A; Kwitek, Anne E; Dwinell, Melinda R; Williams, Robert W; Li, Jun Z.

Cell Genom ; 4(4): 100527, 2024 Apr 10.

Article in English | MEDLINE | ID: mdl-38537634

ABSTRACT

The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomic datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ~20.0 million sequence variations, of which 18,700 are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome, providing researchers with an expanded resource for studies involving rats.

Subjects

Genome, Genomics, Rats, Animals, Genome/genetics, Molecular Sequence Annotation, Whole Genome Sequencing, Genetic Variation/genetics

19.

Genome-Wide Association Study of Chronic Dizziness in the Elderly Identifies Loci Implicating MLLT10, BPTF, LINC01224, and ROS1.

Clifford, Royce; Munro, Daniel; Dochtermann, Daniel; Devineni, Poornima; Pyarajan, Saiju; Telese, Francesca; Palmer, Abraham A; Mohammadi, Pejman; Friedman, Rick.

J Assoc Res Otolaryngol ; 24(6): 575-591, 2023 Dec.

Article in English | MEDLINE | ID: mdl-38036714

ABSTRACT

PURPOSE: Chronic age-related imbalance is a common cause of falls and subsequent death in the elderly and can arise from dysfunction of the vestibular system, an elegant neuroanatomical group of pathways that mediates human perception of acceleration, gravity, and angular head motion. Studies indicate that 27-46% of the risk of age-related chronic imbalance is genetic; nevertheless, the underlying genes remain unknown. METHODS: The cohort consisted of 50,339 cases and 366,900 controls in the Million Veteran Program. Cases were defined by two ICD diagnoses of vertigo or dizziness at least 6 months apart, excluding acute or recurrent vertiginous syndromes and other non-vestibular disorders. Genome-wide association studies were performed as individual logistic regressions on European, African American, and Hispanic ancestries, followed by trans-ancestry meta-analysis. Downstream analysis included case-case GWAS, fine mapping, probabilistic colocalization of significant variants and genes with eQTLs, and functional analysis of significant hits. RESULTS: Two significant loci were identified in Europeans, another in the Hispanic population, and two additional in trans-ancestry meta-analysis, including three novel loci. Fine mapping revealed credible sets of intronic single nucleotide polymorphisms (SNPs) in MLLT10 - a histone methyltransferase cofactor, BPTF - a subunit of a nucleosome remodeling complex implicated in neurodevelopment, LINC01224, and ROS1 - a proto-oncogene receptor tyrosine kinase. CONCLUSION: Despite the difficulties of phenotyping the nature of chronic imbalance, we replicated two loci from previous vertigo GWAS and identified three novel loci. Findings suggest candidates for further study and ultimate treatment of this common disorder of the elderly.

Subjects

Genome-Wide Association Study, Protein-Tyrosine Kinases, Humans, Aged, Protein-Tyrosine Kinases/genetics, Dizziness/genetics, Proto-Oncogene Proteins/genetics, Vertigo, Polymorphism, Single Nucleotide, Genetic Predisposition to Disease, Transcription Factors/genetics

20.

RNA-seq data science: From raw data to effective interpretation.

Deshpande, Dhrithi; Chhugani, Karishma; Chang, Yutong; Karlsberg, Aaron; Loeffler, Caitlin; Zhang, Jinyang; Muszynska, Agata; Munteanu, Viorel; Yang, Harry; Rotman, Jeremy; Tao, Laura; Balliu, Brunilda; Tseng, Elizabeth; Eskin, Eleazar; Zhao, Fangqing; Mohammadi, Pejman; Labaj, Pawel P; Mangul, Serghei.

Front Genet ; 14: 997383, 2023.

Article in English | MEDLINE | ID: mdl-36999049

ABSTRACT

RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
