Results 1 - 20 of 100
1.
Nat Rev Genet ; 2024 Oct 02.
Article in English | MEDLINE | ID: mdl-39358547

ABSTRACT

Since the discovery of RNA splicing and its role in gene expression, researchers have sought a set of rules, an algorithm or a computational model that could predict the splice isoforms, and their frequencies, produced from any transcribed gene in a specific cellular context. Over the past 30 years, these models have evolved from simple position weight matrices to deep-learning models capable of integrating sequence data across vast genomic distances. Most recently, new model architectures are moving the field closer to context-specific alternative splicing predictions, and advances in sequencing technologies are expanding the type of data that can be used to inform and interpret such models. Together, these developments are driving improved understanding of splicing regulatory mechanisms and emerging applications of the splicing code to the rational design of RNA- and splicing-based therapeutics.

2.
Cell ; 155(5): 1075-87, 2013 Nov 21.
Article in English | MEDLINE | ID: mdl-24210918

ABSTRACT

Pervasive transcription of eukaryotic genomes stems to a large extent from bidirectional promoters that synthesize mRNA and divergent noncoding RNA (ncRNA). Here, we show that ncRNA transcription in the yeast S. cerevisiae is globally restricted by early termination that relies on the essential RNA-binding factor Nrd1. Depletion of Nrd1 from the nucleus results in 1,526 Nrd1-unterminated transcripts (NUTs) that originate from nucleosome-depleted regions (NDRs) and can deregulate mRNA synthesis by antisense repression and transcription interference. Transcriptome-wide Nrd1-binding maps reveal divergent NUTs at most promoters and antisense NUTs in most 3' regions of genes. Nrd1 and its partner Nab3 preferentially bind RNA motifs that are depleted in mRNAs and enriched in ncRNAs and some mRNAs whose synthesis is controlled by transcription attenuation. These results define a global mechanism for transcriptome surveillance that selectively terminates ncRNA synthesis to provide promoter directionality and to suppress antisense transcription.


Subjects
Fungal RNA/genetics , Untranslated RNA/genetics , RNA-Binding Proteins/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Genetic Transcription Termination , Transcriptome , Down-Regulation , Nuclear Proteins/metabolism , Genetic Promoter Regions , Antisense RNA/metabolism , Saccharomyces cerevisiae/genetics
3.
Nat Methods ; 21(1): 28-31, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38049697

ABSTRACT

Single-cell ATAC sequencing coverage in regulatory regions is typically binarized as an indicator of open chromatin. Here we show that binarization is an unnecessary step that neither improves goodness of fit, clustering, cell type identification nor batch integration. Fragment counts, but not read counts, should instead be modeled, which preserves quantitative regulatory information. These results have immediate implications for single-cell ATAC sequencing analysis.


Subjects
Chromatin Immunoprecipitation Sequencing , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Chromatin/genetics , Single-Cell Analysis
4.
Am J Hum Genet ; 110(12): 2056-2067, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38006880

ABSTRACT

Detection of aberrantly spliced genes is an important step in RNA-seq-based rare-disease diagnostics. We recently developed FRASER, a denoising autoencoder-based method that outperformed alternative methods of detecting aberrant splicing. However, because FRASER's three splice metrics are partially redundant and tend to be sensitive to sequencing depth, we introduce here a more robust intron-excision metric, the intron Jaccard index, that combines the alternative donor, alternative acceptor, and intron-retention signal into a single value. Moreover, we optimized model parameters and filter cutoffs by using candidate rare-splice-disrupting variants as independent evidence. On 16,213 GTEx samples, our improved algorithm, FRASER 2.0, called typically 10 times fewer splicing outliers while increasing the proportion of candidate rare-splice-disrupting variants by 10-fold and substantially decreasing the effect of sequencing depth on the number of reported outliers. To lower the multiple-testing correction burden, we introduce an option to select the genes to be tested for each sample instead of a transcriptome-wide approach. This option can be particularly useful when prior information, such as candidate variants or genes, is available. Application on 303 rare-disease samples confirmed the relative reduction in the number of outlier calls for a slight loss of sensitivity; FRASER 2.0 recovered 22 out of 26 previously identified pathogenic splicing cases with default cutoffs and 24 when multiple-testing correction was limited to OMIM genes containing rare variants. Altogether, these methodological improvements contribute to more effective RNA-seq-based rare-disease diagnostics by drastically reducing the number of splicing outlier calls per sample at minimal loss of sensitivity.


Subjects
Alternative Splicing , RNA Splicing , Humans , Alternative Splicing/genetics , Introns/genetics , RNA Splicing/genetics , RNA-Seq , Algorithms
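
The abstract of entry 4 describes the intron Jaccard index as a single value combining alternative donor, alternative acceptor, and intron-retention signal. The sketch below is one plausible reading of that description, not the FRASER 2.0 implementation: split reads supporting the intron are divided by the union of all reads (split and non-split) that touch its donor or its acceptor. The function name, the input dictionaries, and the exact denominator are assumptions made for illustration.

```python
import math


def intron_jaccard(split_counts, nonsplit_counts, donor, acceptor):
    """Jaccard-style index for the intron (donor, acceptor).

    split_counts:    dict mapping (donor, acceptor) -> split-read count
    nonsplit_counts: dict mapping splice-site position -> non-split read count
    Both input layouts are hypothetical, chosen only for this sketch.
    """
    intron_reads = split_counts.get((donor, acceptor), 0)

    # Reads using this donor: every split read leaving it (alternative
    # acceptors included) plus non-split reads covering it (retention signal).
    donor_reads = sum(n for (d, _), n in split_counts.items() if d == donor)
    donor_reads += nonsplit_counts.get(donor, 0)

    # Same for the acceptor side (alternative donors included).
    acceptor_reads = sum(n for (_, a), n in split_counts.items() if a == acceptor)
    acceptor_reads += nonsplit_counts.get(acceptor, 0)

    # Union of the two read sets; the intron's own reads are in both.
    union = donor_reads + acceptor_reads - intron_reads
    return intron_reads / union if union > 0 else math.nan


# Toy counts (invented): one main intron, one alternative acceptor,
# one alternative donor, and a few non-split reads at each site.
split = {(100, 200): 30, (100, 250): 10, (150, 200): 5}
nonsplit = {100: 3, 200: 2}
print(intron_jaccard(split, nonsplit, 100, 200))
```

A value near 1 means almost every read at either splice site supports this intron; alternative splice-site usage or retention lowers it.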
5.
Mol Syst Biol ; 20(5): 506-520, 2024 May.
Article in English | MEDLINE | ID: mdl-38491213

ABSTRACT

Codon optimality is a major determinant of mRNA translation and degradation rates. However, whether and through which mechanisms its effects are regulated remains poorly understood. Here we show that codon optimality associates with up to 2-fold change in mRNA stability variations between human tissues, and that its effect is attenuated in tissues with high energy metabolism and amplifies with age. Mathematical modeling and perturbation data through oxygen deprivation and ATP synthesis inhibition reveal that cellular energy variations non-uniformly alter the effect of codon usage. This new mode of codon effect regulation, independent of tRNA regulation, provides a fundamental mechanistic link between cellular energy metabolism and eukaryotic gene expression.


Subjects
Codon , Energy Metabolism , RNA Stability , Messenger RNA , Humans , Energy Metabolism/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , Codon/genetics , Codon Usage , Protein Biosynthesis , Transfer RNA/genetics , Transfer RNA/metabolism , Adenosine Triphosphate/metabolism , Gene Expression Regulation
6.
Nat Rev Genet ; 20(7): 389-403, 2019 Jul.
Article in English | MEDLINE | ID: mdl-30971806

ABSTRACT

As a data-driven science, genomics largely utilizes machine learning to capture dependencies in data and derive novel biological hypotheses. However, the ability to extract new insights from the exponentially increasing volume of genomics data requires more expressive machine learning models. By effectively leveraging large data sets, deep learning has transformed fields such as computer vision and natural language processing. Now, it is becoming the method of choice for many genomics modelling tasks, including predicting the impact of genetic variation on gene regulatory mechanisms such as DNA accessibility and splicing.


Subjects
Deep Learning , Genomics/methods , Genetic Models , Computer Neural Networks , Base Sequence , Computer Simulation , Humans , Supervised Machine Learning , Unsupervised Machine Learning
7.
Nucleic Acids Res ; 51(4): e21, 2023 Feb 28.
Article in English | MEDLINE | ID: mdl-36617985

ABSTRACT

Transposon screens are powerful in vivo assays used to identify loci driving carcinogenesis. These loci are identified as Common Insertion Sites (CISs), i.e. regions with more transposon insertions than expected by chance. However, the identification of CISs is affected by biases in the insertion behaviour of transposon systems. Here, we introduce Transmicron, a novel method that differs from previous methods by (i) modelling neutral insertion rates based on chromatin accessibility, transcriptional activity and sequence context and (ii) estimating oncogenic selection for each genomic region using Poisson regression to model insertion counts while controlling for neutral insertion rates. To assess the benefits of our approach, we generated a dataset applying two different transposon systems under comparable conditions. Benchmarking for enrichment of known cancer genes showed improved performance of Transmicron against state-of-the-art methods. Modelling neutral insertion rates allowed for better control of false positives and stronger agreement of the results between transposon systems. Moreover, using Poisson regression to consider intra-sample and inter-sample information proved beneficial in small and moderately-sized datasets. Transmicron is open-source and freely available. Overall, this study contributes to the understanding of transposon biology and introduces a novel approach to use this knowledge for discovering cancer driver genes.


Subjects
DNA Transposable Elements , Neoplasms , Software , Humans , Base Sequence , Carcinogenesis , Insertional Mutagenesis , Oncogenes , Neoplasms/genetics
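
Entry 7 identifies Common Insertion Sites as regions with more insertions than expected by chance, controlling for a neutral insertion rate. The sketch below is not Transmicron's Poisson regression; it is a much simpler per-region Poisson enrichment test that illustrates the same idea of comparing observed counts with a neutral expectation. The function names and the region values are hypothetical.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for i in range(1, observed):
        term *= expected / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def log_enrichment(observed, expected):
    """Log ratio of observed to neutrally expected insertions.

    In an intercept-only Poisson model with log(expected) as offset,
    this ratio is the maximum-likelihood estimate of the intercept.
    """
    if observed == 0:
        return -math.inf
    return math.log(observed / expected)


# Invented example: (region name, observed insertions, neutral expectation).
regions = [("region_A", 12, 2.0), ("region_B", 3, 2.5)]
for name, obs, exp in regions:
    print(name, round(log_enrichment(obs, exp), 3), poisson_upper_tail(obs, exp))
```

In a real analysis the neutral expectation would come from a fitted model of insertion bias, and the p-values would need multiple-testing correction across regions.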
8.
Bioinformatics ; 39(2), 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36708003

ABSTRACT

MOTIVATION: Identifying regulatory regions in the genome is of great interest for understanding the epigenomic landscape in cells. One fundamental challenge in this context is to find the target genes whose expression is affected by the regulatory regions. A recent successful method is the Activity-By-Contact (ABC) model which scores enhancer-gene interactions based on enhancer activity and the contact frequency of an enhancer to its target gene. However, it describes regulatory interactions entirely from a gene's perspective, and does not account for all the candidate target genes of an enhancer. In addition, the ABC model requires two types of assays to measure enhancer activity, which limits the applicability. Moreover, there is neither implementation available that could allow for an integration with transcription factor (TF) binding information nor an efficient analysis of single-cell data. RESULTS: We demonstrate that the ABC score can yield a higher accuracy by adapting the enhancer activity according to the number of contacts the enhancer has to its candidate target genes and also by considering all annotated transcription start sites of a gene. Further, we show that the model is comparably accurate with only one assay to measure enhancer activity. We combined our generalized ABC model with TF binding information and illustrated an analysis of a single-cell ATAC-seq dataset of the human heart, where we were able to characterize cell type-specific regulatory interactions and predict gene expression based on TF affinities. All executed processing steps are incorporated into our new computational pipeline STARE. AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/schulzlab/STARE. CONTACT: marcel.schulz@em.uni-frankfurt.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Expression Regulation , Transcription Factors , Humans , Transcription Factors/metabolism , Nucleic Acid Regulatory Sequences , Software , Protein Binding
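
Entry 8 says the Activity-By-Contact (ABC) model scores enhancer-gene interactions from enhancer activity and contact frequency. A minimal sketch of the classic score, as commonly formulated, is the activity-contact product of one enhancer divided by the sum of those products over all candidate enhancers of the gene. This is not the STARE code and does not include the generalized adjustments the abstract describes; the function name and input layout are assumptions.

```python
def abc_scores(candidates):
    """Classic ABC scores for the candidate enhancers of a single gene.

    candidates: list of (enhancer_id, activity, contact) tuples, where
    activity and contact are non-negative numbers (hypothetical layout).
    Returns a dict mapping enhancer_id -> score; scores sum to 1 unless
    every product is zero.
    """
    products = {eid: activity * contact for eid, activity, contact in candidates}
    total = sum(products.values())
    if total == 0:
        return {eid: 0.0 for eid in products}
    return {eid: p / total for eid, p in products.items()}


# Invented example: a strong but distal enhancer vs. a weaker, closer one.
scores = abc_scores([("e1", 10.0, 0.2), ("e2", 5.0, 0.6)])
print(scores)
```

Because the score is normalized per gene, it ranks enhancers from the gene's perspective only, which is the limitation the abstract points out.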
9.
Mol Genet Metab ; 142(3): 108511, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38878498

ABSTRACT

The diagnosis of Mendelian disorders has notably advanced with integration of whole exome and genome sequencing (WES and WGS) in clinical practice. However, challenges in variant interpretation and uncovered variants by WES still leave a substantial percentage of patients undiagnosed. In this context, integrating RNA sequencing (RNA-seq) improves diagnostic workflows, particularly for WES inconclusive cases. Additionally, functional studies are often necessary to elucidate the impact of prioritized variants on gene expression and protein function. Our study focused on three unrelated male patients (P1-P3) with ATP6AP1-CDG (congenital disorder of glycosylation), presenting with intellectual disability and varying degrees of hepatopathy, glycosylation defects, and an initially inconclusive diagnosis through WES. Subsequent RNA-seq was pivotal in identifying the underlying genetic causes in P1 and P2, detecting ATP6AP1 underexpression and aberrant splicing. Molecular studies in fibroblasts confirmed these findings and identified the rare intronic variants c.289-233C > T and c.289-289G > A in P1 and P2, respectively. Trio-WGS also revealed the variant c.289-289G > A in P3, which was a de novo change in both patients. Functional assays expressing the mutant alleles in HAP1 cells demonstrated the pathogenic impact of these variants by reproducing the splicing alterations observed in patients. Our study underscores the role of RNA-seq and WGS in enhancing diagnostic rates for genetic diseases such as CDG, providing new insights into ATP6AP1-CDG molecular bases by identifying the first two deep intronic variants in this X-linked gene. Additionally, our study highlights the need to integrate RNA-seq and WGS, followed by functional validation, in routine diagnostics for a comprehensive evaluation of patients with an unidentified molecular etiology.


Subjects
Introns , Messenger RNA , Humans , Male , Introns/genetics , Messenger RNA/genetics , Vacuolar Proton-Translocating ATPases/genetics , Congenital Disorders of Glycosylation/genetics , Congenital Disorders of Glycosylation/diagnosis , Congenital Disorders of Glycosylation/pathology , Mutation , Whole Genome Sequencing , Exome Sequencing , RNA Sequence Analysis , Intellectual Disability/genetics , Intellectual Disability/diagnosis , Intellectual Disability/pathology , Child , RNA Splicing/genetics , Preschool Child
10.
Int J Mol Sci ; 25(14), 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39063034

ABSTRACT

Duchenne and Becker muscular dystrophies, caused by pathogenic variants in DMD, are the most common inherited neuromuscular conditions in childhood. These diseases follow an X-linked recessive inheritance pattern, and mainly males are affected. The most prevalent pathogenic variants in the DMD gene are copy number variants (CNVs), and most patients achieve their genetic diagnosis through Multiplex Ligation-dependent Probe Amplification (MLPA) or exome sequencing. Here, we investigated a female patient presenting with muscular dystrophy who remained genetically undiagnosed after MLPA and exome sequencing. RNA sequencing (RNAseq) from the patient's muscle biopsy identified an 85% reduction in DMD expression compared to 116 muscle samples included in the cohort. A de novo balanced translocation between chromosome 17 and the X chromosome (t(X;17)(p21.1;q23.2)) disrupting the DMD and BCAS3 genes was identified through trio whole genome sequencing (WGS). The combined analysis of RNAseq and WGS played a crucial role in the detection and characterisation of the disease-causing variant in this patient, who had been undiagnosed for over two decades. This case illustrates the diagnostic odyssey of female DMD patients with complex structural variants that are not detected by current panel or exome sequencing analysis.


Subjects
Human X Chromosomes , Dystrophin , Genomics , Duchenne Muscular Dystrophy , Genetic Translocation , Humans , Duchenne Muscular Dystrophy/genetics , Duchenne Muscular Dystrophy/diagnosis , Female , Dystrophin/genetics , Human X Chromosomes/genetics , Genomics/methods , DNA Copy Number Variations , Exome Sequencing , Transcriptome/genetics , Human Chromosomes Pair 17/genetics
11.
Basic Res Cardiol ; 117(1): 6, 2022 Feb 17.
Article in English | MEDLINE | ID: mdl-35175464

ABSTRACT

The majority of risk loci identified by genome-wide association studies (GWAS) are in non-coding regions, hampering their functional interpretation. Instead, transcriptome-wide association studies (TWAS) identify gene-trait associations, which can be used to prioritize candidate genes in disease-relevant tissue(s). Here, we aimed to systematically identify susceptibility genes for coronary artery disease (CAD) by TWAS. We trained prediction models of nine CAD-relevant tissues using EpiXcan based on two genetics-of-gene-expression panels, the Stockholm-Tartu Atherosclerosis Reverse Network Engineering Task (STARNET) and the Genotype-Tissue Expression (GTEx). Based on these prediction models, we imputed gene expression of respective tissues from individual-level genotype data on 37,997 CAD cases and 42,854 controls for the subsequent gene-trait association analysis. Transcriptome-wide significant association (i.e. P < 3.85e-6) was observed for 114 genes. Of these, 96 resided within previously identified GWAS risk loci and 18 were novel. Stepwise analyses were performed to study their plausibility, biological function, and pathogenicity in CAD, including analyses for colocalization, damaging mutations, pathway enrichment, phenome-wide associations with human data and expression-traits correlations using mouse data. Finally, CRISPR/Cas9-based gene knockdown of two newly identified TWAS genes, RGS19 and KPTN, in a human hepatocyte cell line resulted in reduced secretion of APOB100 and lipids in the cell culture medium. Our CAD TWAS work (i) prioritized candidate causal genes at known GWAS loci, (ii) identified 18 novel genes to be associated with CAD, and (iii) suggested potential tissues and pathways of action for these TWAS CAD genes.


Subjects
Coronary Artery Disease , Genome-Wide Association Study , Animals , Coronary Artery Disease/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Mice , Single Nucleotide Polymorphism , Transcriptome
12.
PLoS Comput Biol ; 17(5): e1008982, 2021 May.
Article in English | MEDLINE | ID: mdl-33970899

ABSTRACT

The 5' untranslated region plays a key role in regulating mRNA translation and consequently protein abundance. Therefore, accurate modeling of 5'UTR regulatory sequences should provide insights into translational control mechanisms and help interpret genetic variants. Recently, a model was trained on a massively parallel reporter assay to predict mean ribosome load (MRL), a proxy for translation rate, directly from 5'UTR sequence with a high degree of accuracy. However, this model is restricted to sequence lengths investigated in the reporter assay and therefore cannot be applied to the majority of human sequences without a substantial loss of information. Here, we introduced frame pooling, a novel neural network operation that enabled the development of an MRL prediction model for 5'UTRs of any length. Our model shows state-of-the-art performance on fixed length randomized sequences, while offering better generalization performance on longer sequences and on a variety of translation-related genome-wide datasets. Variant interpretation is demonstrated on a 5'UTR variant of the gene HBB associated with beta-thalassemia. Frame pooling could find applications in other bioinformatics predictive tasks. Moreover, our model, released open source, could help pinpoint pathogenic genetic variants.


Subjects
5' Untranslated Regions , Deep Learning , Ribosomes/metabolism , Humans , Messenger RNA/genetics
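
Entry 12 introduces frame pooling as an operation that lets a model handle 5'UTRs of any length. The abstract does not define the operation, so the sketch below is only an assumed reading: per-position feature values are grouped by reading frame, counted from the 3' end of the UTR (where the start codon would follow), and each frame is summarized by its maximum and its mean. This gives a fixed-length output for any input length. The function name and the choice of max and mean pooling are assumptions.

```python
def frame_pool(values):
    """Pool a variable-length sequence of per-position values by frame.

    Positions are assigned to frames 0, 1, 2 by their distance from the
    3' end (last element = frame 0). Returns six numbers:
    [max_f0, mean_f0, max_f1, mean_f1, max_f2, mean_f2].
    Frames with no positions contribute zeros.
    """
    frames = [[], [], []]
    n = len(values)
    for i, v in enumerate(values):
        frames[(n - 1 - i) % 3].append(v)

    pooled = []
    for frame in frames:
        if frame:
            pooled.extend([max(frame), sum(frame) / len(frame)])
        else:
            pooled.extend([0.0, 0.0])
    return pooled


# Inputs of different lengths map to outputs of the same length.
print(frame_pool([1, 2, 3, 4, 5, 6, 7]))
print(frame_pool([0.5, 0.1]))
```

In a neural network the same grouping would be applied to each convolutional channel, so features such as upstream start codons keep their frame information after pooling.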
13.
Int J Mol Sci ; 23(20), 2022 Oct 15.
Article in English | MEDLINE | ID: mdl-36293220

ABSTRACT

Peroxisomal biogenesis disorders (PBDs) are a heterogeneous group of genetic diseases. Multiple peroxisomal pathways are impaired, and very long chain fatty acids (VLCFA) are the first-line biomarkers for the diagnosis. The clinical presentation of PBDs may range from severe, lethal multisystemic disorders to milder, late-onset disease. The vast majority of PBDs belong to the Zellweger Spectrum Disorders (ZSDs) and represent a continuum of overlapping clinical symptoms, with Zellweger syndrome being the most severe and Heimler syndrome the least severe disease. Mild clinical conditions frequently present with normal or only slightly altered biochemical findings, making the diagnosis of these patients challenging. In the present study we used a combined WES and RNA-seq strategy to diagnose a patient presenting with retinal dystrophy as the main clinical symptom. Results showed the patient was compound heterozygous for mutations in PEX1. VLCFA were normal, but retrospective analysis showed that lysophosphatidylcholines (LPC) containing C22:0-C26:0 species were altered. This simple test could avoid the diagnostic odyssey of patients with a mild phenotype, such as the individual described here, who was diagnosed very late in adult life. We provide functional data in cell line models that may explain the mild phenotype of the patient by demonstrating the hypomorphic nature of a deep intronic variant altering PEX1 mRNA processing.


Subjects
Deafness , Sensorineural Hearing Loss , Zellweger Syndrome , Humans , ATPases Associated with Diverse Cellular Activities/metabolism , RNA-Seq , Retrospective Studies , Membrane Proteins/genetics , Membrane Proteins/metabolism , Zellweger Syndrome/diagnosis , Zellweger Syndrome/genetics , Sensorineural Hearing Loss/genetics , Biomarkers , Messenger RNA , Fatty Acids
14.
Am J Hum Genet ; 103(6): 907-917, 2018 Dec 06.
Article in English | MEDLINE | ID: mdl-30503520

ABSTRACT

RNA sequencing (RNA-seq) is gaining popularity as a complementary assay to genome sequencing for precisely identifying the molecular causes of rare disorders. A powerful approach is to identify aberrant gene expression levels as potential pathogenic events. However, existing methods for detecting aberrant read counts in RNA-seq data either lack assessments of statistical significance, so that establishing cutoffs is arbitrary, or rely on subjective manual corrections for confounders. Here, we describe OUTRIDER (Outlier in RNA-Seq Finder), an algorithm developed to address these issues. The algorithm uses an autoencoder to model read-count expectations according to the gene covariation resulting from technical, environmental, or common genetic variations. Given these expectations, the RNA-seq read counts are assumed to follow a negative binomial distribution with a gene-specific dispersion. Outliers are then identified as read counts that significantly deviate from this distribution. The model is automatically fitted to achieve the best recall of artificially corrupted data. Precision-recall analyses using simulated outlier read counts demonstrated the importance of controlling for covariation and significance-based thresholds. OUTRIDER is open source and includes functions for filtering out genes not expressed in a dataset, for identifying outlier samples with too many aberrantly expressed genes, and for detecting aberrant gene expression on the basis of false-discovery-rate-adjusted p values. Overall, OUTRIDER provides an end-to-end solution for identifying aberrantly expressed genes and is suitable for use by rare-disease diagnostic platforms.


Subjects
Gene Expression/genetics , Genetic Variation/genetics , RNA/metabolism , RNA Sequence Analysis/methods , Algorithms , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Humans
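
Entry 14 states that read counts are assumed to follow a negative binomial distribution with a gene-specific dispersion, and that outliers are counts deviating significantly from it. The sketch below illustrates that final step only, under stated assumptions: the expected count `mu` is taken as given (in OUTRIDER it would come from the autoencoder), the dispersion `theta` uses the parameterization with variance mu + mu^2/theta, and the two-sided p-value formula is a common convention rather than a quotation of the package's code. All names are hypothetical.

```python
import math


def nb_pmf(k, mu, theta):
    """Negative binomial P(X = k) with mean mu > 0 and dispersion theta > 0."""
    log_p = (
        math.lgamma(k + theta) - math.lgamma(k + 1) - math.lgamma(theta)
        + theta * math.log(theta / (theta + mu))
        + k * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def nb_two_sided_pvalue(k, mu, theta):
    """Two-sided p-value: twice the smaller tail, capped at 1."""
    lower = sum(nb_pmf(i, mu, theta) for i in range(k + 1))  # P(X <= k)
    upper = 1.0 - (lower - nb_pmf(k, mu, theta))             # P(X >= k)
    return 2.0 * min(0.5, lower, upper)


# Invented example: expected 100 reads, observed 5 or 100, dispersion 20.
print(nb_two_sided_pvalue(5, mu=100.0, theta=20.0))
print(nb_two_sided_pvalue(100, mu=100.0, theta=20.0))
```

The p-values would then be adjusted for the false discovery rate across genes, as the abstract notes, before calling a gene aberrantly expressed.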
15.
Am J Hum Genet ; 100(1): 151-159, 2017 Jan 05.
Article in English | MEDLINE | ID: mdl-27989324

ABSTRACT

MDH2 encodes mitochondrial malate dehydrogenase (MDH), which is essential for the conversion of malate to oxaloacetate as part of the proper functioning of the Krebs cycle. We report bi-allelic pathogenic mutations in MDH2 in three unrelated subjects presenting with early-onset generalized hypotonia, psychomotor delay, refractory epilepsy, and elevated lactate in the blood and cerebrospinal fluid. Functional studies in fibroblasts from affected subjects showed both an apparently complete loss of MDH2 levels and MDH2 enzymatic activity close to null. Metabolomics analyses demonstrated a significant concomitant accumulation of the MDH substrate, malate, and fumarate, its immediate precursor in the Krebs cycle, in affected subjects' fibroblasts. Lentiviral complementation with wild-type MDH2 cDNA restored MDH2 levels and mitochondrial MDH activity. Additionally, introduction of the three missense mutations from the affected subjects into Saccharomyces cerevisiae provided functional evidence to support their pathogenicity. Disruption of the Krebs cycle is a hallmark of cancer, and MDH2 has been recently identified as a novel pheochromocytoma and paraganglioma susceptibility gene. We show that loss-of-function mutations in MDH2 are also associated with severe neurological clinical presentations in children.


Subjects
Brain Diseases/genetics , Citric Acid Cycle , Malate Dehydrogenase/genetics , Mutation , Age of Onset , Alleles , Amino Acid Sequence , Child , Preschool Child , Citric Acid Cycle/genetics , Fibroblasts/enzymology , Fibroblasts/metabolism , Fumarates/metabolism , Genetic Complementation Test , Humans , Infant , Newborn Infant , Malate Dehydrogenase/chemistry , Malate Dehydrogenase/metabolism , Malates/metabolism , Male , Metabolomics , Molecular Models
16.
Mol Syst Biol ; 15(2): e8513, 2019 Feb 18.
Article in English | MEDLINE | ID: mdl-30777893

ABSTRACT

Despite their importance in determining protein abundance, a comprehensive catalogue of sequence features controlling protein-to-mRNA (PTR) ratios and a quantification of their effects are still lacking. Here, we quantified PTR ratios for 11,575 proteins across 29 human tissues using matched transcriptomes and proteomes. We estimated by regression the contribution of known sequence determinants of protein synthesis and degradation in addition to 45 mRNA and 3 protein sequence motifs that we found by association testing. While PTR ratios span more than 2 orders of magnitude, our integrative model predicts PTR ratios at a median precision of 3.2-fold. A reporter assay provided functional support for two novel UTR motifs, and an immobilized mRNA affinity competition-binding assay identified motif-specific bound proteins for one motif. Moreover, our integrative model led to a new metric of codon optimality that captures the effects of codon frequency on protein synthesis and degradation. Altogether, this study shows that a large fraction of PTR ratio variation in human tissues can be predicted from sequence, and it identifies many new candidate post-transcriptional regulatory elements.


Subjects
Proteins/genetics , Proteome/genetics , Tissue Distribution/genetics , Transcriptome/genetics , Gene Expression Regulation/genetics , Human Genome/genetics , Humans , Mass Spectrometry/methods , Proteomics/methods , Messenger RNA/genetics , RNA Sequence Analysis/methods
17.
Mol Syst Biol ; 15(2): e8503, 2019 Feb 18.
Article in English | MEDLINE | ID: mdl-30777892

ABSTRACT

Genome-, transcriptome- and proteome-wide measurements provide insights into how biological systems are regulated. However, fundamental aspects relating to which human proteins exist, where they are expressed and in which quantities are not fully understood. Therefore, we generated a quantitative proteome and transcriptome abundance atlas of 29 paired healthy human tissues from the Human Protein Atlas project representing human genes by 18,072 transcripts and 13,640 proteins including 37 without prior protein-level evidence. The analysis revealed that hundreds of proteins, particularly in testis, could not be detected even for highly expressed mRNAs, that few proteins show tissue-specific expression, that strong differences between mRNA and protein quantities within and across tissues exist and that protein expression is often more stable across tissues than that of transcripts. Only 238 of 9,848 amino acid variants found by exome sequencing could be confidently detected at the protein level showing that proteogenomics remains challenging, needs better computational methods and requires rigorous validation. Many uses of this resource can be envisaged including the study of gene/protein expression regulation and biomarker specificity evaluation.


Subjects
Human Genome/genetics , Proteome/genetics , Tissue Distribution/genetics , Transcriptome/genetics , Gene Expression Regulation/genetics , Humans , Mass Spectrometry/methods , Proteomics/methods , Messenger RNA/genetics , RNA Sequence Analysis/methods
18.
Hum Mutat ; 40(9): 1243-1251, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31070280

ABSTRACT

Pathogenic genetic variants often primarily affect splicing. However, it remains difficult to quantitatively predict whether and how genetic variants affect splicing. In 2018, the fifth edition of the Critical Assessment of Genome Interpretation proposed two splicing prediction challenges based on experimental perturbation assays: Vex-seq, assessing exon skipping, and MaPSy, assessing splicing efficiency. We developed a modular modeling framework, MMSplice, the performance of which was among the best on both challenges. Here we provide insights into the modeling assumptions of MMSplice and its individual modules. We furthermore illustrate how MMSplice can be applied in practice for individual genome interpretation, using the MMSplice VEP plugin and the Kipoi variant interpretation plugin, which are directly applicable to VCF files.


Subjects
Computational Biology/methods , Genetic Variation , RNA Splicing , Congresses as Topic , Exons , Genetic Predisposition to Disease , Humans , Introns , Genetic Models , Software
19.
Hum Mutat ; 40(9): 1215-1224, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31301154

ABSTRACT

Precision medicine and sequence-based clinical diagnostics seek to predict disease risk or to identify causative variants from sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype-phenotype prediction challenges; participants build models, undergo assessment, and share key findings. In the past, few CAGI challenges have addressed the impact of sequence variants on splicing. In CAGI5, two challenges (Vex-seq and MaPSy) involved prediction of the effect of variants, primarily single-nucleotide changes, on splicing. Although there are significant differences between these two challenges, both involved prediction of results from high-throughput exon inclusion assays. Here, we discuss the methods used to predict the impact of these variants on splicing, their performance, strengths, and weaknesses, and prospects for predicting the impact of sequence variation on splicing and disease phenotypes.


Subjects
Alternative Splicing , Computational Biology/methods , Mutation , Proteins/genetics , Animals , Congresses as Topic , Genetic Fitness , Humans , Genetic Models , Nucleic Acid Sequence Homology
20.
RNA ; 23(11): 1648-1659, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28802259

ABSTRACT

The stability of mRNA is one of the major determinants of gene expression. Although a wealth of sequence elements regulating mRNA stability has been described, their quantitative contributions to half-life are unknown. Here, we built a quantitative model for Saccharomyces cerevisiae based on functional mRNA sequence features that explains 59% of the half-life variation between genes and predicts half-life at a median relative error of 30%. The model revealed a new destabilizing 3' UTR motif, ATATTC, which we functionally validated. Codon usage proves to be the major determinant of mRNA stability. Nonetheless, single-nucleotide variations have the largest effect when occurring on 3' UTR motifs or upstream AUGs. Analyzing mRNA half-life data of 34 knockout strains showed that the effect of codon usage not only requires functional decapping and deadenylation, but also the 5'-to-3' exonuclease Xrn1, the nonsense-mediated decay genes, but not no-go decay. Altogether, this study quantitatively delineates the contributions of mRNA sequence features on stability in yeast, reveals their functional dependencies on degradation pathways, and allows accurate prediction of half-life from mRNA sequence.


Subjects
RNA Stability/genetics , Fungal RNA/genetics , Fungal RNA/metabolism , Messenger RNA/genetics , Messenger RNA/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , 3' Untranslated Regions/genetics , Base Sequence , Codon/genetics , Codon/metabolism , Gene Knockout Techniques , Fungal Genes , Half-Life , Biological Models , Nonsense Mediated mRNA Decay/genetics , Translational Peptide Chain Initiation , Transcriptional Regulatory Elements , Schizosaccharomyces/genetics , Schizosaccharomyces/metabolism