Results 1 - 20 of 175










1.
Genome Biol ; 21(1): 4, 2020 01 17.
Article in English | MEDLINE | ID: mdl-31948480

ABSTRACT

BACKGROUND: RNA splicing is a key post-transcriptional mechanism that generates protein diversity and contributes to the fine-tuning of gene expression, which may facilitate adaptation to environmental challenges. Here, we employ a systems approach to study alternative splicing changes upon enteric infection in females from classical Drosophila melanogaster strains as well as 38 inbred lines. RESULTS: We find that infection leads to extensive differences in isoform ratios, which results in a more diverse transcriptome with longer 5' untranslated regions (5'UTRs). We establish a role for genetic variation in mediating inter-individual splicing differences, with local splicing quantitative trait loci (local-sQTLs) being preferentially located at the 5' end of transcripts and directly upstream of splice donor sites. Moreover, local-sQTLs are more numerous in the infected state, indicating that acute stress unmasks a substantial number of silent genetic variants. We observe a general increase in intron retention concentrated at the 5' end of transcripts across multiple strains, whose prevalence scales with the degree of pathogen virulence. The length, GC content, and RNA polymerase II occupancy of these introns with increased retention suggest that they have exon-like characteristics. We further uncover that retained intron sequences are enriched for the Lark/RBM4 RNA binding motif. Interestingly, we find that lark is induced by infection in wild-type flies, its overexpression and knockdown alter survival, and tissue-specific overexpression mimics infection-induced intron retention. CONCLUSION: Our collective findings point to pervasive and consistent RNA splicing changes, partly mediated by Lark/RBM4, as being an important aspect of the gut response to infection.

2.
Nucleic Acids Res ; 48(3): 1327-1340, 2020 Feb 20.
Article in English | MEDLINE | ID: mdl-31879760

ABSTRACT

Intron retention (IR) has been proposed to modulate the delay between transcription and translation. Here, we provide an exhaustive characterization of IR in differentiated white blood cells from both the myeloid and lymphoid lineage where we observed the highest levels of IR in monocytes and B-cells, in addition to previously reported granulocytes. During B-cell differentiation, we found an increase in IR from the bone marrow precursors to cells residing in secondary lymphoid organs. B-cells that undergo affinity maturation to become antibody-producing plasma cells steadily decrease retention. In general, we found an inverse relationship between global IR levels and both the proliferative state of cells and the global levels of expression of splicing factors. IR dynamics during B-cell differentiation appear to be conserved between human and mouse, suggesting that IR plays an important, evolutionarily conserved biological role during blood cell differentiation. By correlating the expression of non-core splicing factors with global IR levels, and analyzing RNA binding protein knockdown and eCLIP data, we identify a few splicing factors likely playing an evolutionarily conserved role in IR regulation. Our work provides new insights into the role of IR during hematopoiesis, and on the main factors involved in regulating IR.

3.
NPJ Genom Med ; 4: 31, 2019.
Article in English | MEDLINE | ID: mdl-31814998

ABSTRACT

The developmental and epileptic encephalopathies (DEE) are a group of rare, severe neurodevelopmental disorders, where even the most thorough sequencing studies leave 60-65% of patients without a molecular diagnosis. Here, we explore the incompleteness of transcript models used for exome and genome analysis as one potential explanation for a lack of current diagnoses. To this end, we have updated the GENCODE gene annotation for 191 epilepsy-associated genes, using human brain-derived transcriptomic libraries and other data to build 3,550 putative transcript models. Our annotations increase the transcriptional 'footprint' of these genes by over 674 kb. Using SCN1A as a case study, due to its close phenotype/genotype correlation with Dravet syndrome, we screened 122 people with Dravet syndrome or a similar phenotype with a panel of exon sequences representing eight established genes and identified two de novo SCN1A variants that, through improved gene annotation, are now found to reside within our annotated exons. These two (from 122 screened people, 1.6%) molecular diagnoses carry significant clinical implications. Furthermore, we identified a Dravet syndrome-associated SCN1A variant, previously classified as intronic, that now lies within a deeply conserved exon. Our findings illustrate the potential gains of thorough gene annotation in improving diagnostic yields for genetic disorders.

4.
Genome Res ; 29(11): 1900-1909, 2019 11.
Article in English | MEDLINE | ID: mdl-31645363

ABSTRACT

MicroRNAs (miRNAs) play a critical role as posttranscriptional regulators of gene expression. The ENCODE Project profiled the expression of miRNAs in an extensive set of organs during a time-course of mouse embryonic development and captured the expression dynamics of 785 miRNAs. We found distinct organ-specific and developmental stage-specific miRNA expression clusters, with an overall pattern of increasing organ-specific expression as embryonic development proceeds. Comparative analysis of conserved miRNAs in mouse and human revealed stronger clustering of expression patterns by organ type rather than by species. An analysis of messenger RNA expression clusters compared with miRNA expression clusters identifies the potential role of specific miRNA expression clusters in suppressing the expression of mRNAs specific to other developmental programs in the organ in which these miRNAs are expressed during embryonic development. Our results provide the most comprehensive time-course of miRNA expression as part of an integrated ENCODE reference data set for mouse embryonic development.

5.
J Mol Biol ; 431(22): 4381-4407, 2019 Nov 08.
Article in English | MEDLINE | ID: mdl-31442478

ABSTRACT

Selenoproteins typically contain a single selenocysteine, the 21st amino acid, encoded by a context-redefined UGA. However, human selenoprotein P (SelenoP) has a redox-functioning selenocysteine in its N-terminal domain and nine selenium transporter-functioning selenocysteines in its C-terminal domain. Here we show that diverse SelenoP genes are present across metazoa with highly variable numbers of Sec-UGAs, ranging from a single UGA in certain insects, to 9 in the common spider, and up to 132 in bivalve molluscs. SelenoP genes were shaped by a dynamic evolutionary process linked to selenium usage. Gene evolution featured modular expansions of an ancestral multi-Sec domain, which led to particularly Sec-rich SelenoP proteins in many aquatic organisms. We focused on molluscs, and chose the Pacific oyster Magallana gigas as an experimental model. We show that oyster SelenoP mRNA with 46 UGAs is translated full-length in vivo. Ribosome profiling indicates that selenocysteine specification occurs with ∼5% efficiency at UGA1 and approaches 100% efficiency at distal 3' UGAs. We report genetic elements relevant to its expression, including a leader open reading frame and an RNA structure overlapping the initiation codon that modulates ribosome progression in a selenium-dependent manner. Unlike their mammalian counterparts, the two SECIS elements in oyster SelenoP (3'UTR recoding elements) do not show functional differentiation in vitro. Oysters can increase their tissue selenium level up to 50-fold upon supplementation, which also results in extensive changes in selenoprotein expression.

6.
Nucleic Acids Res ; 47(10): 5293-5306, 2019 06 04.
Article in English | MEDLINE | ID: mdl-30916337

ABSTRACT

Nonsense-mediated decay (NMD) is a eukaryotic mRNA surveillance system that selectively degrades transcripts with premature termination codons (PTC). Many RNA-binding proteins (RBP) regulate their expression levels by a negative feedback loop, in which an RBP binds its own pre-mRNA and causes alternative splicing to introduce a PTC. We present a bioinformatic analysis integrating three data sources (eCLIP assays for a large RBP panel, shRNA inactivation of the NMD pathway, and shRNA depletion of RBPs followed by RNA-seq) to identify novel autoregulatory feedback loops of this kind. We show that RBPs frequently bind their own pre-mRNAs, that their exons respond prominently to NMD pathway disruption, and that the responding exons are enriched with nearby eCLIP peaks. We confirm previously proposed models of autoregulation in SRSF7 and U2AF1 genes and present two novel models, in which (i) SFPQ binds its mRNA and promotes switching to an alternative distal 3'-UTR that is targeted by NMD, and (ii) RPS3 binding activates a poison 5'-splice site in its pre-mRNA that leads to a frame shift and degradation by NMD. We also suggest specific splicing events that could be implicated in autoregulatory feedback loops in RBM39, HNRNPM, and U2AF2 genes. The results are available through a UCSC Genome Browser track hub.


Subjects
Codon, Nonsense , Nonsense Mediated mRNA Decay , RNA Splicing , RNA, Small Interfering/metabolism , Transcriptome , 3' Untranslated Regions , Alternative Splicing , Computational Biology , Exons , Frameshift Mutation , Heterogeneous-Nuclear Ribonucleoprotein Group M/metabolism , Humans , Nuclear Proteins/metabolism , RNA Precursors/metabolism , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Serine-Arginine Splicing Factors/metabolism , Spliceosomes , Splicing Factor U2AF/metabolism
8.
Nucleic Acids Res ; 47(D1): D766-D773, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30357393

ABSTRACT

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.

9.
Genome Res ; 28(12): 1852-1866, 2018 12.
Article in English | MEDLINE | ID: mdl-30459214

ABSTRACT

One of the most important questions in regenerative biology is to unveil how and when genes change expression and trigger regeneration programs. The resetting of gene expression patterns during response to injury is governed by coordinated actions of genomic regions that control the activity of multiple sequence-specific DNA binding proteins. Using genome-wide approaches to interrogate chromatin function, we here identify the elements that regulate tissue recovery in Drosophila imaginal discs, which show a high regenerative capacity after genetically induced cell death. Our findings indicate there is global coregulation of gene expression as well as a regeneration program driven by different types of regulatory elements. Novel enhancers acting exclusively within damaged tissue cooperate with enhancers co-opted from other tissues and other developmental stages, as well as with endogenous enhancers that show increased activity after injury. Together, these enhancers host binding sites for regulatory proteins that include a core set of conserved transcription factors that control regeneration across metazoans.


Subjects
Drosophila/physiology , Gene Expression Regulation , Regeneration/genetics , Response Elements , Animals , Chromatin/genetics , Conserved Sequence , Gene Expression Profiling , Signal Transduction , Transcription, Genetic , Transcriptional Activation , Transcriptome
10.
Curr Protoc Bioinformatics ; 64(1): e56, 2018 12.
Article in English | MEDLINE | ID: mdl-30332532

ABSTRACT

This unit describes the usage of geneid, an efficient gene-finding program that allows for the analysis of large genomic sequences, including whole mammalian chromosomes. These sequences can be partially annotated, and geneid can be used to refine this initial annotation. Training geneid is relatively easy, and parameter configurations exist for a number of eukaryotic species. geneid produces output in a variety of standard formats. The results, thus, can be processed by a variety of software tools, including visualization programs. geneid software is in the public domain, and is undergoing constant development. It is easy to install and use. Exhaustive benchmark evaluations show that geneid compares favorably with other existing gene-finding tools. © 2018 by John Wiley & Sons, Inc.


Subjects
Computational Biology/methods , Genes , Software , Alternative Splicing/genetics , Amino Acid Sequence , Base Sequence , Exons/genetics , Genomics , Guidelines as Topic , Introns/genetics
11.
PLoS Comput Biol ; 14(8): e1006360, 2018 08.
Article in English | MEDLINE | ID: mdl-30118475

ABSTRACT

We present ggsashimi, a command-line tool for the visualization of splicing events across multiple samples. Given a specified genomic region, ggsashimi creates sashimi plots for individual RNA-seq experiments as well as aggregated plots for groups of experiments, a feature unique to this software. Compared to the existing versions of programs generating sashimi plots, it uses popular bioinformatics file formats, is annotation-independent, and allows the visualization of splicing events even for large genomic regions by scaling down the genomic segments between splice sites. ggsashimi is freely available at https://github.com/guigolab/ggsashimi. It is implemented in Python, and internally generates R code for plotting.


Subjects
Computational Biology/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Alternative Splicing , Animals , Computers , Genome , Genomics , Humans , RNA Splicing/physiology , Software
12.
Nat Genet ; 50(9): 1327-1334, 2018 09.
Article in English | MEDLINE | ID: mdl-30127527

ABSTRACT

Coding variants represent many of the strongest associations between genotype and phenotype; however, they exhibit inter-individual differences in effect, termed 'variable penetrance'. Here, we study how cis-regulatory variation modifies the penetrance of coding variants. Using functional genomic and genetic data from the Genotype-Tissue Expression Project (GTEx), we observed that in the general population, purifying selection has depleted haplotype combinations predicted to increase pathogenic coding variant penetrance. Conversely, in cancer and autism patients, we observed an enrichment of penetrance-increasing haplotype configurations for pathogenic variants in disease-implicated genes, providing evidence that regulatory haplotype configuration of coding variants affects disease risk. Finally, we experimentally validated this model by editing a Mendelian single-nucleotide polymorphism (SNP) using CRISPR/Cas9 on distinct expression haplotypes with the transcriptome as a phenotypic readout. Our results demonstrate that joint regulatory and coding variant effects are an important part of the genetic architecture of human traits and contribute to modified penetrance of disease-causing variants.


Subjects
Disease/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , CRISPR-Cas Systems , Genome, Human , Haplotypes , Humans , Phenotype , Quantitative Trait Loci , Transcriptome
13.
Nat Rev Genet ; 19(9): 535-548, 2018 09.
Article in English | MEDLINE | ID: mdl-29795125

ABSTRACT

Gene maps, or annotations, enable us to navigate the functional landscape of our genome. They are a resource upon which virtually all studies depend, from single-gene to genome-wide scales and from basic molecular biology to medical genetics. Yet present-day annotations suffer from trade-offs between quality and size, with serious but often unappreciated consequences for downstream studies. This is particularly true for long non-coding RNAs (lncRNAs), which are poorly characterized compared to protein-coding genes. Long-read sequencing technologies promise to improve current annotations, paving the way towards a complete annotation of lncRNAs expressed throughout a human lifetime.


Subjects
Chromosome Mapping , Gene Expression Profiling , Genome, Human , RNA, Long Noncoding , Transcriptome/physiology , Genome-Wide Association Study , Humans , RNA, Long Noncoding/biosynthesis , RNA, Long Noncoding/genetics
14.
Nat Med ; 24(6): 868-880, 2018 06.
Article in English | MEDLINE | ID: mdl-29785028

ABSTRACT

Chronic lymphocytic leukemia (CLL) is a frequent hematological neoplasm in which underlying epigenetic alterations are only partially understood. Here, we analyze the reference epigenome of seven primary CLLs and the regulatory chromatin landscape of 107 primary cases in the context of normal B cell differentiation. We identify that the CLL chromatin landscape is largely influenced by distinct dynamics during normal B cell maturation. Beyond this, we define extensive catalogues of regulatory elements de novo reprogrammed in CLL as a whole and in its major clinico-biological subtypes classified by IGHV somatic hypermutation levels. We uncover that IGHV-unmutated CLLs harbor more active and open chromatin than IGHV-mutated cases. Furthermore, we show that de novo active regions in CLL are enriched for NFAT, FOX and TCF/LEF transcription factor family binding sites. Although most genetic alterations are not associated with consistent epigenetic profiles, CLLs with MYD88 mutations and trisomy 12 show distinct chromatin configurations. Furthermore, we observe that non-coding mutations in IGHV-mutated CLLs are enriched in H3K27ac-associated regulatory elements outside accessible chromatin. Overall, this study provides an integrative portrait of the CLL epigenome, identifies extensive networks of altered regulatory elements and sheds light on the relationship between the genetic and epigenetic architecture of the disease.


Subjects
Chromatin/metabolism , Epigenomics , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , B-Lymphocytes/metabolism , Base Sequence , Cohort Studies , Humans
15.
Aging Cell ; 17(4): e12740, 2018 08.
Article in English | MEDLINE | ID: mdl-29671950

ABSTRACT

Lifespan varies dramatically among species, but the biological basis is not well understood. Previous studies in model organisms revealed the importance of nutrient sensing, mTOR, NAD/sirtuins, and insulin/IGF1 signaling in lifespan control. By studying life-history traits and transcriptomes of 14 Drosophila species differing more than sixfold in lifespan, we explored expression divergence and identified genes and processes that correlate with longevity. These longevity signatures suggested that longer-lived flies upregulate fatty acid metabolism, downregulate neuronal system development and activin signaling, and alter dynamics of RNA splicing. Interestingly, these gene expression patterns resembled those of flies under dietary restriction and several other lifespan-extending interventions, although on the individual gene level, there was no significant overlap with genes previously reported to have lifespan-extension effects. We experimentally tested the lifespan regulation potential of several candidate genes and found no consistent effects, suggesting that individual genes generally do not explain the observed longevity patterns. Instead, it appears that lifespan regulation across species is modulated by complex relationships at the system level represented by global gene expression.


Subjects
Drosophila/classification , Drosophila/genetics , Longevity/genetics , Transcriptome , Animals , Species Specificity
16.
Nat Commun ; 9(1): 490, 2018 02 13.
Article in English | MEDLINE | ID: mdl-29440659

ABSTRACT

Post-mortem tissue samples are a key resource for investigating patterns of gene expression. However, the processes triggered by death and the post-mortem interval (PMI) can significantly alter physiologically normal RNA levels. We investigate the impact of PMI on gene expression using data from multiple tissues of post-mortem donors obtained from the GTEx project. We find that many genes change expression over relatively short PMIs in a tissue-specific manner, but this potentially confounding effect in a biological analysis can be minimized by taking into account appropriate covariates. By comparing ante- and post-mortem blood samples, we identify the cascade of transcriptional events triggered by death of the organism. These events do not appear to simply reflect stochastic variation resulting from mRNA degradation, but active and ongoing regulation of transcription. Finally, we develop a model to predict the time since death from the analysis of the transcriptome of a few readily accessible tissues.


Subjects
Cold Ischemia , Death , Postmortem Changes , Transcriptome , Blood , Female , Gene Expression , Humans , Models, Biological , RNA, Messenger/genetics , RNA, Messenger/metabolism , Stochastic Processes
17.
Nucleic Acids Res ; 46(3): e15, 2018 02 16.
Article in English | MEDLINE | ID: mdl-29155959

ABSTRACT

Small non-coding RNAs (sncRNAs) are highly abundant molecules that regulate essential cellular processes and are classified according to sequence and structure. Here we argue that read profiles from size-selected RNA sequencing capture the post-transcriptional processing specific to each RNA family, thereby providing functional information independently of sequence and structure. We developed SeRPeNT, a new computational method that exploits reproducibility across replicates and uses dynamic time-warping and density-based clustering algorithms to identify, characterize and compare sncRNAs by harnessing the power of read profiles. We applied SeRPeNT to: (i) generate an extended human annotation with 671 new sncRNAs from known classes and 131 from new potential classes, (ii) show pervasive differential processing of sncRNAs between cell compartments and (iii) predict new molecules with miRNA-like behaviour from snoRNA, tRNA and long non-coding RNA precursors, potentially dependent on the miRNA biogenesis pathway. Furthermore, we validated experimentally four predicted novel non-coding RNAs: a miRNA, a snoRNA-derived miRNA, a processed tRNA and a new uncharacterized sncRNA. SeRPeNT facilitates fast and accurate discovery and characterization of sncRNAs at an unprecedented scale. SeRPeNT code is available under the MIT license at https://github.com/comprna/SeRPeNT.


Subjects
Algorithms , MicroRNAs/genetics , RNA, Long Noncoding/genetics , RNA, Small Nucleolar/genetics , RNA, Small Untranslated/genetics , RNA, Transfer/genetics , Base Sequence , Cluster Analysis , Genetic Profile , High-Throughput Nucleotide Sequencing , Humans , Internet , MicroRNAs/classification , Molecular Sequence Annotation , RNA, Long Noncoding/classification , RNA, Small Nucleolar/classification , RNA, Small Untranslated/classification , RNA, Transfer/classification , Reproducibility of Results , Software
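
Note on record 17: the abstract mentions dynamic time warping as a way to compare read profiles. The following is a minimal, generic sketch of that technique, not the SeRPeNT implementation; the function name `dtw_distance`, the absolute-difference cost, and the toy profiles are assumptions made for illustration only.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric profiles.

    Illustrative only: uses an absolute-difference local cost and no
    windowing constraint. Runs in O(len(a) * len(b)) time.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical read-coverage profiles; the second is a stretched copy of the first
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because warping lets one position align with several positions in the other profile, a profile and a stretched copy of it are at distance zero, which is why the measure suits profiles of differing length.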
18.
Methods Mol Biol ; 1661: 17-28, 2018.
Article in English | MEDLINE | ID: mdl-28917034

ABSTRACT

Selenoproteins contain selenocysteine (Sec or U), the 21st amino acid, inserted in response to an in-frame UGA codon. UGA normally terminates translation, but in selenoprotein mRNAs it is recoded to specify Sec insertion. For this reason, standard gene prediction programs fail to predict Sec codons, and selenoproteins are usually misannotated in protein databases and genome projects. Selenoprofiles is a computational pipeline able to correctly annotate selenoprotein genes in genomic sequences. This program uses a SECIS-independent approach, based on homology searches, and employs curated built-in profile alignments for all known selenoprotein families. Selenoprofiles constitutes the most accurate method for predicting selenoprotein genes belonging to known families.


Subjects
Computational Biology/methods , Selenoproteins/genetics , Software , Codon, Terminator , Databases, Protein , Genomics/methods , Molecular Sequence Annotation , Selenocysteine/chemistry , Selenocysteine/genetics , Selenoproteins/chemistry , User-Computer Interface , Web Browser
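
Note on record 18: the abstract explains that an in-frame UGA is read as a stop codon by standard gene predictors, although in selenoprotein mRNAs it specifies selenocysteine. The sketch below only lists in-frame UGA codons to show why sequence alone is ambiguous. It is not how Selenoprofiles works (the abstract describes a homology-based approach), and the function name and example sequence are hypothetical.

```python
def in_frame_uga_positions(mrna, start=0):
    """Return 0-based offsets of UGA codons in the reading frame beginning at `start`.

    Accepts RNA or DNA letters (T is treated as U). A naive predictor would
    end the protein at the first offset returned; deciding whether a UGA is
    a stop or a Sec codon needs information beyond the codon itself.
    """
    seq = mrna.upper().replace("T", "U")
    return [i for i in range(start, len(seq) - 2, 3) if seq[i:i + 3] == "UGA"]


if __name__ == "__main__":
    # Hypothetical toy transcript: AUG UGA GGC UGA
    print(in_frame_uga_positions("AUGUGAGGCUGA"))
```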
19.
F1000Res ; 7, 2018.
Article in English | MEDLINE | ID: mdl-30613379

ABSTRACT

At the beginning of this century, the Human Genome Project produced the first drafts of the human genome sequence. Following this, large-scale functional genomics studies were initiated to understand the molecular basis underlying the translation of the instructions encoded in the genome into the biological traits of organisms. Instrumental in the ensuing revolution in functional genomics were the rapid advances in massively parallel sequencing technologies as well as the development of a wide diversity of protocols that make use of these technologies to understand cellular behavior at the molecular level. Here, we review recent advances in functional genomic methods, discuss some of their current capabilities and limitations, and briefly sketch future directions within the field.


Subjects
Genomics/trends , Genome, Human/genetics , Genomics/methods , Human Genome Project , Humans , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/trends
20.
Nat Genet ; 49(12): 1731-1740, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29106417

ABSTRACT

Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Molecular Sequence Annotation/methods , RNA, Long Noncoding/genetics , Animals , Gene Expression Profiling/methods , Genomics/methods , Humans , Mice , Open Reading Frames/genetics , Reproducibility of Results