Results 1 - 18 of 18
2.
bioRxiv ; 2023 Jun 18.
Article in English | MEDLINE | ID: mdl-37398314

ABSTRACT

Long-read RNA sequencing is essential to produce accurate and exhaustive annotation of eukaryotic genomes. Despite advancements in throughput and accuracy, achieving reliable end-to-end identification of RNA transcripts remains a challenge for long-read sequencing methods. To address this limitation, we developed CapTrap-seq, a cDNA library preparation method, which combines the Cap-trapping strategy with oligo(dT) priming to detect 5'capped, full-length transcripts, together with the data processing pipeline LyRic. We benchmarked CapTrap-seq and other popular RNA-seq library preparation protocols in a number of human tissues using both ONT and PacBio sequencing. To assess the accuracy of the transcript models produced, we introduced a capping strategy for synthetic RNA spike-in sequences that mimics the natural 5'cap formation in RNA spike-in molecules. We found that the vast majority (up to 90%) of transcript models that LyRic derives from CapTrap-seq reads are full-length. This makes it possible to produce highly accurate annotations with minimal human intervention.

3.
Nucleic Acids Res ; 51(D1): D942-D949, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36420896

ABSTRACT

GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subjects
Computational Biology , Genome, Human , Humans , Animals , Mice , Molecular Sequence Annotation , Computational Biology/methods , Genome, Human/genetics , Transcriptome/genetics , Gene Expression Profiling , Databases, Genetic
4.
Life Sci Alliance ; 6(1), 2023 01.
Article in English | MEDLINE | ID: mdl-36283702

ABSTRACT

Most mitochondrial proteins are encoded by nuclear genes, synthesized in the cytosol and targeted into the organelle. To characterize the spatial organization of mitochondrial gene products in zebrafish (Danio rerio), we sequenced RNA from different cellular fractions. Our results confirmed the presence of nuclear-encoded mRNAs in the mitochondrial fraction, which, in unperturbed conditions, are mainly transcripts encoding large proteins with specific properties, like transmembrane domains. To further explore the principles of mitochondrial protein compartmentalization in zebrafish, we quantified the transcriptomic changes for each subcellular fraction triggered by the chchd4a -/- mutation, which disrupts mitochondrial protein import. Our results indicate that the proteostatic stress further restricts the population of transcripts on the mitochondrial surface, allowing only the largest and the most evolutionarily conserved proteins to be synthesized there. We also show that many nuclear-encoded mitochondrial transcripts translated by the cytosolic ribosomes stay resistant to the global translation shutdown. Thus, vertebrates, in contrast to yeast, are not likely to use localized translation to facilitate synthesis of mitochondrial proteins under proteostatic stress conditions.


Subjects
Genes, Mitochondrial , Zebrafish , Animals , Zebrafish/genetics , Mitochondrial Proteins/genetics , Mitochondrial Proteins/metabolism , Mitochondria/genetics , Mitochondria/metabolism , RNA, Messenger/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Nuclear Proteins/genetics
5.
Elife ; 10, 2021 07 20.
Article in English | MEDLINE | ID: mdl-34292154

ABSTRACT

Mitochondria are organelles with their own genomes, but they rely on the import of nuclear-encoded proteins that are translated by cytosolic ribosomes. Therefore, it is important to understand whether failures in the mitochondrial uptake of these nuclear-encoded proteins can cause proteotoxic stress and identify response mechanisms that may counteract it. Here, we report that upon impairments in mitochondrial protein import, high-risk precursor and immature forms of mitochondrial proteins form aberrant deposits in the cytosol. These deposits then cause further cytosolic accumulation and consequently aggregation of other mitochondrial proteins and disease-related proteins, including α-synuclein and amyloid β. This aggregation triggers a cytosolic protein homeostasis imbalance that is accompanied by specific molecular chaperone responses at both the transcriptomic and protein levels. Altogether, our results provide evidence that mitochondrial dysfunction, specifically protein import defects, contributes to impairments in protein homeostasis, thus revealing a possible molecular mechanism by which mitochondria are involved in neurodegenerative diseases.


Subjects
Alzheimer Disease/metabolism , Cytosol/metabolism , Mitochondria/metabolism , Mitochondrial Proteins/metabolism , Protein Aggregates , Proteostasis , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Alzheimer Disease/genetics , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Databases, Genetic , Heat-Shock Proteins/genetics , Heat-Shock Proteins/metabolism , Humans , Mitochondria/genetics , Mitochondrial Proteins/genetics , Molecular Chaperones/genetics , Molecular Chaperones/metabolism , Protein Transport , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics
6.
Nucleic Acids Res ; 49(D1): D916-D923, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33270111

ABSTRACT

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subjects
COVID-19/prevention & control , Computational Biology/methods , Databases, Genetic , Genomics/methods , Molecular Sequence Annotation/methods , SARS-CoV-2/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Epidemics , Humans , Internet , Mice , Pseudogenes/genetics , RNA, Long Noncoding/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Transcription, Genetic/genetics
7.
Methods Mol Biol ; 2254: 133-159, 2021.
Article in English | MEDLINE | ID: mdl-33326074

ABSTRACT

Metazoan genomes produce thousands of long-noncoding RNAs (lncRNAs), of which just a small fraction have been well characterized. Understanding their biological functions requires accurate annotations, or maps of the precise location and structure of genes and transcripts in the genome. Current lncRNA annotations are limited by compromises between quality and size, with many gene models being fragmentary or uncatalogued. To overcome this, the GENCODE consortium has developed RNA capture long-read sequencing (CLS), an approach combining targeted RNA capture with third-generation long-read sequencing. CLS provides accurate annotations at high-throughput rates. It eliminates the need for noisy transcriptome assembly from short reads, and requires minimal manual curation. The full-length transcript models produced are of quality comparable to present-day manually curated annotations. Here we describe a detailed CLS protocol, from probe design through long-read sequencing to creation of final annotations.


Subjects
High-Throughput Nucleotide Sequencing/methods , Molecular Sequence Annotation/methods , RNA, Long Noncoding/genetics , Animals , Computational Biology/methods , Data Curation , Sequence Analysis, RNA
8.
Pharmacol Res ; 161: 105249, 2020 11.
Article in English | MEDLINE | ID: mdl-33068730

ABSTRACT

The molecular complexity of human breast cancer (BC) renders the clinical management of the disease challenging. Long non-coding RNAs (lncRNAs) are promising biomarkers for BC patient stratification, early detection, and disease monitoring. Here, we identified the involvement of the long intergenic non-coding RNA 01087 (LINC01087) in breast oncogenesis. LINC01087 appeared significantly downregulated in triple-negative BCs (TNBCs) and upregulated in the luminal BC subtypes in comparison to mammary samples from cancer-free women and matched normal cancer pairs. Interestingly, deregulation of LINC01087 made it possible to accurately distinguish between luminal and TNBC specimens, independently of the clinicopathological parameters, and of the histological and TP53 or BRCA1/2 mutational status. Moreover, increased expression of LINC01087 predicted a better prognosis in luminal BCs, while TNBC tumors that harbored lower levels of LINC01087 were associated with reduced relapse-free survival. Furthermore, bioinformatics analyses were performed on TNBC and luminal BC samples and suggested that the putative tumor suppressor activity of LINC01087 may rely on interferences with pathways involved in cell survival, proliferation, adhesion, invasion, inflammation and drug sensitivity. Altogether, these data suggest that the assessment of LINC01087 deregulation could represent a novel, specific and promising biomarker not only for the diagnosis and prognosis of luminal BC subtypes and TNBCs, but also as a predictive biomarker of pharmacological interventions.


Subjects
Biomarkers, Tumor/metabolism , Breast Neoplasms/metabolism , RNA, Long Noncoding/metabolism , Triple Negative Breast Neoplasms/metabolism , Biomarkers, Tumor/genetics , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , MCF-7 Cells , Neoplasm Metastasis , Neoplasm Recurrence, Local , Progression-Free Survival , Protein Interaction Maps , RNA, Long Noncoding/genetics , Signal Transduction , Time Factors , Transcriptome , Triple Negative Breast Neoplasms/drug therapy , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/pathology
9.
NPJ Genom Med ; 4: 31, 2019.
Article in English | MEDLINE | ID: mdl-31814998

ABSTRACT

The developmental and epileptic encephalopathies (DEE) are a group of rare, severe neurodevelopmental disorders, where even the most thorough sequencing studies leave 60-65% of patients without a molecular diagnosis. Here, we explore the incompleteness of transcript models used for exome and genome analysis as one potential explanation for a lack of current diagnoses. Therefore, we have updated the GENCODE gene annotation for 191 epilepsy-associated genes, using human brain-derived transcriptomic libraries and other data to build 3,550 putative transcript models. Our annotations increase the transcriptional 'footprint' of these genes by over 674 kb. Using SCN1A as a case study, due to its close phenotype/genotype correlation with Dravet syndrome, we screened 122 people with Dravet syndrome or a similar phenotype with a panel of exon sequences representing eight established genes and identified two de novo SCN1A variants that now - through improved gene annotation - are ascribed to residing among our exons. These two (from 122 screened people, 1.6%) molecular diagnoses carry significant clinical implications. Furthermore, we identified a previously classified SCN1A intronic Dravet syndrome-associated variant that now lies within a deeply conserved exon. Our findings illustrate the potential gains of thorough gene annotation in improving diagnostic yields for genetic disorders.

10.
J Cell Sci ; 132(8), 2019 04 26.
Article in English | MEDLINE | ID: mdl-31028152

ABSTRACT

The production of newly synthesized proteins is vital for all cellular functions and is a determinant of cell growth and proliferation. The synthesis of polypeptide chains from mRNA molecules requires sophisticated machineries and mechanisms that need to be tightly regulated, and adjustable to current needs of the cell. Failures in the regulation of translation contribute to the loss of protein homeostasis, which can have deleterious effects on cellular function and organismal health. Unsurprisingly, the regulation of translation appears to be a crucial element in stress response mechanisms. This review provides an overview of mechanisms that modulate cytosolic protein synthesis upon cellular stress, with a focus on the attenuation of translation in response to mitochondrial stress. We then highlight links between mitochondrion-derived reactive oxygen species and the attenuation of reversible cytosolic translation through the oxidation of ribosomal proteins at their cysteine residues. We also discuss emerging concepts of how cellular mechanisms to stress are adapted, including the existence of alternative ribosomes and stress granules, and the regulation of co-translational import upon organelle stress.


Subjects
Mitochondria/metabolism , Protein Biosynthesis , Ribosomes/metabolism , Cell Growth Processes , Cysteine/metabolism , Humans , Mitochondria/genetics , Oxidative Stress , Proteostasis , Reactive Oxygen Species/metabolism , Ribosomal Proteins/metabolism , Ribosomes/genetics , Signal Transduction
12.
Nucleic Acids Res ; 47(D1): D766-D773, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357393

ABSTRACT

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.


Subjects
Databases, Genetic , Genome, Human/genetics , Genomics , Pseudogenes/genetics , Animals , Computational Biology , Humans , Internet , Mice , Molecular Sequence Annotation , Software
13.
PLoS Genet ; 14(11): e1007743, 2018 11.
Article in English | MEDLINE | ID: mdl-30457989

ABSTRACT

Development and function of tissues and organs are powered by the activity of mitochondria. In humans, inherited genetic mutations that lead to progressive mitochondrial pathology often manifest during infancy and can lead to death, reflecting the indispensable nature of mitochondrial biogenesis and function. Here, we describe a zebrafish mutant for the gene mia40a (chchd4a), the life-essential homologue of the evolutionarily conserved Mia40 oxidoreductase which drives the biogenesis of cysteine-rich mitochondrial proteins. We report that mia40a mutant animals undergo progressive cellular respiration defects and develop enlarged mitochondria in skeletal muscles before their ultimate death at the larval stage. We generated a deep transcriptomic and proteomic resource that allowed us to identify abnormalities in the development and physiology of endodermal organs, in particular the liver and pancreas. We identify the acinar cells of the exocrine pancreas to be severely affected by mutations in the MIA pathway. Our data contribute to a better understanding of the molecular, cellular and organismal effects of mitochondrial deficiency, important for the accurate diagnosis and future treatment strategies of mitochondrial diseases.

14.
Nat Rev Genet ; 19(9): 535-548, 2018 09.
Article in English | MEDLINE | ID: mdl-29795125

ABSTRACT

Gene maps, or annotations, enable us to navigate the functional landscape of our genome. They are a resource upon which virtually all studies depend, from single-gene to genome-wide scales and from basic molecular biology to medical genetics. Yet present-day annotations suffer from trade-offs between quality and size, with serious but often unappreciated consequences for downstream studies. This is particularly true for long non-coding RNAs (lncRNAs), which are poorly characterized compared to protein-coding genes. Long-read sequencing technologies promise to improve current annotations, paving the way towards a complete annotation of lncRNAs expressed throughout a human lifetime.


Subjects
Chromosome Mapping , Gene Expression Profiling , Genome, Human , RNA, Long Noncoding , Transcriptome/physiology , Genome-Wide Association Study , Humans , RNA, Long Noncoding/biosynthesis , RNA, Long Noncoding/genetics
15.
Int J Oncol ; 52(3): 656-678, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29286103

ABSTRACT

Acute myeloid leukemia (AML) is the most common and severe form of acute leukemia diagnosed in adults. Owing to its heterogeneity, AML is divided into classes associated with different treatment outcomes and specific gene expression profiles. Based on previous studies on AML, in this study, we designed and generated an AML-array containing 900 oligonucleotide probes complementary to human genes implicated in hematopoietic cell differentiation and maturation, proliferation, apoptosis and leukemic transformation. The AML-array was used to hybridize 118 samples from 33 patients with AML of the M1 and M2 subtypes of the French-American-British (FAB) classification and 15 healthy volunteers (HV). Rigorous analysis of the microarray data revealed that 83 genes were differentially expressed between the patients with AML and the HV, including genes not yet discussed in the context of AML pathogenesis. The most overexpressed genes in AML were STMN1, KITLG, CDK6, MCM5, KRAS, CEBPA, MYC, ANGPT1, SRGN, RPLP0, ENO1 and SET, whereas the most underexpressed genes were IFITM1, LTB, FCN1, BIRC3, LYZ, ADD3, S100A9, FCER1G, PTRPE, CD74 and TMSB4X. The overexpression of the CPA3 gene was specific for AML with mutated NPM1 and FLT3. Although the microarray-based method was insufficient to differentiate between any other AML subgroups, quantitative PCR approaches enabled us to identify 3 genes (ANXA3, S100A9 and WT1) whose expression can be used to discriminate between the 2 studied AML FAB subtypes. The expression levels of the ANXA3 and S100A9 genes were increased, whereas those of WT1 were decreased in the AML-M2 compared to the AML-M1 group. We also examined the association between the STMN1, CAT and ABL1 genes, and the FLT3 and NPM1 mutation status. FLT3+/NPM1- AML was associated with the highest expression of STMN1, and ABL1 was upregulated in FLT3+ AML and CAT in FLT3- AML, irrespectively of the NPM1 mutation status. Moreover, our results indicated that CAT and WT1 gene expression levels correlated with the response to therapy. CAT expression was highest in patients who remained longer under complete remission, whereas WT1 expression increased with treatment resistance. On the whole, this study demonstrates that the AML-array can potentially serve as a first-line screening tool, and may be helpful for the diagnosis of AML, whereas the differentiation between AML subgroups can be more successfully performed with PCR-based analysis of a few marker genes.


Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Biomarkers, Tumor/genetics , Gene Expression Profiling/methods , Leukemia, Myeloid, Acute/genetics , Oligonucleotide Array Sequence Analysis/methods , Adolescent , Adult , Aged , Catalase/genetics , Catalase/metabolism , Drug Resistance, Neoplasm/genetics , Female , Humans , Leukemia, Myeloid, Acute/diagnosis , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/pathology , Male , Middle Aged , Mutation , Nucleophosmin , Prognosis , Real-Time Polymerase Chain Reaction/methods , Remission Induction/methods , Sequence Analysis, RNA/methods , Treatment Outcome , WT1 Proteins/genetics , WT1 Proteins/metabolism , Young Adult
16.
Nat Genet ; 49(12): 1731-1740, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29106417

ABSTRACT

Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Molecular Sequence Annotation/methods , RNA, Long Noncoding/genetics , Animals , Gene Expression Profiling/methods , Genomics/methods , Humans , Mice , Open Reading Frames/genetics , Reproducibility of Results
18.
Nat Commun ; 7: 12339, 2016 08 17.
Article in English | MEDLINE | ID: mdl-27531712

ABSTRACT

Long non-coding RNAs (lncRNAs) constitute a large, yet mostly uncharacterized fraction of the mammalian transcriptome. Such characterization requires a comprehensive, high-quality annotation of their gene structure and boundaries, which is currently lacking. Here we describe RACE-Seq, an experimental workflow designed to address this based on RACE (rapid amplification of cDNA ends) and long-read RNA sequencing. We apply RACE-Seq to 398 human lncRNA genes in seven tissues, leading to the discovery of 2,556 on-target, novel transcripts. About 60% of the targeted loci are extended in either 5' or 3', often reaching genomic hallmarks of gene boundaries. Analysis of the novel transcripts suggests that lncRNAs are as long, have as many exons and undergo as much alternative splicing as protein-coding genes, contrary to current assumptions. Overall, we show that RACE-Seq is an effective tool to annotate an organism's deep transcriptome, and compares favourably to other targeted sequencing techniques.


Subjects
High-Throughput Nucleotide Sequencing/methods , Polymerase Chain Reaction/methods , RNA, Long Noncoding/genetics , Sequence Analysis, RNA/methods , Exons/genetics , Genetic Loci , Humans , Molecular Sequence Annotation , Organ Specificity/genetics , Proof of Concept Study , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA Splice Sites/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcriptome/genetics