Results 1 - 9 of 9
1.
Bioinformatics ; 34(10): 1621-1628, 2018 05 15.
Article in English | MEDLINE | ID: mdl-29281000

ABSTRACT

Motivation: Although the amount of small non-coding RNA-sequencing data is continuously increasing, it is still unclear to what extent small RNAs are represented in the human genome. Results: In this study we analyzed 303 billion sequencing reads from nearly 25 000 datasets to answer this question. We determined that 0.8% of the human genome is reliably covered by 874 123 regions with an average length of 31 nt. On the basis of these regions, we found that among the known small non-coding RNA classes, microRNAs were the most prevalent. In subsequent steps, we characterized variations of miRNAs and performed a staged validation of 11 877 candidate miRNAs. Of these, many were actually expressed and significantly dysregulated in lung cancer. Selected candidates were finally validated by northern blots. Although isolated miRNAs could still be present in the human genome, our presented set likely contains the largest fraction of human miRNAs. Contact: c.backes@mx.uni-saarland.de or andreas.keller@ccb.uni-saarland.de. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome, Human , MicroRNAs , Sequence Analysis, DNA , Transcriptome , Genomics , High-Throughput Nucleotide Sequencing , Humans , Lung Neoplasms/genetics , Polymorphism, Single Nucleotide , Sequence Analysis, RNA
2.
Bioinformatics ; 33(7): 988-996, 2017 04 01.
Article in English | MEDLINE | ID: mdl-27993777

ABSTRACT

Motivation: The aim of this study is to assess the performance of RNA-RNA interaction prediction tools for all domains of life. Results: Minimum free energy (MFE) and alignment methods constitute most of the current RNA interaction prediction algorithms. The MFE tools that include accessibility in the final predicted binding energy (i.e. RNAup, IntaRNA and RNAplex) have better true positive rates (TPRs) with high positive predictive values (PPVs) in all datasets than other methods. They can also differentiate almost half of the native interactions from background. The algorithms that include effects of internal binding energies in their model and the alignment methods seem to have high TPR but relatively low associated PPV compared to accessibility-based methods. Availability and Implementation: We shared our wrapper scripts and datasets on GitHub (github.com/UCanCompBio/RNA_Interactions_Benchmark). All parameters are documented for personal use. Contact: sinan.umu@pg.canterbury.ac.nz. Supplementary information: Supplementary data are available at Bioinformatics online.
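
The two metrics the abstract reports can be illustrated with a minimal sketch. It uses the standard definitions TPR = TP / (TP + FN) and PPV = TP / (TP + FP); the function name `tpr_ppv` and the toy interaction pairs are hypothetical, and this is not the authors' benchmark code (their wrapper scripts are in the GitHub repository cited above).

```python
def tpr_ppv(predicted, native):
    """Return (TPR, PPV) for predicted interactions versus known native ones.

    Both arguments are iterables of hashable interaction identifiers,
    here (RNA, target) tuples chosen purely for illustration.
    """
    predicted, native = set(predicted), set(native)
    tp = len(predicted & native)   # native interactions that were predicted
    fp = len(predicted - native)   # predictions that are not native
    fn = len(native - predicted)   # native interactions that were missed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity / recall
    ppv = tp / (tp + fp) if (tp + fp) else 0.0  # precision
    return tpr, ppv


# Toy example with made-up identifiers (assumption, not data from the study).
predicted_pairs = {("sRNA_a", "mRNA_x"), ("sRNA_b", "mRNA_y"), ("sRNA_c", "mRNA_z")}
native_pairs = {("sRNA_a", "mRNA_x"), ("sRNA_b", "mRNA_y"),
                ("sRNA_d", "mRNA_w"), ("sRNA_e", "mRNA_v")}
tpr, ppv = tpr_ppv(predicted_pairs, native_pairs)
print(f"TPR={tpr:.2f} PPV={ppv:.2f}")
```

A tool that predicts many interactions tends to raise TPR at the expense of PPV, which is the trade-off the abstract describes for the internal-binding-energy and alignment methods.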


Subject(s)
Algorithms , Benchmarking , RNA/metabolism , Bacteria/genetics , Databases, Genetic , Models, Theoretical , RNA/chemistry , Sequence Analysis, RNA
3.
RNA Biol ; 15(2): 242-250, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29219730

ABSTRACT

Non-coding RNA (ncRNA) molecules have fundamental roles in cells and many are also stable in body fluids as extracellular RNAs. In this study, we used RNA sequencing (RNA-seq) to investigate the profile of small non-coding RNA (sncRNA) in human serum. We analyzed 10 billion Illumina reads from 477 serum samples, included in the Norwegian population-based Janus Serum Bank (JSB). We found that the core serum RNA repertoire includes 258 micro RNAs (miRNA), 441 piwi-interacting RNAs (piRNA), 411 transfer RNAs (tRNA), 24 small nucleolar RNAs (snoRNA), 125 small nuclear RNAs (snRNA) and 123 miscellaneous RNAs (misc-RNA). We also investigated biological and technical variation in expression, and the results suggest that many RNA molecules identified in serum contain signs of biological variation. They are therefore unlikely to be random degradation by-products. In addition, the presence of specific fragments of tRNA, snoRNA, Vault RNA and Y_RNA indicates protection from degradation. Our results suggest that many circulating RNAs in serum can be potential biomarkers.


Subject(s)
Cell-Free Nucleic Acids/blood , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Biomarkers/blood , Biomarkers/chemistry , Cell-Free Nucleic Acids/chemistry , Gene Expression Regulation , Humans , MicroRNAs/blood , MicroRNAs/chemistry , RNA Stability , RNA, Small Interfering/blood , RNA, Small Interfering/chemistry , RNA, Small Nucleolar/blood , RNA, Small Nucleolar/chemistry , RNA, Small Untranslated/blood , RNA, Small Untranslated/chemistry , RNA, Transfer/blood , RNA, Transfer/chemistry
4.
PLoS Comput Biol ; 10(10): e1003907, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25357249

ABSTRACT

Noncoding RNAs are integral to a wide range of biological processes, including translation, gene regulation, host-pathogen interactions and environmental sensing. While genomics is now a mature field, our capacity to identify noncoding RNA elements in bacterial and archaeal genomes is hampered by the difficulty of de novo identification. The emergence of new technologies for characterizing transcriptome outputs, notably RNA-seq, is improving noncoding RNA identification and expression quantification. However, a major challenge is to robustly distinguish functional outputs from transcriptional noise. To establish whether annotation of existing transcriptome data has effectively captured all functional outputs, we analysed over 400 publicly available RNA-seq datasets spanning 37 different Archaea and Bacteria. Using comparative tools, we identify close to a thousand highly-expressed candidate noncoding RNAs. However, our analyses reveal that the capacity to identify noncoding RNA outputs is strongly dependent on phylogenetic sampling. Surprisingly, and in stark contrast to protein-coding genes, the phylogenetic window for effective use of comparative methods is perversely narrow: aggregating public datasets only produced one phylogenetic cluster where these tools could be used to robustly separate unannotated noncoding RNAs from a null hypothesis of transcriptional noise. Our results show that for the full potential of transcriptomics data to be realized, a change in experimental design is paramount: effective transcriptomics requires phylogeny-aware sampling.


Subject(s)
Gene Expression Profiling/methods , RNA, Untranslated/classification , RNA, Untranslated/genetics , Transcriptome/genetics , Archaea/genetics , Bacteria/genetics , Cluster Analysis , Computational Biology , Databases, Genetic , Phylogeny , RNA, Archaeal/chemistry , RNA, Archaeal/classification , RNA, Archaeal/genetics , RNA, Bacterial/chemistry , RNA, Bacterial/classification , RNA, Bacterial/genetics , RNA, Untranslated/chemistry
5.
Cell Genom ; 3(8): 100348, 2023 Aug 09.
Article in English | MEDLINE | ID: mdl-37601971

ABSTRACT

The annotation of microRNAs depends on the availability of transcriptomics data and expert knowledge. This has led to a gap between the availability of novel genomes and high-quality microRNA complements. Using >16,000 microRNAs from the manually curated microRNA gene database MirGeneDB, we generated trained covariance models for all conserved microRNA families. These models are available in our tool MirMachine, which annotates conserved microRNAs within genomes. We successfully applied MirMachine to a range of animal species, including those with large genomes and genome duplications and extinct species, where small RNA sequencing is hard to achieve. We further describe a microRNA score of expected microRNAs that can be used to assess the completeness of genome assemblies. MirMachine closes a long-persisting gap in the microRNA field by facilitating automated genome annotation pipelines and deeper studies into the evolution of genome regulation, even in extinct organisms.

6.
Tumour Virus Res ; 12: 200221, 2021 12.
Article in English | MEDLINE | ID: mdl-34175494

ABSTRACT

Human papillomavirus (HPV) 16 and 18 are the most predominant types in cervical cancer. Only a small fraction of HPV infections progress to cancer, indicating that additional factors and genomic events contribute to the carcinogenesis, such as minor nucleotide variation caused by APOBEC3 and chromosomal integration. We analysed intra-host minor nucleotide variants (MNVs) and integration in HPV16 and HPV18 positive cervical samples with different morphology. Samples were sequenced using an HPV whole genome sequencing protocol TaME-seq. A total of 80 HPV16 and 51 HPV18 positive samples passed the sequencing depth criteria of 300× reads, showing the following distribution: non-progressive disease (HPV16 n = 21, HPV18 n = 12); cervical intraepithelial neoplasia (CIN) grade 2 (HPV16 n = 27, HPV18 n = 9); CIN3/adenocarcinoma in situ (AIS) (HPV16 n = 27, HPV18 n = 30); cervical cancer (HPV16 n = 5). Similar numbers of MNVs in HPV16 and HPV18 samples were observed for most viral genes, with the exception of HPV18 E4 with higher numbers across clinical categories. APOBEC3 signatures were observed in HPV16 lesions, while similar mutation patterns were not detected for HPV18. The proportion of samples with integration was 13% for HPV16 and 59% for HPV18 positive samples, with a noticeable portion located within or close to cancer-related genes.


Subject(s)
APOBEC Deaminases/genetics , Papillomavirus Infections , Uterine Cervical Dysplasia , Uterine Cervical Neoplasms , Cervix Uteri , Female , Human papillomavirus 16 , Human papillomavirus 18 , Humans , Papillomavirus Infections/diagnosis , Uterine Cervical Neoplasms/diagnosis , Uterine Cervical Neoplasms/virology , Uterine Cervical Dysplasia/diagnosis , Uterine Cervical Dysplasia/virology
7.
Mol Oncol ; 14(2): 235-247, 2020 02.
Article in English | MEDLINE | ID: mdl-31851411

ABSTRACT

The majority of lung cancer (LC) patients are diagnosed at a late stage, and survival is poor. Circulating RNA molecules are known to have a role in cancer; however, their involvement before diagnosis remains an open question. In this study, we investigated circulating RNA dynamics in prediagnostic LC samples, focusing on smokers, to identify if and when disease-related signals can be detected in serum. We sequenced small RNAs in 542 serum LC samples donated up to 10 years before diagnosis and 519 matched cancer-free controls coming from 905 individuals in the Janus Serum Bank. This sample size provided sufficient statistical power to independently analyze time to diagnosis, stage, and histology. The results showed dynamic changes in differentially expressed circulating RNAs specific to LC histology and stage. The greatest number of differentially expressed RNAs was identified around 7 years before diagnosis for early-stage LC and 1-4 years prior to diagnosis for locally advanced and advanced-stage LC, regardless of LC histology. Furthermore, NSCLC and SCLC histologies have distinct prediagnostic signals. The majority of differentially expressed RNAs were associated with cancer-related pathways. The dynamic RNA signals pinpointed different phases of tumor development over time. Stage-specific RNA profiles may be associated with tumor aggressiveness. Our results improve the molecular understanding of carcinogenesis. They indicate substantial opportunity for screening and improved treatment and will guide further research on early detection of LC. However, the dynamic nature of the RNA signals also suggests challenges for prediagnostic biomarker discovery.


Subject(s)
Biomarkers, Tumor/blood , Carcinogenesis/genetics , Cell-Free Nucleic Acids/blood , Lung Neoplasms/blood , Lung Neoplasms/diagnosis , RNA, Small Untranslated/blood , Adult , Biomarkers, Tumor/genetics , Blood Banks , Carcinoma, Non-Small-Cell Lung/blood , Carcinoma, Non-Small-Cell Lung/genetics , Case-Control Studies , Female , Follow-Up Studies , Gene Expression Regulation, Neoplastic/genetics , Humans , Lung Neoplasms/genetics , Male , Middle Aged , RNA, Small Untranslated/genetics , RNA-Seq , Time Factors
8.
Sci Rep ; 9(1): 524, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30679491

ABSTRACT

HPV genomic variability and chromosomal integration are important in the HPV-induced carcinogenic process. To uncover these genomic events in an HPV infection, we have developed an innovative and cost-effective sequencing approach named TaME-seq (tagmentation-assisted multiplex PCR enrichment sequencing). TaME-seq combines tagmentation and multiplex PCR enrichment for simultaneous analysis of HPV variation and chromosomal integration, and it can also be adapted to other viruses. For method validation, cell lines (n = 4), plasmids (n = 3), and HPV16, 18, 31, 33 and 45 positive clinical samples (n = 21) were analysed. Our results showed deep HPV genome-wide sequencing coverage. Chromosomal integration breakpoints and large deletions were identified in HPV positive cell lines and in one clinical sample. HPV genomic variability was observed in all samples allowing identification of low frequency variants. In contrast to other approaches, TaME-seq proved to be highly efficient in HPV target enrichment, leading to reduced sequencing costs. Comprehensive studies on HPV intra-host variability generated during a persistent infection will improve our understanding of viral carcinogenesis. Efficient identification of both HPV variability and integration sites will be important for the study of HPV evolution and adaptability and may be an important tool for use in cervical cancer diagnostics.


Subject(s)
Alphapapillomavirus/genetics , Multiplex Polymerase Chain Reaction/methods , Papillomavirus Infections/virology , Alphapapillomavirus/physiology , Chromosome Breakpoints , Female , Genetic Variation , Genome, Viral , Human papillomavirus 16/genetics , Human papillomavirus 16/physiology , Human papillomavirus 18/genetics , Human papillomavirus 18/physiology , Humans , Papillomavirus Infections/genetics , Uterine Cervical Neoplasms/genetics , Uterine Cervical Neoplasms/virology , Virus Integration
9.
Elife ; 5, 2016 09 20.
Article in English | MEDLINE | ID: mdl-27642845

ABSTRACT

A critical assumption of gene expression analysis is that mRNA abundances broadly correlate with protein abundance, but these two are often imperfectly correlated. Some of the discrepancy can be accounted for by two important mRNA features: codon usage and mRNA secondary structure. We present a new global factor, called mRNA:ncRNA avoidance, and provide evidence that avoidance increases translational efficiency. We also demonstrate a strong selection for the avoidance of stochastic mRNA:ncRNA interactions across prokaryotes, and that these have a greater impact on protein abundance than mRNA structure or codon usage. By generating synonymously variant green fluorescent protein (GFP) mRNAs with different potential for mRNA:ncRNA interactions, we demonstrate that GFP levels correlate well with interaction avoidance. Therefore, taking stochastic mRNA:ncRNA interactions into account enables precise modulation of protein abundance.


Subject(s)
Archaea/genetics , Archaea/metabolism , Bacteria/genetics , Bacteria/metabolism , Protein Biosynthesis , RNA, Messenger/metabolism , RNA, Untranslated/metabolism , Genes, Reporter , Green Fluorescent Proteins/analysis , Green Fluorescent Proteins/genetics