Search | BVS Colombia Search Portal

Global Analysis of the Human RNA Degradome Reveals Widespread Decapped and Endonucleolytic Cleaved Transcripts.

Won, Jung-Im; Shin, JaeMoon; Park, So Young; Yoon, JeeHee; Jeong, Dong-Hoon.

Int J Mol Sci ; 21(18), 2020 Sep 04.

Article in English | MEDLINE | ID: mdl-32899599

ABSTRACT

RNA decay is an important regulatory mechanism for gene expression at the posttranscriptional level. Although the main pathways and major enzymes that facilitate this process are well defined, global analysis of RNA turnover remains under-investigated. Recent advances in next-generation sequencing technology make it possible to examine various RNA decay patterns at the genome-wide scale. In this study, we investigated human RNA decay patterns using parallel analysis of RNA end-sequencing (PARE-seq) data from XRN1-knockdown HeLa cell lines, followed by a comparison of steady state and degraded mRNA levels from RNA-seq and PARE-seq data, respectively. The results revealed 1103 and 1347 transcripts classified as stable and unstable candidates, respectively. Of the unstable candidates, we found that a subset of the replication-dependent histone transcripts was polyadenylated and rapidly degraded. Additionally, we identified 380 endonucleolytically cleaved candidates by analyzing the most abundant PARE sequence on a transcript. Of these, 41.4% of genes were classified as unstable genes, which implied that their endonucleolytic cleavage might affect their mRNA stability. Furthermore, we identified 1877 decapped candidates, including HSP90B1 and SWI5, having the most abundant PARE sequences at the 5'-end positions of the transcripts. These results provide a useful resource for further analysis of RNA decay patterns in human cells.

Subject(s)

Gene Expression Regulation/genetics; RNA Stability/physiology; Base Sequence/genetics; Databases, Genetic; Genome/genetics; HeLa Cells; High-Throughput Nucleotide Sequencing/methods; Histones/metabolism; Humans; RNA, Messenger/genetics; Sequence Analysis, RNA/methods; Whole Genome Sequencing/methods

Draft Genome of Toxocara canis, a Pathogen Responsible for Visceral Larva Migrans.

Kong, Jinhwa; Won, Jungim; Yoon, Jeehee; Lee, UnJoo; Kim, Jong-Il; Huh, Sun.

Korean J Parasitol ; 54(6): 751-758, 2016 Dec.

Article in English | MEDLINE | ID: mdl-28095660

ABSTRACT

This study aimed to construct a draft genome of the adult female worm Toxocara canis using next-generation sequencing (NGS) and de novo assembly, and to find new genes after annotation using functional genomics tools. Using an NGS machine, we produced DNA read data of T. canis. The de novo assembly of the read data was performed using SOAPdenovo. RNA read data were assembled using Trinity. Structural annotation, homology search, functional annotation, classification of protein domains, and KEGG pathway analysis were carried out. In addition, recently developed tools such as MAKER, PASA, Evidence Modeler, and Blast2GO were used. The scaffold DNA was obtained; its N50 was 108,950 bp, and its overall length was 341,776,187 bp. The N50 of the transcriptome was 940 bp, and its length was 53,046,952 bp. The GC content of the entire genome was 39.3%. The total number of genes was 20,178, and the total number of protein sequences was 22,358. Of the 22,358 protein sequences, 4,992 were newly observed in T. canis. The following previously unknown proteins were found: E3 ubiquitin-protein ligase cbl-b and antigen T-cell receptor, zeta chain for T-cell and B-cell regulation; endoprotease bli-4 for cuticle metabolism; mucin 12Ea and polymorphic mucin variant C6/1/40r2.1 for mucin production; tropomodulin-family protein and ryanodine receptor calcium release channels for muscle movement. We were able to find new hypothetical polypeptide sequences unique to T. canis, and the findings of this study can serve as a basis for extending our biological understanding of T. canis.

Subject(s)

Genome, Helminth; Toxocara canis/genetics; Animals; Base Composition; Computational Biology; DNA, Helminth/chemistry; DNA, Helminth/genetics; Female; Genes, Helminth; Helminth Proteins/genetics; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Sequence Analysis, DNA; Toxocara canis/isolation & purification
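The Toxocara canis record above reports N50 values for the scaffolds and the transcriptome. As a reading aid, here is a minimal sketch of how N50 is commonly computed from a list of sequence lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the paper or from SOAPdenovo's output.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths (not data from the study).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

With these toy lengths the total is 290, so the running sum first reaches half (145) at the second-longest scaffold.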

RelCurator: a text mining-based curation system for extracting gene-phenotype relationships specific to neurodegenerative disorders.

Lee, Heonwoo; Jeon, Junbeom; Jung, Dawoon; Won, Jung-Im; Kim, Kiyong; Kim, Yun Joong; Yoon, Jeehee.

Genes Genomics ; 45(8): 1025-1036, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37300788

ABSTRACT

BACKGROUND: The identification of gene-phenotype relationships is important in medical genetics as it serves as a basis for precision medicine. However, most of the gene-phenotype relationship data are buried in the biomedical literature in textual form. OBJECTIVE: We propose RelCurator, a curation system that extracts sentences including both gene and phenotype entities related to specific disease categories from PubMed articles, provides rich additional information such as entity taggings, and predictions of gene-phenotype relationships. METHODS: We targeted neurodegenerative disorders and developed a deep learning model using Bidirectional Gated Recurrent Unit (BiGRU) networks and BioWordVec word embeddings for predicting gene-phenotype relationships from biomedical texts. The prediction model is trained with more than 130,000 labeled PubMed sentences including gene and phenotype entities, which are related to or unrelated to neurodegenerative disorders. RESULTS: We compared the performance of our deep learning model with those of Bidirectional Encoder Representations from Transformers (BERT), Support Vector Machine (SVM), and simple Recurrent Neural Network (simple RNN) models. Our model performed better with an F1-score of 0.96. Furthermore, the evaluation done using a few curation cases in the real scenario showed the effectiveness of our work. Therefore, we conclude that RelCurator can identify not only new causative genes, but also new genes associated with neurodegenerative disorders' phenotype. CONCLUSION: RelCurator is a user-friendly method for accessing deep learning-based supporting information and a concise web interface to assist curators while browsing the PubMed articles. Our curation process represents an important and broadly applicable improvement to the state of the art for the curation of gene-phenotype relationships.

Subject(s)

Data Mining; Neurodegenerative Diseases; Humans; Data Mining/methods; Neural Networks, Computer; Neurodegenerative Diseases/genetics
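The RelCurator record above compares models by F1-score. For readers unfamiliar with the metric, the sketch below shows the standard definition as the harmonic mean of precision and recall. The function name `f1_score` and the counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Compute F1 from raw counts of a binary classifier's predictions."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (not from the study): 8 correct relationship
# predictions, 2 spurious ones, and 2 missed ones.
print(round(f1_score(8, 2, 2), 2))
```

Because F1 ignores true negatives, it is a common choice when the positive class (here, sentences that do express a gene-phenotype relationship) is the one of interest.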

GAAP: A Genome Assembly + Annotation Pipeline.

Kong, Jinhwa; Huh, Sun; Won, Jung-Im; Yoon, Jeehee; Kim, Baeksop; Kim, Kiyong.

Biomed Res Int ; 2019: 4767354, 2019.

Article in English | MEDLINE | ID: mdl-31346518

ABSTRACT

Genomic analysis begins with de novo assembly of short-read fragments in order to reconstruct full-length base sequences without exploiting a reference genome sequence. Then, in the annotation step, gene locations are identified within the base sequences, and the structures and functions of these genes are determined. Recently, a wide range of powerful tools have been developed and published for whole-genome analysis, enabling even individual researchers in small laboratories to perform whole-genome analyses on their objects of interest. However, these analytical tools are generally complex and use diverse algorithms, parameter setting methods, and input formats; thus, it remains difficult for individual researchers to select, utilize, and combine these tools to obtain their final results. To resolve these issues, we have developed a genome analysis pipeline (GAAP) for semiautomated, iterative, and high-throughput analysis of whole-genome data. This pipeline is designed to perform read correction, de novo genome (transcriptome) assembly, gene prediction, and functional annotation using a range of proven tools and databases. We aim to assist non-IT researchers by describing each stage of analysis in detail and discussing current approaches. We also provide practical advice on how to access and use the bioinformatics tools and databases and how to implement the provided suggestions. Whole-genome analysis of Toxocara canis is used as a case study to show intermediate results at each stage, demonstrating the practicality of the proposed method.

Subject(s)

Databases, Nucleic Acid; Genome, Helminth; Molecular Sequence Annotation; Toxocara canis/genetics; Whole Genome Sequencing; Animals; Genomics

ExCNVSS: A Noise-Robust Method for Copy Number Variation Detection in Whole Exome Sequencing Data.

Kong, Jinhwa; Shin, Jaemoon; Won, Jungim; Lee, Keonbae; Lee, Unjoo; Yoon, Jeehee.

Biomed Res Int ; 2017: 9631282, 2017.

Article in English | MEDLINE | ID: mdl-28698882

ABSTRACT

Copy number variations (CNVs) are structural variants associated with human diseases. Recent studies verified that disease-related genes are based on the extraction of rare de novo and transmitted CNVs from exome sequencing data. The need for more efficient and accurate methods has increased, which still remains a challenging problem due to coverage biases, as well as the sparse, small-sized, and noncontinuous nature of exome sequencing. In this study, we developed a new CNV detection method, ExCNVSS, based on read coverage depth evaluation and scale-space filtering to resolve these problems. We also developed the method ExCNVSS_noRatio, which is a version of ExCNVSS, for applying to cases with an input of test data only without the need to consider the availability of a matched control. To evaluate the performance of our method, we tested it with 11 different simulated data sets and 10 real HapMap samples' data. The results demonstrated that ExCNVSS outperformed three other state-of-the-art methods and that our method corrected for coverage biases and detected all-sized CNVs even without matched control data.

Subject(s)

DNA Copy Number Variations; Exome; High-Throughput Nucleotide Sequencing; Models, Genetic
