1.
Comput Struct Biotechnol J ; 21: 444-451, 2023.
Article in English | MEDLINE | ID: mdl-36618978

ABSTRACT

Constructing accurate microbial genome assemblies is necessary to understand genetic diversity in microbial genomes and its functional consequences. However, this remains a challenging task, especially when only short-read sequencing technologies are used. Here, we present a new read-clustering algorithm, called RBRC, for improving de novo microbial genome assembly by accurately estimating read proximity using multiple reference genomes. The performance of RBRC was confirmed by simulation-based evaluation in terms of assembly contiguity and the number of misassemblies, and RBRC was successfully applied to existing fungal and bacterial genomes, improving the quality of the assemblies without using additional sequencing data. RBRC is a useful read-clustering algorithm that can be used (i) for generating high-quality genome assemblies of microbial strains when genome assemblies of related strains are available, and (ii) for upgrading existing microbial genome assemblies when the generation of additional sequencing data, such as long reads, is difficult.

2.
BMC Bioinformatics ; 23(1): 383, 2022 Sep 19.
Article in English | MEDLINE | ID: mdl-36123620

ABSTRACT

BACKGROUND: DNA methylation is an important epigenetic modification that is known to regulate gene expression. Whole-genome bisulfite sequencing (WGBS) is a powerful method for studying cytosine methylation across a whole genome. However, it is difficult to obtain methylation profiles from raw WGBS reads, and doing so requires proficiency in all the types of bioinformatics tools used to study DNA methylation. In addition, recent end-to-end pipelines for DNA methylation analyses do not sufficiently address these difficulties. RESULTS: Here we present msPIPE, a pipeline for DNA methylation analyses with WGBS data that seamlessly connects all the required tasks, ranging from data pre-processing to multiple downstream DNA methylation analyses. msPIPE can generate various methylation profiles to analyze methylation patterns in a given sample, including statistical summaries and methylation levels. The methylation levels in the functional regions of a genome are also computed with proper annotation. The results of methylation profiling, hypomethylation analysis, and differential methylation analysis are plotted in publication-quality figures. msPIPE can be used easily and conveniently through a Docker image, which includes all dependent packages and software related to DNA methylation analyses. CONCLUSION: msPIPE is a new end-to-end pipeline designed for methylation calling, profiling, and various types of downstream DNA methylation analyses, leading to the creation of publication-quality figures. msPIPE allows researchers to process and analyze WGBS data in an easy and convenient way. It is available at https://github.com/jkimlab/msPIPE and https://hub.docker.com/r/jkimlab/mspipe.


Subject(s)
Cytosine; Sulfites; Sequence Analysis, DNA/methods; Sulfites/metabolism; Whole Genome Sequencing/methods
3.
Gigascience ; 11, 2022 05 17.
Article in English | MEDLINE | ID: mdl-35579554

ABSTRACT

BACKGROUND: Metagenomic assembly using high-throughput sequencing data is a powerful method to construct microbial genomes in environmental samples without cultivation. However, metagenomic assembly, especially when only short reads are available, is a complex and challenging task because mixed genomes of multiple microorganisms constitute the metagenome. Although long-read sequencing technologies have been developed and have begun to be used for metagenomic assembly, many metagenomic studies have been based on short reads because generating long reads costs more than generating short reads. RESULTS: In this study, we present a new method called PLR-GEN. It creates pseudo-long reads from metagenomic short reads based on given reference genome sequences by considering small sequence variations existing in individual genomes of the same or different species. When applied to a mock community data set in the Human Microbiome Project, PLR-GEN dramatically extended 101-bp short reads to pseudo-long reads with an N50 of 33 Kbp and a 0.4% error rate. The use of these pseudo-long reads generated by PLR-GEN resulted in a clear improvement in metagenomic assembly in terms of the number of sequences, assembly contiguity, and prediction of species and genes. CONCLUSIONS: PLR-GEN can be used to generate artificial long-read sequences without incurring extra sequencing cost, thus aiding various studies using metagenomes.
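The N50 figure quoted in this abstract is a standard contiguity statistic: the length L such that sequences of length L or longer contain at least half of all bases. The following is a minimal sketch of that general definition only; the function name `n50` is illustrative and is not taken from PLR-GEN.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running total reaches half is the N50.
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical read lengths, for illustration only.
    print(n50([2, 3, 4, 5, 6]))
```

A larger N50 indicates that more of the total sequence sits in long reads or contigs, which is why it is commonly reported alongside an error rate.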


Subject(s)
Metagenome; Microbiota; High-Throughput Nucleotide Sequencing/methods; Humans; Metagenomics/methods; Microbiota/genetics; Sequence Analysis, DNA/methods
4.
BMC Bioinformatics ; 21(1): 185, 2020 May 12.
Article in English | MEDLINE | ID: mdl-32397982

ABSTRACT

BACKGROUND: Microorganisms are important occupants of many different environments. Identifying the composition of microbes and estimating their abundance promotes understanding of microbial interactions in environmental samples. To understand their environments more deeply, the composition of microorganisms in environmental samples has been studied using metagenomes, which are the collections of genomes of the microorganisms. Although many tools based on different algorithms have been developed for taxonomy analysis, the variability of the outputs that existing tools produce from the same input metagenome datasets is the main obstacle for many researchers in this field. RESULTS: Here, we present a novel meta-analysis tool for metagenome taxonomy analysis, called TAMA, which intelligently integrates outputs from three different taxonomy analysis tools. Using an integrated reference database, TAMA performs taxonomy assignment for input metagenome reads based on a meta-score that integrates the taxonomy assignment scores from different taxonomy classification tools. TAMA outperformed existing tools when evaluated using various benchmark datasets. It was also successfully applied to obtain relative species abundance profiles and differences in the composition of microorganisms in two types of cheese metagenome and human gut metagenome. CONCLUSION: TAMA can be easily installed and used for metagenome read classification and the prediction of relative species abundance from multiple numbers and types of metagenome read samples. TAMA can be used to more accurately uncover the composition of microorganisms in metagenome samples collected from various environments, especially when the use of a single taxonomy analysis tool is unreliable. TAMA is an open source tool, and can be downloaded at https://github.com/jkimlab/TAMA.


Subject(s)
Bacteria/classification; Classification/methods; Metagenome; Metagenomics/methods; Bacteria/genetics; Databases, Genetic; Datasets as Topic; High-Throughput Nucleotide Sequencing; Models, Genetic; Phylogeny
5.
Curr Protoc Bioinformatics ; 68(1): e88, 2019 12.
Article in English | MEDLINE | ID: mdl-31751498

ABSTRACT

INTER-Species Protein Interaction Analysis (INTERSPIA) is a web application for identifying diverse patterns of protein-protein interactions (PPIs) in different species. Given a set of proteins of interest to the user, INTERSPIA first discovers additional proteins that are functionally associated with the input proteins as well as different or common patterns of PPIs among the proteins in multiple species through a server-side pipeline. Second, it visualizes the dynamics of PPIs in multiple species via an easy-to-use web interface. This article contains a basic protocol describing how to visualize diverse patterns of PPIs of input proteins in multiple species, and how to use them for functional analysis in the web interface. INTERSPIA is freely available at http://bioinfo.konkuk.ac.kr/INTERSPIA/. © 2019 by John Wiley & Sons, Inc. Basic Protocol: Running INTERSPIA using a list of input proteins.


Subject(s)
Protein Interaction Mapping/methods; Software; Animals; Databases, Protein; Humans; Internet; Species Specificity; User-Computer Interface
6.
BMC Bioinformatics ; 19(1): 216, 2018 06 05.
Article in English | MEDLINE | ID: mdl-29871588

ABSTRACT

BACKGROUND: Advances in sequencing technologies have facilitated large-scale comparative genomics based on whole genome sequencing. Constructing and investigating conserved genomic regions among multiple species (called synteny blocks) are essential in comparative genomics. However, they require significant amounts of computational resources and time in addition to bioinformatics skills. Many web interfaces have been developed to make such tasks easier. However, these web interfaces cannot be customized for users who want to use their own set of genome sequences or definition of synteny blocks. RESULTS: To resolve this limitation, we present mySyntenyPortal, a stand-alone application package to construct websites for synteny block analyses by using users' own genome data. mySyntenyPortal provides both command line and web-based interfaces to build and manage websites for large-scale comparative genomic analyses. The websites can also be easily published and accessed by other users. To demonstrate the usability of mySyntenyPortal, we present an example study for building websites to compare genomes of three mammalian species (human, mouse, and cow) and show how they can be easily utilized to identify potential genes affected by genome rearrangements. CONCLUSIONS: mySyntenyPortal will contribute to extended comparative genomic analyses based on large-scale whole genome sequences by providing unique functionality to support the easy creation of interactive websites for synteny block analyses from users' own genome data.


Subject(s)
Genomics/methods; Software; Synteny; Animals; Cattle; Female; Genome; Humans; Internet; Mice; Whole Genome Sequencing
7.
Nucleic Acids Res ; 46(W1): W89-W94, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29746660

ABSTRACT

Proteins perform biological functions through cascading interactions with each other by forming protein complexes. As a result, interactions among proteins, called protein-protein interactions (PPIs), are not completely free from selection constraints during evolution. Therefore, the identification and analysis of PPI changes during evolution can give us new insight into the evolution of functions. Although many algorithms, databases and websites have been developed to help the study of PPIs, most of them are limited to visualizing the structure and features of PPIs in a single chosen species, with limited visualization functions. This leads to difficulties in the identification of different patterns of PPIs in different species and their functional consequences. To resolve these issues, we developed a web application, called INTER-Species Protein Interaction Analysis (INTERSPIA). Given a set of proteins of interest to the user, INTERSPIA first discovers additional proteins that are functionally associated with the input proteins and searches for different patterns of PPIs in multiple species through a server-side pipeline, and second visualizes the dynamics of PPIs in multiple species using an easy-to-use web interface. INTERSPIA is freely available at http://bioinfo.konkuk.ac.kr/INTERSPIA/.


Subject(s)
Computational Biology; Internet; Protein Interaction Mapping/methods; Software; Algorithms; Databases, Protein; Proteins/chemistry; Proteins/genetics; User-Computer Interface
8.
Sci Rep ; 7(1): 17303, 2017 12 11.
Article in English | MEDLINE | ID: mdl-29230066

ABSTRACT

Rapid and cost-effective production of large-scale genome data through next-generation sequencing has enabled population-level studies of various organisms to identify their genotypic differences and phenotypic consequences. This approach is also used to study indigenous animals of historical and economic value, although they are less studied than model organisms. The objective of this study was to perform functional and evolutionary analysis of the Korean bob-tailed native dog Donggyeong, which has distinct tail and agility phenotypes, using whole-genome sequencing data together with population and comparative genomics approaches. Based on the uniqueness of non-synonymous single nucleotide polymorphisms obtained from next-generation sequencing data, Donggyeong dog-specific genes/proteins and their functions were identified by comparison with 12 other dog breeds and six other related species. These proteins were further divided into subpopulation-specific ones according to tail length, and protein interaction-level signatures were investigated. Finally, the trajectory of shaping protein interactions of subpopulation-specific proteins during evolution was uncovered. This study expands our knowledge of Korean native dogs. Our results also provide a good example of using whole-genome sequencing data for population-level analysis in closely related species.


Subject(s)
Biomarkers/metabolism; Evolution, Molecular; Genome; High-Throughput Nucleotide Sequencing/methods; Polymorphism, Single Nucleotide; Tail/physiology; Whole Genome Sequencing/methods; Animals; Dogs; Genotype; Phenotype; Phylogeny; Protein Interaction Maps; Tail/anatomy & histology
9.
Mol Phylogenet Evol ; 107: 90-102, 2017 02.
Article in English | MEDLINE | ID: mdl-27746318

ABSTRACT

Plectida is an important nematode order with species that occupy many different biological niches. The order includes free-living aquatic and soil-dwelling species, but its phylogenetic position has remained uncertain. We sequenced the complete mitochondrial genomes of two members of this order, Plectus acuminatus and Plectus aquatilis, and compared them with those of other major nematode clades. The genome size and base composition of these species are similar to those of other nematodes: 14,831 and 14,372 bp, respectively, with AT contents of 71.0% and 70.1%. Gene content was also similar to that of other nematodes, but the gene order and coding direction of Plectus mtDNAs were dissimilar from those of other chromadorean species. P. acuminatus and P. aquatilis are the first chromadorean species found to contain a gene inversion. We reconstructed mitochondrial genome phylogenetic trees using nucleotide and amino acid datasets from 87 nematodes that represent major nematode clades, including the Plectus sequences. Trees from phylogenetic analyses using maximum likelihood and Bayesian methods depicted Plectida as the sister group to other sequenced chromadorean nematodes. This finding is consistent with several phylogenetic results based on SSU rDNA, but disagrees with a classification based on morphology. Mitogenomes representing other basal chromadorean groups (Araeolaimida, Monhysterida, Desmodorida, Chromadorida) are needed to confirm their phylogenetic relationships.
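The AT contents reported in this abstract are base-composition percentages: the share of A and T bases in the full sequence. As a minimal sketch of that arithmetic, assuming the sequence is available as a plain string of nucleotide letters (the function name `at_content` and the example sequence are illustrative, not from the paper):

```python
def at_content(sequence):
    """Return the percentage of A and T bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Count adenine and thymine; every other symbol counts only toward length.
    at_bases = seq.count("A") + seq.count("T")
    return 100.0 * at_bases / len(seq)


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    print(at_content("AATTGCAT"))
```

This sketch divides by the full sequence length, so ambiguous symbols such as N lower the reported percentage; whether the authors excluded such positions is not stated in the abstract.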


Subject(s)
Genome, Mitochondrial; Nematoda/classification; Rhabditida/classification; Animals; Bayes Theorem; Biological Evolution; DNA/chemistry; DNA/isolation & purification; DNA/metabolism; DNA, Mitochondrial/chemistry; DNA, Mitochondrial/classification; DNA, Mitochondrial/genetics; Nematoda/genetics; Phylogeny; Rhabditida/genetics
10.
Nucleic Acids Res ; 44(W1): W35-40, 2016 Jul 08.
Article in English | MEDLINE | ID: mdl-27154270

ABSTRACT

Recent advances in next-generation sequencing technologies and genome assembly algorithms have enabled the accumulation of a huge volume of genome sequences from various species. This has provided new opportunities for large-scale comparative genomics studies. Identifying and utilizing synteny blocks, which are genomic regions conserved among multiple species, is key to understanding genomic architecture and the evolutionary history of genomes. However, the construction and visualization of such synteny blocks from multiple species are very challenging, especially for biologists who lack computational skills. Here, we present Synteny Portal, a versatile web-based application portal for constructing, visualizing and browsing synteny blocks. With Synteny Portal, users can easily (i) construct synteny blocks among multiple species by using prebuilt alignments in the UCSC genome browser database, (ii) visualize and download syntenic relationships as high-quality images, (iii) browse synteny blocks with genetic information and (iv) download the details of synteny blocks to be used as input for downstream synteny-based analyses, all in an intuitive and easy-to-use web-based interface. We believe that Synteny Portal will serve as a highly valuable tool that will enable biologists to easily perform comparative genomics studies by compensating for the limitations of existing tools. Synteny Portal is freely available at http://bioinfo.konkuk.ac.kr/synteny_portal.


Subject(s)
Chromosomes, Mammalian/chemistry; Genome; Synteny; User-Computer Interface; Algorithms; Animals; Base Sequence; Cattle; Chromosome Mapping; Computer Graphics; Humans; Internet; Mice
11.
Mol Biol Evol ; 32(11): 2803-17, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26337547

ABSTRACT

In humans, numerous genes encode neuropeptides that comprise a superfamily of more than 70 genes in approximately 30 families and act mainly through rhodopsin-like G protein-coupled receptors (GPCRs). Two rounds of whole-genome duplication (2R WGD) during early vertebrate evolution greatly contributed to proliferation within gene families; however, the mechanisms underlying the initial emergence and diversification of these gene families before 2R WGD are largely unknown. In this study, we analyzed 25 vertebrate rhodopsin-like neuropeptide GPCR families and their cognate peptides using phylogeny, synteny, and localization of these genes on reconstructed vertebrate ancestral chromosomes (VACs). Based on phylogeny, these GPCR families can be divided into five distinct clades, and members of each clade tend to be located on the same VACs. Similarly, their neuropeptide gene families also tend to reside on distinct VACs. Comparison of these GPCR genes with those of invertebrates including Drosophila melanogaster, Caenorhabditis elegans, Branchiostoma floridae, and Ciona intestinalis indicates that these GPCR families emerged through tandem local duplication during metazoan evolution prior to 2R WGD. Our study describes a presumptive evolutionary mechanism and development pathway of the vertebrate rhodopsin-like GPCR and cognate neuropeptide families from the urbilaterian ancestor to modern vertebrates.


Subject(s)
Evolution, Molecular; Receptors, G-Protein-Coupled/genetics; Animals; Conserved Sequence; Gene Duplication; Genome; Humans; Invertebrates; Neuropeptides/genetics; Phylogeny; Rhodopsin/genetics; Synteny; Vertebrates/genetics
12.
J Microbiol Methods ; 109: 180-7, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25572018

ABSTRACT

The study of environmental microbial communities, called metagenomics, has gained a lot of attention because of the recent advances in next-generation sequencing (NGS) technologies. Microbes play a critical role in changing their environments, and how they do so can be elucidated by investigating metagenomes. However, the complexity of metagenomes, such as the mixture of multiple microbes and differing species abundances, makes metagenome assembly tasks more challenging. In this paper, we developed a new metagenome assembly method that utilizes protein sequences in addition to the NGS read sequences. Our method (i) builds read clusters by using mapping information against available protein sequences, and (ii) creates contig sequences by finding consensus sequences through probabilistic choices from the read clusters. By using simulated NGS read sequences from real microbial genome sequences, we evaluated our method in comparison with four existing assembly programs. We found that our method could generate relatively long and accurate metagenome assemblies, indicating that the idea of using protein sequences as a guide for the assembly is promising.
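The abstract says only that contigs are built by "finding consensus sequences through probabilistic choices from the read clusters" and gives no further detail. One possible reading is sketched below, under clearly stated assumptions: the reads in a cluster are already aligned to equal length with `-` marking gaps, and each consensus base is sampled in proportion to its frequency in that column. The function name, the gap convention, and the sampling rule are all hypothetical and should not be taken as the authors' method.

```python
import random
from collections import Counter


def probabilistic_consensus(aligned_reads, rng=None):
    """Sample one consensus sequence from equal-length, gap-padded reads.

    Assumption (not from the paper): each column's base is drawn with
    probability proportional to how often it occurs in that column.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("all reads must have the same aligned length")
    rng = rng or random.Random()
    consensus = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != "-")
        if not counts:
            continue  # column contains only gaps, so it contributes nothing
        bases = sorted(counts)
        weights = [counts[base] for base in bases]
        consensus.append(rng.choices(bases, weights=weights, k=1)[0])
    return "".join(consensus)


if __name__ == "__main__":
    # Hypothetical aligned cluster; the third column is ambiguous (G vs T).
    cluster = ["ACGT", "ACGT", "ACTT"]
    print(probabilistic_consensus(cluster, rng=random.Random(0)))
```

Columns on which all reads agree always yield that base, so randomness only affects positions where the cluster disagrees; a deterministic majority vote would be the obvious alternative design.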


Subject(s)
Cluster Analysis; Metagenomics/methods; Proteins/genetics; Computational Biology/methods; Environmental Microbiology