1.
BMC Bioinformatics ; 25(1): 241, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39014300

ABSTRACT

BACKGROUND: Using next-generation sequencing technologies, scientists can sequence complex microbial communities directly from the environment. Significant insights into the structure, diversity, and ecology of microbial communities have resulted from the study of metagenomics. The assembly of reads into longer contigs, which are then binned into groups of contigs that correspond to different species in the metagenomic sample, is a crucial step in metagenomic analysis. It is necessary to organize these contigs into operational taxonomic units (OTUs) for further taxonomic profiling and functional analysis. For binning, which is synonymous with the clustering of OTUs, the tetra-nucleotide frequency (TNF) is typically used as a compositional feature for each OTU. RESULTS: In this paper, we present AFIT, a new l-mer statistic vector for each contig, and AFITBin, a novel method for metagenomic binning based on AFIT and a matrix factorization method. To evaluate the performance of the AFIT vector, the t-SNE algorithm is used to compare species clustering based on AFIT and TNF information. In addition, the efficacy of AFITBin is demonstrated on both simulated and real datasets in comparison to state-of-the-art binning methods such as MetaBAT 2, MaxBin 2.0, CONCOCT, MetaCon, SolidBin, BusyBee Web, and MetaBinner. To further analyze the performance of the proposed AFIT vector, we compare the barcodes of the AFIT vector and the TNF vector. CONCLUSION: The results demonstrate that AFITBin shows superior performance in taxonomic identification compared to existing methods, leveraging the AFIT vector for improved results in metagenomic binning. This approach holds promise for advancing the analysis of metagenomic data, providing more reliable insights into microbial community composition and function. AVAILABILITY: A Python package is available at: https://github.com/SayehSobhani/AFITBin.


Subject(s)
Algorithms , Metagenomics , Metagenomics/methods , Nucleotides/genetics , High-Throughput Nucleotide Sequencing/methods , Software , Microbiota/genetics , Sequence Analysis, DNA/methods , Cluster Analysis , Contig Mapping/methods , Metagenome/genetics
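
The abstract above names the tetra-nucleotide frequency (TNF) as the usual compositional feature for binning. As a rough illustration only, the sketch below computes a plain 256-dimensional TNF vector for one contig. The function name `kmer_frequencies` and the example sequence are hypothetical, and this is not the AFIT statistic or the AFITBin implementation, neither of which is described in enough detail here to reproduce.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    Components follow the lexicographic order of all k-mers over ACGT,
    so k=4 gives a 256-dimensional tetra-nucleotide frequency vector.
    (Hypothetical helper for illustration; not taken from AFITBin.)
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. N) are simply ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Toy contig; real contigs are typically thousands of bases long.
tnf = kmer_frequencies("ACGTACGTAC")
```

Contigs from the same genome tend to have similar vectors of this kind, which is why such vectors can be clustered into bins. Practical tools often add refinements, such as merging reverse-complement k-mers, that this sketch leaves out.
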
2.
Genomics ; 110(5): 263-273, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29180261

ABSTRACT

Several proteins and genes are members of families that share a common evolutionary history. To outline evolutionary relationships and recognize conserved patterns, sequence comparison is emerging as an important process. This work critically investigates the role of k-mers in the composition vector method for comparing genome sequences. Composition vector methods are generally applied with different choices of k; results are satisfactory for some values of k and unsatisfactory for others. In the proposed work, the standard composition vector method is carried out using a string length of 3 (3-mers). In addition, a special type of information-based similarity index is used as a distance measure. The work establishes that using 3-mers together with the information-based similarity index provides satisfactory results in all cases, especially for the comparison of whole genome sequences. These selections provide a sort of unified approach to the comparison of genome sequences.


Subject(s)
Algorithms , Genomics/methods , Sequence Alignment/methods , Animals , Humans , Sequence Alignment/standards
3.
Toxins (Basel) ; 15(6), 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37368694

ABSTRACT

An automated method was developed for differentiating closely related B. cereus sensu lato (s.l.) species, especially the biopesticide Bacillus thuringiensis, from other human pathogens, B. anthracis and B. cereus sensu stricto (s.s.). In the current research, four typing methods were initially compared, namely multi-locus sequence typing (MLST), single-copy core genes phylogenetic analysis (SCCGPA), dispensable genes content pattern analysis (DGCPA) and composition vector tree (CVTree), to analyze the genomic variability of 23 B. thuringiensis strains from the aizawai, kurstaki, israelensis, thuringiensis and morrisoni serovars. The CVTree method was the best option for typing B. thuringiensis strains, since it proved to be the fastest method while giving high-resolution data about the strains. In addition, CVTree agrees well with the ANI-based method, revealing the relationship between B. thuringiensis and other B. cereus s.l. species. Based on these data, an online genome sequence comparison resource for Bacillus strains, called the Bacillus Typing Bioinformatics Database, was built to facilitate strain identification and characterization.


Subject(s)
Bacillus anthracis , Bacillus thuringiensis , Bacillus , Humans , Bacillus cereus/genetics , Multilocus Sequence Typing , Phylogeny , Bacillus/genetics , Bacillus thuringiensis/genetics
4.
Genomics Proteomics Bioinformatics ; 19(4): 662-667, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34119695

ABSTRACT

Composition Vector Tree (CVTree) is an alignment-free algorithm to infer phylogenetic relationships from genome sequences. It has been successfully applied to study the phylogeny and taxonomy of viruses, prokaryotes, and fungi based on whole genomes, as well as chloroplast genomes, mitochondrial genomes, and metagenomes. Here we present standalone software for the CVTree algorithm. In the software, an extensible parallel workflow for the CVTree algorithm was designed. Based on this workflow, new alignment-free methods were also implemented. By examining the phylogeny and taxonomy of 13,903 prokaryotes based on 16S rRNA sequences, we show that the CVTree software is an efficient and effective tool for studying phylogeny and taxonomy based on genome sequences. The code of the CVTree software is available at https://github.com/ghzuo/cvtree.


Subject(s)
Genome , Software , Algorithms , Phylogeny , RNA, Ribosomal, 16S/genetics
5.
Int J Mol Sci ; 11(3): 1141-54, 2010 Mar 18.
Article in English | MEDLINE | ID: mdl-20480005

ABSTRACT

A shortcoming of most alignment-free correlation distance methods based on composition vectors, developed for phylogenetic analysis using complete genomes, is that the "distances" are not proper distance metrics in the strict mathematical sense. In this paper we propose two new correlation-related distance metrics to replace the old one in our dynamical language approach. Four genome datasets are employed to evaluate the effects of this replacement from a biological point of view. We find that the two proper distance metrics yield trees with topologies that are the same as or similar to those obtained with the old "distance", and that agree with the tree of life based on 16S rRNA in a majority of the basic branches. Hence the two proper correlation-related distance metrics proposed here improve our dynamical language approach for phylogenetic analysis.


Subject(s)
Algorithms , Genomics/methods , Phylogeny , Sequence Alignment/methods , Animals , Genome, Bacterial , Genome, Plant
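
The abstract above turns on the difference between a correlation-based "distance" and a proper metric, but it does not give formulas. The sketch below is therefore an assumption-laden illustration: it takes the cosine-based dissimilarity (1 - C)/2, a form commonly used with composition vectors, shows on three toy vectors that it can violate the triangle inequality, and compares it with the normalized angular distance, which is a true metric. The angular distance is a stand-in chosen for illustration and is not necessarily either of the two metrics the paper proposes.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def correlation_dissimilarity(u, v):
    """(1 - C) / 2: lies in [0, 1] but is not a proper metric."""
    return (1.0 - cosine(u, v)) / 2.0


def angular_distance(u, v):
    """Angle between u and v divided by pi: satisfies the triangle inequality."""
    c = max(-1.0, min(1.0, cosine(u, v)))  # guard against rounding error
    return math.acos(c) / math.pi


# Toy vectors at 0, 45 and 90 degrees (hypothetical data).
a, b, c = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)

# Going "via b" is shorter than going directly from a to c: 0.5 > about 0.293.
violates = correlation_dissimilarity(a, c) > (
    correlation_dissimilarity(a, b) + correlation_dissimilarity(b, c)
)
```

A violation of this kind matters for tree building because distance-based methods implicitly assume that detours are never shorter than direct paths.
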
6.
Chin Sci Bull ; 55(22): 2323-2328, 2010.
Article in English | MEDLINE | ID: mdl-32214732

ABSTRACT

The newly proposed alignment-free and parameter-free composition vector (CVTree) method has been successfully applied to infer phylogenetic relationships of viruses, chloroplasts, bacteria, and fungi from their whole-genome data. In this study we pay special attention to the phylogenetic positions of 56 Archaea genomes, among which 7 species have not been listed either in Bergey's Manual of Systematic Bacteriology or in the Taxonomic Outline of Bacteria and Archaea (TOBA). By inspecting the stable monophyletic branchings in CVTrees reconstructed from a total of 861 genomes (56 Archaea plus 797 Bacteria, using 8 Eukarya as outgroups), definite taxonomic assignments were proposed for these not-fully-classified species. Further development of Archaea taxonomy may verify the predicted phylogenetic results of the CVTree approach.

7.
Adv Genet ; 100: 211-266, 2017.
Article in English | MEDLINE | ID: mdl-29153401

ABSTRACT

Fungi are possibly the most diverse eukaryotic kingdom, with over a million member species and an evolutionary history dating back a billion years. Fungi have been at the forefront of eukaryotic genomics, and owing to initiatives like the 1000 Fungal Genomes Project the amount of fungal genomic data has increased considerably over the last 5 years, enabling large-scale comparative genomics of species across the kingdom. In this chapter, we first review fungal evolution and the history of fungal genomics. We then review in detail seven phylogenomic methods and reconstruct the phylogeny of 84 fungal species from 8 phyla using each method. Six methods have seen extensive use in previous fungal studies, while a Bayesian supertree method is novel to fungal phylogenomics. We find that both established and novel phylogenomic methods can accurately reconstruct the fungal kingdom. Finally, we discuss the accuracy and suitability of each phylogenomic method utilized.


Subject(s)
Fungi/genetics , Genome, Fungal , Genomics , Phylogeny , Evolution, Molecular , Models, Genetic
8.
Synth Syst Biotechnol ; 2(3): 226-235, 2017 Sep.
Article in English | MEDLINE | ID: mdl-29318203

ABSTRACT

A monospecific genus is one that has contained a single species ever since it was proposed. Although more than half of the known prokaryotic genera are formally monospecific, we single out those that actually raise taxonomic problems by violating the monophyly of the taxon within which they reside. Taking monophyly as a guiding principle, our arguments are based on simultaneous support from 16S rRNA sequence analysis and whole-genome phylogeny of prokaryotes, as provided by the LVTree Viewer and the CVTree Web Server, respectively. The main purpose of this study is to call attention to this specific way of global taxonomic analysis. Therefore, we refrain from making formal emendations for the time being.

9.
Genomics Proteomics Bioinformatics ; 13(5): 321-31, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26563468

ABSTRACT

A faithful phylogeny and an objective taxonomy for prokaryotes should agree with each other and ultimately follow the genome data. With the number of sequenced genomes reaching tens of thousands, both tree inference and detailed comparison with taxonomy are great challenges. We now provide one solution in the latest Release 3.0 of the alignment-free and whole-genome-based web server CVTree3. The server resides in a cluster of 64 cores and is equipped with an interactive, collapsible, and expandable tree display. It is capable of comparing the tree branching order with prokaryotic classification at all taxonomic ranks from domains down to species and strains. CVTree3 allows for inquiry by taxon names and trial lineage modifications. In addition, it reports a summary of monophyletic and non-monophyletic taxa at all ranks and produces print-quality subtree figures. After giving an overview of retrospective verification of the CVTree approach, the power of the new server is described for the mega-classification of prokaryotes and determination of the taxonomic placement of some newly sequenced genomes. A few discrepancies between CVTree and 16S rRNA analyses are also summarized with regard to possible taxonomic revisions. CVTree3 is freely accessible to all users at http://tlife.fudan.edu.cn/cvtree3/ without login requirements.


Subject(s)
Archaea/classification , Bacteria/classification , Genome, Archaeal/genetics , Genome, Bacterial/genetics , Internet , Archaea/genetics , Bacteria/genetics , Phylogeny , RNA, Ribosomal, 16S/genetics , Retrospective Studies , Sequence Alignment/methods
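
The abstract above describes reporting which taxa are monophyletic in a tree. How CVTree3 does this internally is not stated, so the following is only a minimal sketch of the underlying idea: a taxon is monophyletic in a rooted tree if some clade contains exactly its members and nothing else. The nested-tuple tree encoding, the function names, and the strain labels are all hypothetical.

```python
def leaf_set(node):
    """Return the set of leaf labels under a node.

    A leaf is a string; an internal node is a tuple of child nodes.
    """
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the leaf set of every clade (subtree) in a rooted tree."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, members):
    """True if some clade consists of exactly the given members."""
    target = frozenset(members)
    return any(clade == target for clade in clades(tree))


# Hypothetical rooted tree with strains from three made-up genera A, B, C.
tree = ((("A1", "A2"), "B1"), ("B2", "C1"))

genus_a_ok = is_monophyletic(tree, {"A1", "A2"})  # A1 and A2 form a clade
genus_b_ok = is_monophyletic(tree, {"B1", "B2"})  # B1 and B2 are split apart
```

Running such a check for every taxon at every rank gives the kind of monophyletic versus non-monophyletic summary the abstract mentions. This sketch recomputes leaf sets repeatedly, which would be too slow for tens of thousands of genomes.
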
10.
Comput Biol Chem ; 53 Pt A: 166-73, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25205031

ABSTRACT

Using an enlarged alphabet of K-tuples is the way to carry out alignment-free comparison of genomes in the composition vector (CV) approach to prokaryotic phylogeny. We summarize the known aspects concerning the choice of K and examine the results of using CVs with subtraction of a statistical background for K=3-9 and using raw CVs without subtraction for K=1-12. The criterion for evaluation is direct comparison with taxonomy. For prokaryotes the best performance is obtained for K=5 and 6 with subtraction and for K=11, 12 or even more without subtraction. In general, CVs with subtraction are slightly better and less CPU-intensive, but CVs without subtraction may provide complementary information.


Subject(s)
Algorithms , Archaea/classification , Bacteria/classification , Genome, Archaeal , Genome, Bacterial , Phylogeny , Archaea/genetics , Archaeal Proteins/chemistry , Archaeal Proteins/genetics , Bacteria/genetics , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Peptides/chemistry , Peptides/genetics , Sequence Analysis, DNA , Sequence Analysis, Protein
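
The abstract above contrasts raw composition vectors with vectors from which a statistical background has been subtracted, without stating the formula. The sketch below assumes the Markov-type prediction usually associated with the CV approach: the expected frequency of a K-tuple is estimated from its two overlapping (K-1)-tuples and their shared (K-2)-tuple, and each component is the relative deviation of the observed frequency from that expectation. Treat the formula, the function names, and the toy sequence as assumptions for illustration.

```python
from collections import Counter


def kmer_probs(seq, k):
    """Observed frequency of each k-tuple among all overlapping windows."""
    n = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n))
    return {kmer: count / n for kmer, count in counts.items()}


def composition_vector(seq, k):
    """Background-subtracted composition vector (assumed formula).

    For a K-tuple a1..aK with observed frequency p, the predicted
    frequency is
        p0 = p(a1..a[K-1]) * p(a2..aK) / p(a2..a[K-1])
    and the component is (p - p0) / p0. Only K-tuples that actually
    occur in seq are returned; unseen K-tuples are omitted here.
    """
    if k < 3:
        raise ValueError("background subtraction needs k >= 3")
    p_k = kmer_probs(seq, k)
    p_k1 = kmer_probs(seq, k - 1)
    p_k2 = kmer_probs(seq, k - 2)
    vector = {}
    for kmer, p in p_k.items():
        # Every sub-tuple of an observed K-tuple is itself observed.
        p0 = p_k1[kmer[:-1]] * p_k1[kmer[1:]] / p_k2[kmer[1:-1]]
        vector[kmer] = (p - p0) / p0
    return vector


# Toy sequence over a two-letter alphabet (hypothetical data).
cv = composition_vector("ABABAB", 3)
```

The subtraction removes the part of a K-tuple's frequency that is already explained by shorter tuples, which fits the abstract's observation that a much smaller K suffices when the background is subtracted.
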