Results 1 - 17 of 17
1.
Microbiome ; 7(1): 35, 2019 02 28.
Article in English | MEDLINE | ID: mdl-30819245

ABSTRACT

BACKGROUND: Microbial communities present in environmental waters constitute a reservoir for antibiotic-resistant pathogens that impact human health. For this reason, a diverse variety of water environments are being analyzed using metagenomics to uncover public health threats. However, the composition of these communities along the coastal environment of a whole city, where sewage and beach waters are mixed, is poorly understood. RESULTS: We shotgun-sequenced 20 coastal areas from the city of Montevideo (capital of Uruguay) including beach and sewage water samples to characterize bacterial communities and their virulence and antibiotic resistance repertories. As expected, we found that sewage and beach environments present significantly different bacterial communities. This baseline allowed us to detect a higher prevalence and a more diverse repertory of virulence and antibiotic-resistant genes in sewage samples. Many of these genes come from well-known enterobacteria and represent carbapenemases and extended-spectrum betalactamases reported in hospital infections in Montevideo. Additionally, we were able to genotype the presence of both globally disseminated pathogenic clones and emerging antibiotic-resistant bacteria in sewage waters. CONCLUSIONS: Our study is the first to use metagenomics to jointly analyze beaches and the sewage system from an entire city, allowing us to characterize antibiotic-resistant pathogens circulating in urban waters. The data generated in this initial study represent a baseline metagenomic exploration to guide future longitudinal (time-wise) studies, whose systematic implementation will provide useful epidemiological information to improve public health surveillance.


Subject(s)
Anti-Bacterial Agents/pharmacology , Bacteria/classification , Metagenomics/methods , Sewage/microbiology , Bacteria/genetics , Bacteria/isolation & purification , Bacteria/pathogenicity , Bacterial Proteins/genetics , Bathing Beaches , Cross-Sectional Studies , Drug Resistance, Bacterial , Humans , Sequence Analysis, DNA , Uruguay , Water Microbiology
2.
Article in English | MEDLINE | ID: mdl-30533746

ABSTRACT

Metagenomics is providing a broad overview of bacterial functional diversity; however, culturing and biobanking are still essential for microbiology. Here, we present the Bacterial Biobank of the Urban Environment (BBUE), a sizable culture collection for long-term storage and characterization of the microbiota associated with urban environments relevant for public health.

3.
Nucleic Acids Res ; 46(D1): D477-D485, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106550

ABSTRACT

The Orthologous Matrix (OMA) is a leading resource to relate genes across many species from all of life. In this update paper, we review the recent algorithmic improvements in the OMA pipeline, describe increases in species coverage (particularly in plants and early-branching eukaryotes) and introduce several new features in the OMA web browser. Notable improvements include: (i) a scalable, interactive viewer for hierarchical orthologous groups; (ii) protein domain annotations and domain-based links between orthologous groups; (iii) functionality to retrieve phylogenetic marker genes for a subset of species of interest; (iv) a new synteny dot plot viewer; and (v) an overhaul of the programmatic access (REST API and semantic web), which will facilitate incorporation of OMA analyses in computational pipelines and integration with other bioinformatic resources. OMA can be freely accessed at https://omabrowser.org.


Subject(s)
Biological Evolution , Databases, Genetic , Genome , Molecular Sequence Annotation , Proteins/genetics , Synteny , Algorithms , Animals , Archaea/classification , Archaea/genetics , Archaea/metabolism , Bacteria/classification , Bacteria/genetics , Bacteria/metabolism , Computational Biology/methods , Fungi/classification , Fungi/genetics , Fungi/metabolism , Gene Ontology , Humans , Internet , Phylogeny , Plants/classification , Plants/genetics , Plants/metabolism , Protein Domains , Proteins/chemistry , Proteins/metabolism , Web Browser
4.
Bioinformatics ; 33(14): i75-i82, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28881964

ABSTRACT

MOTIVATION: Accurate orthology inference is a fundamental step in many phylogenetic and comparative analyses. Many methods have been proposed, including OMA (Orthologous MAtrix). Yet substantial challenges remain, in particular in coping with fragmented genes or genes evolving at different rates after duplication, and in scaling to large datasets. With more and more genomes available, it is necessary to improve the scalability and robustness of orthology inference methods. RESULTS: We present improvements in the OMA algorithm: (i) refining the pairwise orthology inference step to account for same-species paralogs evolving at different rates, and (ii) minimizing errors in the pairwise orthology verification step by testing the consistency of pairwise distance estimates, which can be problematic in the presence of fragmentary sequences. In addition, we introduce a more scalable procedure for hierarchical orthologous group (HOG) clustering, which is several orders of magnitude faster on large datasets. Using the Quest for Orthologs consortium orthology benchmark service, we show that these changes translate into substantial improvement on multiple empirical datasets. AVAILABILITY AND IMPLEMENTATION: This new OMA 2.0 algorithm is used in the OMA database (http://omabrowser.org) from the March 2017 release onwards, and can be run on custom genomes using OMA standalone version 2.0 and above (http://omabrowser.org/standalone). CONTACT: christophe.dessimoz@unil.ch or adrian.altenhoff@inf.ethz.ch.


Subject(s)
Evolution, Molecular , Genomics/methods , Mutation Rate , Phylogeny , Software , Algorithms , Animals , Humans , Mammals/genetics
5.
Nucleic Acids Res ; 43(Database issue): D240-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25399418

ABSTRACT

The Orthologous Matrix (OMA) project is a method and associated database inferring evolutionary relationships amongst currently 1706 complete proteomes (i.e. the protein sequences associated with every protein-coding gene in all genomes). In this update article, we present six major new developments in OMA: (i) a new web interface; (ii) Gene Ontology function predictions as part of the OMA pipeline; (iii) better support for plant genomes and in particular homeologs in the wheat genome; (iv) a new synteny viewer providing the genomic context of orthologs; (v) statically computed hierarchical orthologous groups subsets downloadable in OrthoXML format; and (vi) the possibility to export parts of the all-against-all computations and to combine them with custom data for 'client-side' orthology prediction. OMA can be accessed through the OMA Browser and various programmatic interfaces at http://omabrowser.org.


Subject(s)
Databases, Protein , Plant Proteins/genetics , Proteome/chemistry , Sequence Homology, Amino Acid , Algorithms , Gene Ontology , Genome, Plant , Humans , Internet , Plant Proteins/chemistry , Proteome/genetics , Synteny , Triticum/genetics
6.
PLoS One ; 8(2): e56925, 2013.
Article in English | MEDLINE | ID: mdl-23451112

ABSTRACT

The identification of orthologous genes, a prerequisite for numerous analyses in comparative and functional genomics, is commonly performed computationally from protein sequences. Several previous studies have compared the accuracy of orthology inference methods, but simulated data has not typically been considered in cross-method assessment studies. Yet, while dependent on model assumptions, simulation-based benchmarking offers unique advantages: contrary to empirical data, all aspects of simulated data are known with certainty. Furthermore, the flexibility of simulation makes it possible to investigate performance factors in isolation of one another. Here, we use simulated data to dissect the performance of six methods for orthology inference available as standalone software packages (Inparanoid, OMA, OrthoInspector, OrthoMCL, QuartetS, SPIMAP) as well as two generic approaches (bidirectional best hit and reciprocal smallest distance). We investigate the impact of various evolutionary forces (gene duplication, insertion, deletion, and lateral gene transfer) and technological artefacts (ambiguous sequences) on orthology inference. We show that while gene duplication/loss and insertion/deletion are well handled by most methods (albeit for different trade-offs of precision and recall), lateral gene transfer disrupts all methods. As for ambiguous sequences, which might result from poor sequencing, assembly, or genome annotation, we show that they affect alignment score-based orthology methods more strongly than their distance-based counterparts.
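
The precision/recall trade-off mentioned in this abstract can be made concrete with a small sketch. With simulated data the true orthologous pairs are known, so predicted pairs can be scored directly. The gene names and the `precision_recall` helper below are hypothetical and are not taken from the paper's benchmarking code.

```python
def precision_recall(predicted, true):
    """Score predicted orthologous pairs against a known set of true pairs.

    Pairs are treated as unordered, so ("g1", "h1") equals ("h1", "g1").
    """
    predicted = {frozenset(p) for p in predicted}
    true = {frozenset(p) for p in true}
    true_positives = len(predicted & true)
    # Precision: fraction of predicted pairs that are real orthologs.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of real orthologs that were predicted.
    recall = true_positives / len(true) if true else 0.0
    return precision, recall


# Toy data (assumed for illustration): four true pairs, three predictions.
true_pairs = [("g1", "h1"), ("g2", "h2"), ("g3", "h3"), ("g4", "h4")]
predicted_pairs = [("h1", "g1"), ("g2", "h2"), ("g3", "h4")]

precision, recall = precision_recall(predicted_pairs, true_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the three predictions are correct and two of the four true pairs are recovered, so a method can trade one measure against the other by predicting more or fewer pairs.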


Subject(s)
Gene Duplication/genetics , Gene Transfer, Horizontal/genetics , Mutagenesis, Insertional/genetics , Genomics/methods
7.
PLoS One ; 8(1): e53786, 2013.
Article in English | MEDLINE | ID: mdl-23342000

ABSTRACT

Hierarchical orthologous groups are defined as sets of genes that have descended from a single common ancestor within a taxonomic range of interest. Identifying such groups is useful in a wide range of contexts, including inference of gene function, study of gene evolution dynamics and comparative genomics. Hierarchical orthologous groups can be derived from reconciled gene/species trees but, this being a computationally costly procedure, many phylogenomic databases work on the basis of pairwise gene comparisons instead ("graph-based" approach). To our knowledge, there is only one published algorithm for graph-based hierarchical group inference, but both its theoretical justification and performance in practice are as of yet largely uncharacterised. We establish a formal correspondence between the orthology graph and hierarchical orthologous groups. Based on that, we devise GETHOGs ("Graph-based Efficient Technique for Hierarchical Orthologous Groups"), a novel algorithm to infer hierarchical groups directly from the orthology graph, thus without needing gene tree inference nor gene/species tree reconciliation. GETHOGs is shown to correctly reconstruct hierarchical orthologous groups when applied to perfect input, and several extensions with stringency parameters are provided to deal with imperfect input data. We demonstrate its competitiveness using both simulated and empirical data. GETHOGs is implemented as a part of the freely-available OMA standalone package (http://omabrowser.org/standalone). Furthermore, hierarchical groups inferred by GETHOGs ("OMA HOGs") on >1,000 genomes can be interactively queried via the OMA browser (http://omabrowser.org).
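
To illustrate the idea of reading hierarchical groups off an orthology graph, the sketch below takes connected components of the graph restricted to the genes of one taxonomic range. This is a deliberately simplified illustration under the assumption of error-free pairwise orthology calls; it is not the GETHOGs algorithm, and the gene names, species assignments, and the `connected_components` helper are all hypothetical.

```python
def connected_components(genes, edges):
    """Group genes into connected components of the orthology graph.

    Edges touching a gene outside `genes` are ignored, which restricts
    the graph to the taxonomic range of interest.
    """
    adjacency = {g: set() for g in genes}
    for a, b in edges:
        if a in adjacency and b in adjacency:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first traversal of one component
            gene = stack.pop()
            if gene in group:
                continue
            group.add(gene)
            stack.extend(adjacency[gene] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Toy scenario (assumed): a gene duplicated in the mammalian ancestor,
# after the split from zebrafish. hs1/mm1 and hs2/mm2 are orthologs,
# and the single zebrafish gene dr1 is orthologous to all four.
species_of = {"hs1": "human", "hs2": "human",
              "mm1": "mouse", "mm2": "mouse", "dr1": "zebrafish"}
orthology_edges = [("hs1", "mm1"), ("hs2", "mm2"),
                   ("hs1", "dr1"), ("hs2", "dr1"),
                   ("mm1", "dr1"), ("mm2", "dr1")]

mammal_genes = [g for g, s in species_of.items() if s in {"human", "mouse"}]
mammal_groups = connected_components(mammal_genes, orthology_edges)
vertebrate_groups = connected_components(species_of, orthology_edges)
print(mammal_groups)
print(vertebrate_groups)
```

At the mammalian level the two duplicates fall into separate groups, while at the vertebrate level all five genes descend from one ancestral gene and form a single group. Real input contains spurious and missing edges, which is why the abstract describes extensions with stringency parameters.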


Subject(s)
Algorithms , Genomics/methods , Sequence Homology, Nucleic Acid , Databases, Genetic , Phylogeny
8.
BMC Bioinformatics ; 13: 148, 2012 Jun 27.
Article in English | MEDLINE | ID: mdl-22738078

ABSTRACT

BACKGROUND: We analyze phylogenetic tree building methods from molecular sequences (PTMS). These are methods which base their construction solely on sequences, coding DNA or amino acids. RESULTS: Our first result is a statistically significant evaluation of 176 PTMSs done by comparing trees derived from 193138 orthologous groups of proteins using a new measure of quality between trees. This new measure, called the Intra measure, is very consistent between different groups of species and strong in the sense that it separates the methods with high confidence. The second result is the comparison of the trees against trees derived from accepted taxonomies, the Taxon measure. We consider the NCBI taxonomic classification and their derived topologies as the most accepted biological consensus on phylogenies, which are also available in electronic form. The correlation between the two measures is remarkably high, which supports both measures simultaneously. CONCLUSIONS: The big surprise of the evaluation is that the maximum likelihood methods do not score well; minimal evolution distance methods over MSA-induced alignments score consistently better. This comparison also allows us to rank different components of the tree building methods, like MSAs, substitution matrices, ML tree builders, distance methods, etc. It is also clear that there is a difference between Metazoa and the rest, which points to evolution leaving different molecular traces. We also think that these measures of quality of trees will motivate the design of new PTMSs as it is now easier to evaluate them with certainty.


Subject(s)
Phylogeny , Sequence Analysis, DNA , Sequence Analysis, Protein , Evolution, Molecular , Likelihood Functions , Sequence Alignment/methods
9.
Mol Biol Evol ; 29(4): 1115-23, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22160766

ABSTRACT

In computational evolutionary biology, verification and benchmarking is a challenging task because the evolutionary history of studied biological entities is usually not known. Computer programs for simulating sequence evolution in silico have been shown to be viable test beds for the verification of newly developed methods and for comparing different algorithms. However, current simulation packages tend to focus either on gene-level aspects of genome evolution such as character substitutions and insertions and deletions (indels) or on genome-level aspects such as genome rearrangement and speciation events. Here, we introduce Artificial Life Framework (ALF), which aims at simulating the entire range of evolutionary forces that act on genomes: nucleotide, codon, or amino acid substitution (under simple or mixture models), indels, GC-content amelioration, gene duplication, gene loss, gene fusion, gene fission, genome rearrangement, lateral gene transfer (LGT), or speciation. The other distinctive feature of ALF is its user-friendly yet powerful web interface. We illustrate the utility of ALF with two possible applications: 1) we reanalyze data from a study of selection after globin gene duplication and test the statistical significance of the original conclusions and 2) we demonstrate that LGT can dramatically decrease the accuracy of two well-established orthology inference methods. ALF is available as a stand-alone application or via a web interface at http://www.cbrg.ethz.ch/alf.


Subject(s)
Computational Biology/methods , Evolution, Molecular , Genome , Models, Genetic , Software , Algorithms , Base Composition , Computer Simulation , Gene Transfer, Horizontal , Genetic Speciation , Mutagenesis
10.
Nucleic Acids Res ; 39(Database issue): D289-94, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21113020

ABSTRACT

OMA (Orthologous MAtrix) is a database that identifies orthologs among publicly available, complete genomes. Initiated in 2004, the project is at its 11th release. It now includes 1000 genomes, making it one of the largest resources of its kind. Here, we describe recent developments in terms of species covered; the algorithmic pipeline--in particular regarding the treatment of alternative splicing, and new features of the web (OMA Browser) and programming interface (SOAP API). In the second part, we review the various representations provided by OMA and their typical applications. The database is publicly accessible at http://omabrowser.org.


Subject(s)
Databases, Genetic , Genome , Algorithms , Alternative Splicing , Evolution, Molecular , Genes , Phylogeny , User-Computer Interface
11.
Genome Biol Evol ; 1: 114-8, 2009 Jun 05.
Article in English | MEDLINE | ID: mdl-20333182

ABSTRACT

Published estimates of the proportion of positively selected genes (PSGs) in human vary over three orders of magnitude. In mammals, estimates of the proportion of PSGs cover an even wider range of values. We used 2,980 orthologous protein-coding genes from human, chimpanzee, macaque, dog, cow, rat, and mouse as well as an established phylogenetic topology to infer the fraction of PSGs in all seven terminal branches. The inferred fraction of PSGs ranged from 0.9% in human through 17.5% in macaque to 23.3% in dog. We found three factors that influence the fraction of genes that exhibit telltale signs of positive selection: the quality of the sequence, the degree of misannotation, and ambiguities in the multiple sequence alignment. The inferred fraction of PSGs in sequences that are deficient in all three criteria of coverage, annotation, and alignment is 7.2 times higher than that in genes with high trace sequencing coverage, "known" annotation status, and perfect alignment scores. We conclude that some estimates on the prevalence of positive Darwinian selection in the literature may be inflated and should be treated with caution.

12.
BMC Bioinformatics ; 9: 518, 2008 Dec 04.
Article in English | MEDLINE | ID: mdl-19055798

ABSTRACT

BACKGROUND: OMA is a project that aims to identify orthologs within publicly available, complete genomes. With 657 genomes analyzed to date, OMA is one of the largest projects of its kind. RESULTS: The algorithm of OMA improves upon the standard bidirectional best-hit approach in several respects: it uses evolutionary distances instead of scores, considers distance inference uncertainty, includes many-to-many orthologous relations, and accounts for differential gene losses. Herein, we describe in detail the algorithm for inference of orthology and provide the rationale for parameter selection through multiple tests. CONCLUSION: OMA contains several novel improvement ideas for orthology inference and provides a unique dataset of large-scale orthology assignments.
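
For reference, the baseline that this abstract compares against, the bidirectional best-hit approach, can be sketched as follows. This is a minimal illustration of the baseline only, not of the OMA algorithm; the gene names, the alignment scores, and the `bidirectional_best_hits` helper are hypothetical.

```python
def bidirectional_best_hits(scores):
    """Return pairs (a, b) where a and b are each other's best hit.

    `scores` maps (gene_in_genome_A, gene_in_genome_B) to an alignment
    score, where higher means more similar (toy input, assumed).
    """
    best_for_a = {}  # best-scoring B gene for each A gene
    best_for_b = {}  # best-scoring A gene for each B gene
    for (a, b), score in scores.items():
        if a not in best_for_a or score > scores[(a, best_for_a[a])]:
            best_for_a[a] = b
        if b not in best_for_b or score > scores[(best_for_b[b], b)]:
            best_for_b[b] = a
    # Keep only the pairs whose best hits agree in both directions.
    return sorted((a, b) for a, b in best_for_a.items() if best_for_b[b] == a)


toy_scores = {
    ("a1", "b1"): 950, ("a1", "b2"): 400,
    ("a2", "b1"): 700, ("a2", "b2"): 650,
    ("a3", "b2"): 640,
}
pairs = bidirectional_best_hits(toy_scores)
print(pairs)
```

Only `a1` and `b1` are paired here. `a2`, `a3` and `b2` are left unassigned even though their scores are close, which shows why a strict one-to-one best-hit rule misses one-to-many and many-to-many relations and why ignoring uncertainty in near-ties can matter.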


Subject(s)
Algorithms , Genomics , Computational Biology/methods , Evolution, Molecular , Sequence Alignment
13.
Bioinformatics ; 23(16): 2180-2, 2007 Aug 15.
Article in English | MEDLINE | ID: mdl-17545180

ABSTRACT

MOTIVATION: Inference of the evolutionary relation between proteins, in particular the identification of orthologs, is a central problem in comparative genomics. Several large-scale efforts with various methodologies and scope tackle this problem, including OMA (the Orthologous MAtrix project). RESULTS: Based on the results of the OMA project, we introduce here the OMA Browser, a web-based tool allowing the exploration of orthologous relations over 352 complete genomes. Orthologs can be viewed as groups across species, but also at the level of sequence pairs, allowing the distinction among one-to-one, one-to-many and many-to-many orthologs. AVAILABILITY: http://omabrowser.org.


Subject(s)
Chromosome Mapping/methods , Conserved Sequence/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Sequence Homology, Nucleic Acid , Software , Base Pair Mismatch/genetics , Base Sequence , Computer Graphics , Evolution, Molecular , Molecular Sequence Data
14.
BMC Bioinformatics ; 7: 529, 2006 Dec 05.
Article in English | MEDLINE | ID: mdl-17147817

ABSTRACT

BACKGROUND: The estimation of the difference between two evolutionary distances within a triplet of homologs is a common operation that is used for example to determine which of two sequences is closer to a third one. The most accurate method is currently maximum likelihood over the entire triplet. However, this approach is relatively time consuming. RESULTS: We show that an alternative estimator, based on pairwise estimates and therefore much faster to compute, has almost the same statistical power as the maximum likelihood estimator. We also provide a numerical approximation for its variance, which could otherwise only be estimated through an expensive re-sampling approach such as bootstrapping. An extensive simulation demonstrates that the approximation delivers precise confidence intervals. To illustrate the possible applications of these results, we show how they improve the detection of asymmetric evolution, and the identification of the closest relative to a given sequence in a group of homologs. CONCLUSION: The results presented in this paper constitute a basis for large-scale protein cross-comparisons of pairwise evolutionary distances.


Subject(s)
Algorithms , Chromosome Mapping/methods , Evolution, Molecular , Linkage Disequilibrium/genetics , Proteins/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Base Sequence , Molecular Sequence Data , Sequence Homology, Nucleic Acid
15.
Nucleic Acids Res ; 34(11): 3309-16, 2006.
Article in English | MEDLINE | ID: mdl-16835308

ABSTRACT

Correct orthology assignment is a critical prerequisite of numerous comparative genomics procedures, such as function prediction, construction of phylogenetic species trees and genome rearrangement analysis. We present an algorithm for the detection of non-orthologs that arise by mistake in current orthology classification methods based on genome-specific best hits, such as the COGs database. The algorithm works with pairwise distance estimates, rather than computationally expensive and error-prone tree-building methods. The accuracy of the algorithm is evaluated through verification of the distribution of predicted cases, case-by-case phylogenetic analysis and comparisons with predictions from other projects using independent methods. Our results show that a very significant fraction of the COG groups include non-orthologs: using conservative parameters, the algorithm detects non-orthology in a third of all COG groups. Consequently, sequence analysis sensitive to correct orthology assignments will greatly benefit from these findings.


Subject(s)
Algorithms , Databases, Protein , Genomics/methods , Evolution, Molecular , Phylogeny , Proteins/classification , Proteins/genetics , Sequence Alignment , Sequence Analysis, Protein
16.
J Bioinform Comput Biol ; 3(6): 1429-40, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16374915

ABSTRACT

We present a dimensionless fit index for phylogenetic trees that have been constructed from distance matrices. It is designed to measure the quality of the fit of the data to a tree in absolute terms, independent of linear transformations on the distance matrix. The index can be used as an absolute measure to evaluate how well a set of data fits to a tree, or as a relative measure to compare different methods that are expected to produce the same tree. The usefulness of the index is demonstrated in three examples.


Subject(s)
Algorithms , Biological Evolution , Evolution, Molecular , Models, Genetic , Models, Statistical , Numerical Analysis, Computer-Assisted , Phylogeny , Data Interpretation, Statistical
17.
BMC Bioinformatics ; 6: 134, 2005 Jun 01.
Article in English | MEDLINE | ID: mdl-15927081

ABSTRACT

BACKGROUND: Codon substitution probabilities are used in many types of molecular evolution studies such as determining Ka/Ks ratios, creating ancestral DNA sequences or aligning coding DNA. Until the recent dramatic increase in genomic data enabled construction of empirical matrices, researchers relied on parameterized models of codon evolution. Here we present the first empirical codon substitution matrix entirely built from alignments of coding sequences from vertebrate DNA and thus provide an alternative to parameterized models of codon evolution. RESULTS: A set of 17,502 alignments of orthologous sequences from five vertebrate genomes yielded 8.3 million aligned codons from which the number of substitutions between codons were counted. From this data, both a probability matrix and a matrix of similarity scores were computed. They are 64 x 64 matrices describing the substitutions between all codons. Substitutions from sense codons to stop codons are not considered, resulting in block diagonal matrices consisting of 61 x 61 entries for the sense codons and 3 x 3 entries for the stop codons. CONCLUSION: The amount of genomic data currently available allowed for the construction of an empirical codon substitution matrix. However, more sequence data is still needed to construct matrices from different subsets of DNA, specific to kingdoms, evolutionary distance or different amount of synonymous change. Codon mutation matrices have advantages for alignments up to medium evolutionary distances and for usages that require DNA such as ancestral reconstruction of DNA sequences and the calculation of Ka/Ks ratios.
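
The counting step described in this abstract can be sketched in a few lines: tally aligned codon pairs, then normalize each row into substitution probabilities. This is a rough illustration under stated assumptions, not the authors' procedure. It assumes gap-free, in-frame pairwise alignments, counts each pair symmetrically, and does not treat stop codons separately. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product

# All 64 codons over the DNA alphabet.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_counts(seq_a, seq_b):
    """Count aligned codon pairs between two in-frame, gap-free sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must have equal length divisible by 3")
    counts = Counter()
    for i in range(0, len(seq_a), 3):
        x, y = seq_a[i:i + 3], seq_b[i:i + 3]
        # Symmetric counting (an assumption of this sketch): the
        # direction of change is unknown for two extant sequences.
        counts[(x, y)] += 1
        counts[(y, x)] += 1
    return counts


def row_probabilities(counts):
    """Normalize counts so each observed codon's row sums to 1."""
    probs = {}
    for x in CODONS:
        total = sum(counts[(x, y)] for y in CODONS)
        if total:
            probs[x] = {y: counts[(x, y)] / total
                        for y in CODONS if counts[(x, y)]}
    return probs


# Toy alignment: one synonymous alanine change (GCT <-> GCC).
counts = codon_counts("GCTGCTATG", "GCTGCCATG")
probs = row_probabilities(counts)
print(probs["GCT"])
```

With millions of aligned codons, as in the study, the same tally fills a 64 x 64 matrix densely enough to estimate probabilities and derive similarity scores.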


Subject(s)
Codon , Computational Biology/methods , Models, Genetic , Amino Acid Substitution , Animals , Base Sequence , Biological Evolution , Chickens , Computer Simulation , Evolution, Molecular , Humans , Likelihood Functions , Mice , Models, Statistical , Mutation , Phylogeny , Sequence Alignment , Sequence Analysis, DNA , Software , Species Specificity , Xenopus , Zebrafish