Results 1 - 20 of 101
1.
PLoS Genet ; 19(3): e1010683, 2023 03.
Article in English | MEDLINE | ID: mdl-36972309

ABSTRACT

Prokaryotic evolution is influenced by the exchange of genetic information between species through a process referred to as recombination. The rate of recombination is a useful measure for the adaptive capacity of a prokaryotic population. We introduce Rhometa (https://github.com/sid-krish/Rhometa), a new software package to determine recombination rates from shotgun sequencing reads of metagenomes. It extends the composite likelihood approach for population recombination rate estimation and enables the analysis of modern short-read datasets. We evaluated Rhometa over a broad range of sequencing depths and complexities, using simulated and real experimental short-read data aligned to external reference genomes. Rhometa offers a comprehensive solution for determining population recombination rates from contemporary metagenomic read datasets. Rhometa extends the capabilities of conventional sequence-based composite likelihood population recombination rate estimators to include modern aligned metagenomic read datasets with diverse sequencing depths, thereby enabling the effective application of these techniques and their high accuracy rates to the field of metagenomics. Using simulated datasets, we show that our method performs well, with its accuracy improving with increasing numbers of genomes. Rhometa was validated on a real S. pneumoniae transformation experiment, where we show that it obtains plausible estimates of the rate of recombination. Finally, the program was also run on ocean surface water metagenomic datasets, through which we demonstrate that the program works on uncultured metagenomic datasets.


Subjects
Metagenome , Metagenomics , Metagenomics/methods , Metagenome/genetics , Sequence Analysis, DNA/methods , Likelihood Functions , High-Throughput Nucleotide Sequencing/methods , Software , Recombination, Genetic/genetics , Algorithms
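
The abstract above refers to a composite likelihood estimator of the population recombination rate without stating it. As a rough orientation only, estimators of this family are usually written in the form below. All symbols here are assumptions introduced for illustration and are not taken from the Rhometa paper: \mathcal{P} is the set of variant site pairs covered by reads, \mathbf{n}_{ij} the observed two-locus haplotype counts for pair (i, j), d_{ij} the distance between the two sites, \theta the population mutation rate, and L a two-locus likelihood that is typically read from a precomputed lookup table.

```latex
\hat{\rho} \;=\; \arg\max_{\rho} \sum_{(i,j) \in \mathcal{P}}
    \log L\!\left(\mathbf{n}_{ij} \,\middle|\, \rho\, d_{ij},\, \theta\right)
```

The sum treats site pairs as if they were independent, which is what makes the likelihood "composite" rather than exact; in practice the maximisation is often a simple search over a grid of candidate values of \rho.
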
2.
Nat Methods ; 19(4): 429-440, 2022 04.
Article in English | MEDLINE | ID: mdl-35396482

ABSTRACT

Evaluating metagenomic software is key for optimizing metagenome interpretation and is the focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results by 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains were still challenging for assembly and genome recovery through binning, as was assembly quality for the latter. Profilers markedly matured, with taxon profilers and binners excelling at higher bacterial ranks, but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including some that were also top performers on other metrics. The results identify challenges and guide researchers in selecting methods for analyses.


Subjects
Metagenome , Metagenomics , Archaea/genetics , Metagenomics/methods , Reproducibility of Results , Sequence Analysis, DNA , Software
3.
PLoS Comput Biol ; 19(4): e1011084, 2023 04.
Article in English | MEDLINE | ID: mdl-37099595

ABSTRACT

Bayesian inference for phylogenetics is a gold standard for computing distributions of phylogenies. However, Bayesian phylogenetics faces the challenging computational problem of moving throughout the high-dimensional space of trees. Fortunately, hyperbolic space offers a low-dimensional representation of tree-like data. In this paper, we embed genomic sequences as points in hyperbolic space and perform hyperbolic Markov chain Monte Carlo for Bayesian inference in this space. The posterior probability of an embedding is computed by decoding a neighbour-joining tree from the embedding locations of the sequences. We empirically demonstrate the fidelity of this method on eight data sets and systematically investigate the effect of embedding dimension and hyperbolic curvature on the Markov chain's performance. The sampled posterior distribution recovers the splits and branch lengths to a high degree over a range of curvatures and dimensions, demonstrating the suitability of hyperbolic space for phylogenetic inference.


Subjects
Phylogeny , Bayes Theorem , Algorithms
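
The abstract above embeds sequences as points in hyperbolic space and decodes a neighbour-joining tree from their locations. As an illustration of the geometric step only, the sketch below computes pairwise distances between points in the Poincaré ball model. The choice of model, the function names and the `curvature` parameter (the space has curvature `-curvature`) are assumptions made for this example and may differ from the paper's implementation.

```python
import math
from typing import List, Sequence


def poincare_distance(x: Sequence[float], y: Sequence[float],
                      curvature: float = 1.0) -> float:
    """Geodesic distance in the Poincare ball of curvature -curvature.

    Points must lie strictly inside the ball of radius 1/sqrt(curvature).
    """
    c = curvature
    norm_x = sum(a * a for a in x)
    norm_y = sum(b * b for b in y)
    diff = sum((a - b) ** 2 for a, b in zip(x, y))
    denom = (1.0 - c * norm_x) * (1.0 - c * norm_y)
    if denom <= 0.0:
        raise ValueError("points must lie strictly inside the ball")
    return math.acosh(1.0 + 2.0 * c * diff / denom) / math.sqrt(c)


def distance_matrix(points: Sequence[Sequence[float]],
                    curvature: float = 1.0) -> List[List[float]]:
    """Symmetric matrix of pairwise hyperbolic distances."""
    n = len(points)
    return [[poincare_distance(points[i], points[j], curvature)
             for j in range(n)] for i in range(n)]


# Hypothetical 2-D embedding of three sequences (values are made up).
embedding = [[0.0, 0.0], [0.5, 0.0], [0.0, -0.5]]
matrix = distance_matrix(embedding)
```

A matrix like `matrix` is the kind of input a neighbour-joining routine takes; distances grow rapidly near the boundary of the ball, which is why tree-like data fit in few dimensions.
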
4.
Article in English | MEDLINE | ID: mdl-38536071

ABSTRACT

Five bacterial isolates were recovered from Fragaria × ananassa in 1976 in Rydalmere, Australia, during routine biosecurity surveillance. Initially, the results of biochemical characterisation indicated that these isolates represented members of the genus Xanthomonas. To determine their species, further analysis was conducted using both phenotypic and genotypic approaches. Phenotypic analysis involved using MALDI-TOF MS and BIOLOG GEN III microplates, which confirmed that the isolates represented members of the genus Xanthomonas but did not allow them to be classified with respect to species. Genome relatedness indices and the results of extensive phylogenetic analysis confirmed that the isolates were members of the genus Xanthomonas and represented a novel species. On the basis of the minimal presence of virulence-associated factors typically found in genomes of members of the genus Xanthomonas, we suggest that these isolates are non-pathogenic. This conclusion was supported by the results of a pathogenicity assay. On the basis of these findings, we propose the name Xanthomonas rydalmerensis, with DAR 34855T = ICMP 24941 as the type strain.


Subjects
Fragaria , Xanthomonas , Phylogeny , Sequence Analysis, DNA , RNA, Ribosomal, 16S/genetics , DNA, Bacterial/genetics , Bacterial Typing Techniques , Base Composition , Fatty Acids/chemistry
5.
Genome Res ; 30(2): 239-249, 2020 02.
Article in English | MEDLINE | ID: mdl-32051187

ABSTRACT

Understanding the genetic basis for a phenotype is a central goal in biological research. Much has been learnt about bacterial genomes by creating large mutant libraries and looking for conditionally important genes. However, current genome-wide methods are largely unable to assay essential genes which are not amenable to disruption. To overcome this limitation, we developed a new version of "TraDIS" (transposon directed insertion-site sequencing) that we term "TraDIS-Xpress" that combines an inducible promoter into the transposon cassette. This allows controlled overexpression and repression of all genes owing to saturation of inserts adjacent to all open reading frames as well as conventional inactivation. We applied TraDIS-Xpress to identify responses to the biocide triclosan across a range of concentrations. Triclosan is endemic in modern life, but there is uncertainty about its mode of action with a concentration-dependent switch from bacteriostatic to bactericidal action unexplained. Our results show a concentration-dependent response to triclosan with different genes important in survival between static and cidal exposures. These genes include those previously reported to have a role in triclosan resistance as well as a new set of genes, including essential genes. Novel genes identified as being sensitive to triclosan exposure include those involved in barrier function, small molecule uptake, and integrity of transcription and translation. We anticipate the approach we show here, by allowing comparisons across multiple experimental conditions of TraDIS data, and including essential genes, will be a starting point for future work examining how different drug conditions impact bacterial survival mechanisms.


Subjects
DNA Transposable Elements/genetics , Genes, Essential/genetics , Genome, Bacterial/drug effects , Triclosan/pharmacology , Escherichia coli/drug effects , Escherichia coli/genetics , Gene Library , Genes, Essential/drug effects , Mutagenesis, Insertional/drug effects , Mutant Proteins/drug effects , Mutant Proteins/genetics , Phenotype
6.
PLoS Comput Biol ; 17(10): e1008839, 2021 10.
Article in English | MEDLINE | ID: mdl-34634030

ABSTRACT

Hi-C is a sample preparation method that enables high-throughput sequencing to capture genome-wide spatial interactions between DNA molecules. The technique has been successfully applied to solve challenging problems such as 3D structural analysis of chromatin, scaffolding of large genome assemblies and more recently the accurate resolution of metagenome-assembled genomes (MAGs). Despite continued refinements, however, preparing a Hi-C library remains a complex laboratory protocol. To avoid costly failures and maximise the odds of successful outcomes, diligent quality management is recommended. Current wet-lab methods provide only a crude assay of Hi-C library quality, while key post-sequencing quality indicators used have, thus far, relied upon reference-based read-mapping. When a reference is accessible, this reliance introduces a concern for quality, where an incomplete or inexact reference skews the resulting quality indicators. We propose a new, reference-free approach that infers the total fraction of read-pairs that are a product of proximity ligation. This quantification of Hi-C library quality requires only a modest amount of sequencing data and is independent of other application-specific criteria. The algorithm builds upon the observation that proximity ligation events are likely to create k-mers that would not naturally occur in the sample. Our software tool (qc3C) is to our knowledge the first to implement a reference-free Hi-C QC tool, and also provides reference-based QC, enabling Hi-C to be more easily applied to non-model organisms and environmental samples. We characterise the accuracy of the new algorithm on simulated and real datasets and compare it to reference-based methods.


Subjects
Chromosome Mapping , Genomics , High-Throughput Nucleotide Sequencing , Quality Control , Software , Algorithms , Animals , Chromosome Mapping/methods , Chromosome Mapping/standards , DNA/chemistry , DNA/genetics , Gene Library , Genomics/methods , Genomics/standards , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Turtles
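
The observation in the abstract above, that proximity ligation creates sequence that would not naturally occur in the sample, can be illustrated with a deliberately crude proxy: counting reads that contain the junction formed when a filled-in restriction site is blunt-ligated to itself. This is not the qc3C algorithm, which works with k-mer statistics; the function names and the enzymes used as examples (DpnII, HindIII) are assumptions for illustration and are not named in the abstract.

```python
from typing import Iterable


def ligation_junction(site: str, cut: int) -> str:
    """Junction left by end fill-in and blunt ligation.

    Assumes a palindromic recognition site cut `cut` bases from its 5' end,
    leaving a 5' overhang (for example DpnII: site GATC, cut 0).
    """
    return site[: len(site) - cut] + site[cut:]


def junction_fraction(reads: Iterable[str], junction: str) -> float:
    """Fraction of reads in which the junction is directly observed.

    Junctions of palindromic sites are themselves palindromic, so the
    reverse strand does not need a separate check.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if junction in read.upper())
    return hits / len(reads)


dpnii = ligation_junction("GATC", 0)
hindiii = ligation_junction("AAGCTT", 1)

# Made-up reads: only the first contains the DpnII junction.
toy_reads = ["ACGTGATCGATCAA", "ACGTACGTACGTAC"]
observed = junction_fraction(toy_reads, dpnii)
```

A count like this gives only a lower bound on the proximity-ligation fraction, because many junctions fall in the unsequenced part of the insert; inferring the total fraction is the harder problem the abstract describes.
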
7.
J Allergy Clin Immunol ; 147(3): 1041-1048, 2021 03.
Article in English | MEDLINE | ID: mdl-32650022

ABSTRACT

BACKGROUND: Human milk oligosaccharides (HMOs) are a diverse range of sugars secreted in breast milk that have direct and indirect effects on immunity. The profiles of HMOs produced differ between mothers. OBJECTIVE: We sought to determine the relationship between maternal HMO profiles and offspring allergic diseases up to age 18 years. METHODS: Colostrum and early lactation milk samples were collected from 285 mothers enrolled in a high-allergy-risk birth cohort, the Melbourne Atopy Cohort Study. Nineteen HMOs were measured. Profiles/patterns of maternal HMOs were determined using latent class analysis (LCA). Details of allergic disease outcomes including sensitization, wheeze, asthma, and eczema were collected at multiple follow-ups up to age 18 years. Adjusted logistic regression analyses and generalized estimating equations were used to determine the relationship between HMO profiles and allergy. RESULTS: The levels of several HMOs were highly correlated with each other. LCA determined 7 distinct maternal milk profiles with memberships of between 10% and 20%. Compared with offspring exposed to the neutral Lewis HMO profile, exposure to acidic Lewis HMOs was associated with a higher risk of allergic disease and asthma over childhood (odds ratio [OR] for asthma at 18 years, 5.82; 95% CI, 1.59-21.23), whereas exposure to the acidic-predominant profile was associated with a reduced risk of food sensitization (OR at 12 years, 0.08; 95% CI, 0.01-0.67). CONCLUSIONS: In this high-allergy-risk birth cohort, some profiles of HMOs were associated with increased and some with decreased allergic disease risks over childhood. Further studies are needed to confirm these findings and realize the potential for intervention.


Subjects
Asthma/epidemiology , Colostrum/metabolism , Eczema/epidemiology , Food Hypersensitivity/epidemiology , Milk, Human/metabolism , Oligosaccharides/metabolism , Adolescent , Australia/epidemiology , Child , Child, Preschool , Female , Follow-Up Studies , Humans , Infant , Infant, Newborn , Lactation , Male , Respiratory Sounds , Risk
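
The odds ratios in the abstract above come with wide confidence intervals, and a short calculation helps in reading them. Assuming the intervals are Wald-type intervals from logistic regression (an assumption; the abstract does not say how they were computed), they are symmetric on the log scale, so the geometric midpoint of the bounds should land near the reported odds ratio and the width gives the standard error of the log odds ratio. The helper name below is hypothetical.

```python
import math
from typing import Tuple


def log_scale_summary(lower: float, upper: float,
                      z: float = 1.96) -> Tuple[float, float]:
    """Return (geometric midpoint, SE of log OR) implied by a Wald CI."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    midpoint = math.exp((log_lower + log_upper) / 2.0)
    standard_error = (log_upper - log_lower) / (2.0 * z)
    return midpoint, standard_error


# Intervals quoted in the abstract.
asthma_mid, asthma_se = log_scale_summary(1.59, 21.23)
food_mid, food_se = log_scale_summary(0.01, 0.67)
```

Both midpoints fall close to the reported estimates (5.82 and 0.08), which is consistent with the Wald assumption. The large implied standard errors reflect how few offspring sit in each profile, and they are one reason the authors call for confirmatory studies.
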
8.
PLoS Biol ; 16(8): e2006352, 2018 08.
Article in English | MEDLINE | ID: mdl-30086128

ABSTRACT

Plants are associated with a complex microbiota that contributes to nutrient acquisition, plant growth, and plant defense. Nitrogen-fixing microbial associations are efficient and well characterized in legumes but are limited in cereals, including maize. We studied an indigenous landrace of maize grown in nitrogen-depleted soils in the Sierra Mixe region of Oaxaca, Mexico. This landrace is characterized by the extensive development of aerial roots that secrete a carbohydrate-rich mucilage. Analysis of the mucilage microbiota indicated that it was enriched in taxa for which many known species are diazotrophic, was enriched for homologs of genes encoding nitrogenase subunits, and harbored active nitrogenase activity as assessed by acetylene reduction and 15N2 incorporation assays. Field experiments in Sierra Mixe using 15N natural abundance or 15N-enrichment assessments over 5 years indicated that atmospheric nitrogen fixation contributed 29%-82% of the nitrogen nutrition of Sierra Mixe maize.


Subjects
Microbiota/genetics , Nitrogen Fixation/physiology , Nitrogen/metabolism , Zea mays/metabolism , Mexico , Microbiota/physiology , Phylogeny , Plant Development , Plant Mucilage/metabolism , Plant Roots/metabolism , Polysaccharides/metabolism , Soil , Soil Microbiology
9.
Syst Biol ; 68(6): 1052-1061, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31034053

ABSTRACT

BEAGLE is a high-performance likelihood-calculation library for phylogenetic inference. The BEAGLE library defines a simple, but flexible, application programming interface (API), and includes a collection of efficient implementations for calculation under a variety of evolutionary models on different hardware devices. The library has been integrated into recent versions of popular phylogenetics software packages including BEAST and MrBayes and has been widely used across a diverse range of evolutionary studies. Here, we present BEAGLE 3 with new parallel implementations, increased performance for challenging data sets, improved scalability, and better usability. We have added new OpenCL and central processing unit-threaded implementations to the library, allowing the effective utilization of a wider range of modern hardware. Further, we have extended the API and library to support concurrent computation of independent partial likelihood arrays, for increased performance of nucleotide-model analyses with greater flexibility of data partitioning. For better scalability and usability, we have improved how phylogenetic software packages use BEAGLE in multi-GPU (graphics processing unit) and cluster environments, and introduced an automated method to select the fastest device given the data set, evolutionary model, and hardware. For application developers who wish to integrate the library, we also have developed an online tutorial. To evaluate the effect of the improvements, we ran a variety of benchmarks on state-of-the-art hardware. For a partitioned exemplar analysis, we observe run-time performance improvements as high as 5.9-fold over our previous GPU implementation. BEAGLE 3 is free, open-source software licensed under the Lesser GPL and available at https://beagle-dev.github.io.


Subjects
Classification/methods , Software/standards , Data Interpretation, Statistical , Phylogeny
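
The "partial likelihood arrays" mentioned in the abstract above are the per-node quantities of Felsenstein's pruning algorithm, which is the calculation a library of this kind accelerates. The pure-Python sketch below shows one such update for a single site under the Jukes-Cantor (JC69) model. It does not use the BEAGLE API; the function names are hypothetical and the choice of JC69 is an assumption made to keep the example short.

```python
import math
from typing import List

STATES = "ACGT"


def jc69_matrix(t: float) -> List[List[float]]:
    """JC69 transition probabilities for a branch of length t."""
    decay = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * decay
    different = 0.25 - 0.25 * decay
    return [[same if i == j else different for j in range(4)]
            for i in range(4)]


def tip_partials(base: str) -> List[float]:
    """Partial likelihoods at a tip with an observed nucleotide."""
    return [1.0 if state == base else 0.0 for state in STATES]


def parent_partials(left: List[float], t_left: float,
                    right: List[float], t_right: float) -> List[float]:
    """Combine two child partial arrays into the parent's array."""
    p_left, p_right = jc69_matrix(t_left), jc69_matrix(t_right)
    result = []
    for i in range(4):
        from_left = sum(p_left[i][j] * left[j] for j in range(4))
        from_right = sum(p_right[i][j] * right[j] for j in range(4))
        result.append(from_left * from_right)
    return result


def site_likelihood(root: List[float]) -> float:
    """Average root partials over the uniform JC69 base frequencies."""
    return sum(0.25 * value for value in root)


# Two tips that both show 'A', joined by branches of length 0.1.
root = parent_partials(tip_partials("A"), 0.1, tip_partials("A"), 0.1)
likelihood = site_likelihood(root)
```

Each site, and each data partition, can be updated independently in this way, which is what makes the computation well suited to the threaded and GPU implementations the abstract describes.
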
10.
Plasmid ; 102: 56-61, 2019 03.
Article in English | MEDLINE | ID: mdl-30885788

ABSTRACT

IncHI2-ST1 plasmids play an important role in co-mobilizing genes conferring resistance to critically important antibiotics and heavy metals. Here we present the identification and analysis of IncHI2-ST1 plasmid pSPRC-Echo1, isolated from an Enterobacter hormaechei strain from a Sydney hospital, which predates other multi-drug resistant IncHI2-ST1 plasmids reported from Australia. Our time-resolved phylogeny analysis indicates pSPRC-Echo1 represents a new lineage of IncHI2-ST1 plasmids and shows how their diversification relates to the era of antibiotics.


Subjects
Phylogeny , Plasmids/genetics , Chromosome Mapping , DNA Transposable Elements/genetics , Time Factors
11.
Syst Biol ; 67(3): 503-517, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29244177

ABSTRACT

Phylogenetics, the inference of evolutionary trees from molecular sequence data such as DNA, is an enterprise that yields valuable evolutionary understanding of many biological systems. Bayesian phylogenetic algorithms, which approximate a posterior distribution on trees, have become a popular if computationally expensive means of doing phylogenetics. Modern data collection technologies are quickly adding new sequences to already substantial databases. With all current techniques for Bayesian phylogenetics, computation must start anew each time a sequence becomes available, making it costly to maintain an up-to-date estimate of a phylogenetic posterior. These considerations highlight the need for an online Bayesian phylogenetic method which can update an existing posterior with new sequences. Here, we provide theoretical results on the consistency and stability of methods for online Bayesian phylogenetic inference based on Sequential Monte Carlo (SMC) and Markov chain Monte Carlo. We first show a consistency result, demonstrating that the method samples from the correct distribution in the limit of a large number of particles. Next, we derive the first reported set of bounds on how phylogenetic likelihood surfaces change when new sequences are added. These bounds enable us to characterize the theoretical performance of sampling algorithms by bounding the effective sample size (ESS) with a given number of particles from below. We show that the ESS is guaranteed to grow linearly as the number of particles in an SMC sampler grows. Surprisingly, this result holds even though the dimensions of the phylogenetic model grow with each new added sequence.


Subjects
Classification/methods , Models, Biological , Phylogeny , Algorithms , Bayes Theorem , Monte Carlo Method
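
The effective sample size (ESS) that the abstract above bounds from below is commonly estimated from the importance weights of the particles. The sketch below uses the standard Kish estimator; whether the paper works with exactly this definition is an assumption, and the function name is hypothetical.

```python
from typing import Sequence


def effective_sample_size(weights: Sequence[float]) -> float:
    """Kish effective sample size of a set of non-negative weights.

    Equals the number of particles when all weights are equal and
    approaches 1 when a single particle carries all the weight.
    """
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    return total * total / sum(w * w for w in weights)


balanced = effective_sample_size([1.0, 1.0, 1.0, 1.0])
degenerate = effective_sample_size([1.0, 0.0, 0.0, 0.0])
```

The estimator does not depend on how the weights are scaled, so unnormalised weights can be passed directly. A result stating that ESS grows linearly with the particle count means the ratio of ESS to the number of particles stays bounded away from zero as particles are added.
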
12.
Syst Biol ; 67(3): 490-502, 2018 May 01.
Article in English | MEDLINE | ID: mdl-29186587

ABSTRACT

Modern infectious disease outbreak surveillance produces continuous streams of sequence data which require phylogenetic analysis as data arrives. Current software packages for Bayesian phylogenetic inference are unable to quickly incorporate new sequences as they become available, making them less useful for dynamically unfolding evolutionary stories. This limitation can be addressed by applying a class of Bayesian statistical inference algorithms called sequential Monte Carlo (SMC) to conduct online inference, wherein new data can be continuously incorporated to update the estimate of the posterior probability distribution. In this article, we describe and evaluate several different online phylogenetic sequential Monte Carlo (OPSMC) algorithms. We show that proposing new phylogenies with a density similar to the Bayesian prior suffers from poor performance, and we develop "guided" proposals that better match the proposal density to the posterior. Furthermore, we show that the simplest guided proposals can exhibit pathological behavior in some situations, leading to poor results, and that the situation can be resolved by heating the proposal density. The results demonstrate that relative to the widely used MCMC-based algorithm implemented in MrBayes, the total time required to compute a series of phylogenetic posteriors as sequences arrive can be significantly reduced by the use of OPSMC, without incurring a significant loss in accuracy.


Subjects
Classification/methods , Models, Biological , Phylogeny , Algorithms , Bayes Theorem , Internet , Monte Carlo Method
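
The abstract above reports that "heating" the proposal density resolves the pathological behaviour of simple guided proposals, but does not say how the heating is done. One generic way to heat a discrete proposal, shown here purely as an assumed illustration, is to raise each probability to a power `beta` between 0 and 1 and renormalise, which flattens the distribution so that low-probability options are still proposed. The function name and the example probabilities are hypothetical.

```python
from typing import List, Sequence


def heat(probabilities: Sequence[float], beta: float) -> List[float]:
    """Flatten a discrete distribution by power tempering.

    beta = 1 leaves the distribution unchanged; smaller values move it
    toward the uniform distribution.
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    powered = [p ** beta for p in probabilities]
    total = sum(powered)
    return [p / total for p in powered]


# Made-up proposal over two candidate attachment branches.
guided = [0.9, 0.1]
heated = heat(guided, 0.5)
```

In an importance-sampling setting the particle weight is the target density divided by the proposal density, so a proposal that is too sharply peaked in the wrong place yields a few very large weights; flattening it trades some efficiency for robustness.
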
13.
Nature ; 499(7459): 431-7, 2013 Jul 25.
Article in English | MEDLINE | ID: mdl-23851394

ABSTRACT

Genome sequencing enhances our understanding of the biological world by providing blueprints for the evolutionary and functional diversity that shapes the biosphere. However, microbial genomes that are currently available are of limited phylogenetic breadth, owing to our historical inability to cultivate most microorganisms in the laboratory. We apply single-cell genomics to target and sequence 201 uncultivated archaeal and bacterial cells from nine diverse habitats belonging to 29 major mostly uncharted branches of the tree of life, so-called 'microbial dark matter'. With this additional genomic information, we are able to resolve many intra- and inter-phylum-level relationships and to propose two new superphyla. We uncover unexpected metabolic features that extend our understanding of biology and challenge established boundaries between the three domains of life. These include a novel amino acid use for the opal stop codon, an archaeal-type purine synthesis in Bacteria and complete sigma factors in Archaea similar to those in Bacteria. The single-cell genomes also served to phylogenetically anchor up to 20% of metagenomic reads in some habitats, facilitating organism-level interpretation of ecosystem function. This study greatly expands the genomic representation of the tree of life and provides a systematic step towards a better understanding of biological evolution on our planet.


Subjects
Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Metagenomics , Phylogeny , Archaea/isolation & purification , Archaea/metabolism , Bacteria/isolation & purification , Bacteria/metabolism , Ecosystem , Genome, Archaeal/genetics , Genome, Bacterial/genetics , Metagenome/genetics , Molecular Sequence Data , Sequence Analysis, DNA , Single-Cell Analysis
14.
BMC Genomics ; 19(1): 298, 2018 Apr 27.
Article in English | MEDLINE | ID: mdl-29703152

ABSTRACT

BACKGROUND: Theileria orientalis (Apicomplexa: Piroplasmida) has caused clinical disease in cattle of Eastern Asia for many years and its recent rapid spread throughout Australian and New Zealand herds has caused substantial economic losses to production through cattle deaths, late term abortion and morbidity. Disease outbreaks have been linked to the detection of a pathogenic genotype of T. orientalis, genotype Ikeda, which is also responsible for disease outbreaks in Asia. Here, we sequenced and compared the draft genomes of one pathogenic (Ikeda) and two apathogenic (Chitose, Buffeli) isolates of T. orientalis sourced from Australian herds. RESULTS: Using de novo assembled sequences and a single nucleotide variant (SNV) analysis pipeline, we found extensive genetic divergence between the T. orientalis genotypes. A genome-wide phylogeny reconstructed to address continued confusion over nomenclature of this species displayed concordance with prior phylogenetic studies based on the major piroplasm surface protein (MPSP) gene. However, average nucleotide identity (ANI) values revealed that the divergence between isolates is comparable to that observed between other theilerias which represent distinct species. Analysis of SNVs revealed putative recombination between the Chitose and Buffeli genotypes and also between Australian and Japanese Ikeda isolates. Finally, to inform future vaccine studies, dN/dS ratios and surface location predictions were analysed. Six predicted surface protein targets were confirmed to be expressed during the piroplasm phase of the parasite by mass spectrometry. CONCLUSIONS: We used whole genome sequencing to demonstrate that the T. orientalis Ikeda, Chitose and Buffeli variants show substantial genetic divergence. Our data indicates that future researchers could potentially consider disease-associated Ikeda and closely related genotypes as a separate species from non-pathogenic Chitose and Buffeli.


Subjects
Genome, Protozoan , Protozoan Proteins/genetics , Theileria/classification , Theileria/genetics , Theileriasis/parasitology , Whole Genome Sequencing/methods , Animals , Australia/epidemiology , Cattle , DNA, Protozoan/genetics , Genotype , Phylogeny , Species Specificity , Theileria/isolation & purification , Theileriasis/epidemiology
15.
BMC Evol Biol ; 17(1): 118, 2017 05 25.
Article in English | MEDLINE | ID: mdl-28545432

ABSTRACT

BACKGROUND: Wild birds are the major reservoir hosts for influenza A viruses (AIVs) and have been implicated in the emergence of pandemic events in livestock and human populations. Understanding how AIVs spread within and across continents is therefore critical to the development of successful strategies to manage and reduce the impact of influenza outbreaks. In North America many bird species undergo seasonal migratory movements along a North-South axis, thereby providing opportunities for viruses to spread over long distances. However, the role played by such avian flyways in shaping the genetic structure of AIV populations remains uncertain. RESULTS: To assess the relative contribution of bird migration along flyways to the genetic structure of AIV we performed a large-scale phylogeographic study of viruses sampled in the USA and Canada, involving the analysis of 3805 to 4505 sequences from 36 to 38 geographic localities depending on the gene segment data set. To assist in this we developed a maximum likelihood-based genetic algorithm to explore a wide range of complex spatial models, depicting a more complete picture of the migration network than determined previously. CONCLUSIONS: Based on phylogenies estimated from nucleotide sequence data sets, our results show that AIV migration rates are significantly higher within than between flyways, indicating that the migratory patterns of birds play a key role in viral dispersal. These findings provide valuable insights into the evolution, maintenance and transmission of AIVs, in turn allowing the development of improved programs for surveillance and risk assessment.


Subjects
Animal Migration , Birds/virology , Influenza in Birds/virology , Animals , Animals, Wild , Canada/epidemiology , Disease Outbreaks , Humans , Influenza A virus/genetics , Influenza in Birds/epidemiology , Likelihood Functions , Phylogeny , Phylogeography , United States/epidemiology
16.
Genome Res ; 24(12): 2077-89, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25273068

ABSTRACT

Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.


Subjects
Genome , Genomics/methods , Sequence Alignment/methods , Software , Animals , Computational Biology/methods , Computer Simulation , Datasets as Topic , Genome-Wide Association Study , Humans , Mammals/genetics , Phylogeny , Reproducibility of Results
17.
PLoS Genet ; 10(11): e1004784, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25393412

ABSTRACT

Organisms across the tree of life use a variety of mechanisms to respond to stress-inducing fluctuations in osmotic conditions. Cellular response mechanisms and phenotypes associated with osmoadaptation also play important roles in bacterial virulence, human health, agricultural production and many other biological systems. To improve understanding of osmoadaptive strategies, we have generated 59 high-quality draft genomes for the haloarchaea (a euryarchaeal clade whose members thrive in hypersaline environments and routinely experience drastic changes in environmental salinity) and analyzed these new genomes in combination with those from 21 previously sequenced haloarchaeal isolates. We propose a generalized model for haloarchaeal management of cytoplasmic osmolarity in response to osmotic shifts, where potassium accumulation and sodium expulsion during osmotic upshock are accomplished via secondary transport using the proton gradient as an energy source, and potassium loss during downshock is via a combination of secondary transport and non-specific ion loss through mechanosensitive channels. We also propose new mechanisms for magnesium and chloride accumulation. We describe the expansion and differentiation of haloarchaeal general transcription factor families, including two novel expansions of the TATA-binding protein family, and discuss their potential for enabling rapid adaptation to environmental fluxes. We challenge a recent high-profile proposal regarding the evolutionary origins of the haloarchaea by showing that inclusion of additional genomes significantly reduces support for a proposed large-scale horizontal gene transfer into the ancestral haloarchaeon from the bacterial domain. The combination of broad (17 genera) and deep (≥5 species in four genera) sampling of a phenotypically unified clade has enabled us to uncover both highly conserved and specialized features of osmoadaptation. Finally, we demonstrate the broad utility of such datasets for metagenomics, improvements to automated gene annotation and investigations of evolutionary processes.


Subjects
Adaptation, Physiological/genetics , Archaea/genetics , Metagenomics , TATA-Box Binding Protein/genetics , Base Sequence , Evolution, Molecular , Genome, Archaeal , Humans , Molecular Sequence Annotation , Osmolar Concentration , Phylogeny , Salinity
18.
Bioinformatics ; 31(4): 587-9, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-25338718

ABSTRACT

MOTIVATION: Open-source bacterial genome assembly remains inaccessible to many biologists because of its complexity. Few software solutions exist that are capable of automating all steps in the process of de novo genome assembly from Illumina data. RESULTS: A5-miseq can produce high-quality microbial genome assemblies on a laptop computer without any parameter tuning. A5-miseq does this by automating the process of adapter trimming, quality filtering, error correction, contig and scaffold generation and detection of misassemblies. Unlike the original A5 pipeline, A5-miseq can use long reads from the Illumina MiSeq, use read pairing information during contig generation and includes several improvements to read trimming. Together, these changes result in substantially improved assemblies that recover a more complete set of reference genes than previous methods. AVAILABILITY: A5-miseq is licensed under the GPL open-source license. Source code and precompiled binaries for Mac OS X 10.6+ and Linux 2.6.15+ are available from http://sourceforge.net/projects/ngopt CONTACT: aaron.darling@uts.edu.au SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Bacterial Genome , Genomics/methods , DNA Sequence Analysis/methods , Software , Programming Languages
19.
BMC Microbiol ; 16: 41, 2016 Mar 12.
Article in English | MEDLINE | ID: mdl-26971047

ABSTRACT

BACKGROUND: Clostridium difficile infections (CDI) are a significant health problem to humans and food animals. Clostridial toxins ToxA and ToxB encoded by genes tcdA and tcdB are located on a pathogenicity locus known as the PaLoc and are the major virulence factors of C. difficile. While toxin-negative strains of C. difficile are often isolated from faeces of animals and patients suffering from CDI, they are not considered to play a role in disease. Toxin-negative strains of C. difficile have been used successfully to treat recurring CDI but their propensity to acquire the PaLoc via lateral gene transfer and express clinically relevant levels of toxins has reinforced the need to characterise them genetically. In addition, further studies that examine the pathogenic potential of toxin-negative strains of C. difficile and the frequency by which toxin-negative strains may acquire the PaLoc are needed. RESULTS: We undertook a comparative genomic analysis of five Australian toxin-negative isolates of C. difficile that lack tcdA, tcdB and both binary toxin genes cdtA and cdtB that were recovered from humans and farm animals with symptoms of gastrointestinal disease. Our analyses show that the five C. difficile isolates cluster closely with virulent toxigenic strains of C. difficile belonging to the same sequence type (ST) and have virulence gene profiles akin to those in toxigenic strains. Furthermore, phage acquisition appears to have played a key role in the evolution of C. difficile. CONCLUSIONS: Our results are consistent with the C. difficile global population structure comprising six clades each containing both toxin-positive and toxin-negative strains. Our data also suggests that toxin-negative strains of C. difficile encode a repertoire of putative virulence factors that are similar to those found in toxigenic strains of C. difficile, raising the possibility that acquisition of PaLoc by toxin-negative strains poses a threat to human health. 
Studies in appropriate animal models are needed to examine the pathogenic potential of toxin-negative strains of C. difficile and to determine the frequency by which toxin-negative strains may acquire the PaLoc.


Subjects
Clostridioides difficile/genetics , Clostridioides difficile/isolation & purification , Clostridium Infections/microbiology , Clostridium Infections/veterinary , Gastrointestinal Diseases/microbiology , Gastrointestinal Diseases/veterinary , Horse Diseases/microbiology , Swine Diseases/microbiology , Amino Acid Sequence , Animals , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Toxins/metabolism , Clostridioides difficile/classification , Clostridioides difficile/metabolism , Horses , Humans , Molecular Sequence Data , Phylogeny , Sequence Alignment , Swine
20.
Nature ; 466(7307): 720-6, 2010 Aug 05.
Article in English | MEDLINE | ID: mdl-20686567

ABSTRACT

Sponges are an ancient group of animals that diverged from other metazoans over 600 million years ago. Here we present the draft genome sequence of Amphimedon queenslandica, a demosponge from the Great Barrier Reef, and show that it is remarkably similar to other animal genomes in content, structure and organization. Comparative analysis enabled by the sequencing of the sponge genome reveals genomic events linked to the origin and early evolution of animals, including the appearance, expansion and diversification of pan-metazoan transcription factor, signalling pathway and structural genes. This diverse 'toolkit' of genes correlates with critical aspects of all metazoan body plans, and comprises cell cycle control and growth, development, somatic- and germ-cell specification, cell adhesion, innate immunity and allorecognition. Notably, many of the genes associated with the emergence of animals are also implicated in cancer, which arises from defects in basic processes associated with metazoan multicellularity.


Subjects
Molecular Evolution , Genome/genetics , Porifera/genetics , Animals , Apoptosis/genetics , Cell Adhesion/genetics , Cell Cycle/genetics , Cell Polarity/genetics , Cell Proliferation , Genes/genetics , Genomics , Humans , Innate Immunity/genetics , Biological Models , Neurons/metabolism , Phosphotransferases/chemistry , Phosphotransferases/genetics , Phylogeny , Porifera/anatomy & histology , Porifera/cytology , Porifera/immunology , DNA Sequence Analysis , Signal Transduction/genetics