Results 1 - 20 of 127
1.
J Appl Microbiol ; 134(1), 2023 Jan 23.
Article in English | MEDLINE | ID: mdl-36626787

ABSTRACT

Omics research inevitably involves the collection and analysis of big data, which can only be handled by automated approaches. Here we point out that the analysis of big data in the field of genomics dictates certain requirements, such as specialized software, quality control of input data, and simplification for visualization of the results. The latter results in a loss of information, as is exemplified for phylogenetic trees. Clear communication of big data analyses can be enhanced by novel visualization strategies. The interpretation of findings is sometimes hampered when dedicated analytical tools are not fully understood by microbiologists, while the researchers performing these analyses may not have a full overview of the biology of the microbes under study. These issues are illustrated here, using SARS-CoV-2 and Salmonella enterica as zoonotic examples. Whereas in scientific communications jargon should be avoided or explained, nomenclature to group similar organisms and distinguish these from more distant relatives is not only essential, but also influences the interpretation of results. Unfortunately, changes in taxonomically accepted names are now so frequent that they hamper rather than assist research, as is illustrated with difficulties of microbiome studies. Nomenclature to group viral isolates, as is done for SARS-CoV-2, is also not without difficulties. Some weaknesses in current omics research stem from poor quality of data or biased databases, and problems can be magnified by machine learning approaches. Moreover, the overall opus of scientific publications can now be considered "big data", as is illustrated by the avalanche of COVID-19-related publications. The peer-review model of scientific publishing is only barely coping with this novel situation, resulting in retractions and the publication of bogus works.
The avalanche of scientific publications that originated from the current pandemic can obstruct literature searches, and this will unfortunately continue over time.


Subjects
COVID-19 , Animals , Humans , SARS-CoV-2/genetics , Phylogeny , Viral RNA , Genomics , Zoonoses
2.
Nucleic Acids Res ; 49(2): e7, 2021 01 25.
Article in English | MEDLINE | ID: mdl-32710622

ABSTRACT

Traditional epitranscriptomics relies on capturing a single RNA modification by antibody or chemical treatment, combined with short-read sequencing to identify its transcriptomic location. This approach is labor-intensive and may introduce experimental artifacts. Direct sequencing of native RNA using Oxford Nanopore Technologies (ONT) can allow for directly detecting the RNA base modifications, although these modifications might appear as sequencing errors. The percent Error of Specific Bases (%ESB) was higher for native RNA than unmodified RNA, which enabled the detection of ribonucleotide modification sites. Based on the %ESB differences, we developed a bioinformatic tool, epitranscriptional landscape inferring from glitches of ONT signals (ELIGOS), that is based on various types of synthetic modified RNA and applied to rRNA and mRNA. ELIGOS is able to accurately predict known classes of RNA methylation sites (AUC > 0.93) in rRNAs from Escherichia coli, yeast, and human cells, using either unmodified in vitro transcription RNA or a background error model, which mimics the systematic error of direct RNA sequencing as the reference. The well-known DRACH/RRACH motif was localized and identified, consistent with previous studies, using differential analysis of ELIGOS to study the impact of RNA m6A methyltransferase by comparing wild type and knockouts in yeast and mouse cells. Lastly, the DRACH motif could also be identified in the mRNA of three human cell lines. The mRNA modification identified by ELIGOS is at the level of individual base resolution. In summary, we have developed a bioinformatic software package to uncover native RNA modifications.
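
The %ESB comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the ELIGOS implementation, whose statistics are not given in the abstract. The `BaseCounts` container, the function names, and the read counts are hypothetical, and "errors" is assumed here to mean any read that does not match the reference base at that position.

```python
from dataclasses import dataclass


@dataclass
class BaseCounts:
    """Read counts at one reference position (hypothetical container)."""
    matches: int
    errors: int  # assumed: mismatches, insertions and deletions combined


def percent_esb(counts: BaseCounts) -> float:
    """Percent of reads at this position that disagree with the reference."""
    total = counts.matches + counts.errors
    if total == 0:
        raise ValueError("no reads cover this position")
    return 100.0 * counts.errors / total


def esb_difference(native: BaseCounts, unmodified: BaseCounts) -> float:
    """Excess error in native RNA relative to the unmodified control."""
    return percent_esb(native) - percent_esb(unmodified)


# Made-up counts for a single position, for illustration only.
native = BaseCounts(matches=60, errors=40)
control = BaseCounts(matches=95, errors=5)
delta = esb_difference(native, control)
```

A large positive `delta` would flag the position as a candidate modification site; deciding whether the difference is significant needs a statistical test that this sketch leaves out.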


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing , Post-Transcriptional RNA Processing , RNA-Seq , Scientific Experimental Error , Software , Adenine/analogs & derivatives , Adenine/analysis , Animals , Cell Line , Escherichia coli/genetics , Humans , Meiosis , Methyltransferases/deficiency , Methyltransferases/metabolism , Mice , Knockout Mice , Nucleotide Motifs , Bacterial RNA/genetics , Fungal RNA/genetics , Messenger RNA/genetics , Ribosomal RNA/genetics , ROC Curve , Saccharomyces cerevisiae/genetics , DNA Sequence Analysis , Genetic Templates , Genetic Transcription
3.
J Appl Microbiol ; 133(6): 3690-3698, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36074056

ABSTRACT

AIMS: The current Monkeypox virus (MPX) outbreak is not only the largest known outbreak to date caused by a strain belonging to the West-African clade, but also results in remarkably different clinical and epidemiological features compared to previous outbreaks of this virus. Here, we consider the possibility that mutations in the viral genome may be responsible for its changed characteristics. METHODS AND RESULTS: Six genome sequences of isolates from the current outbreak were compared to five genomes of isolates from the 2017 outbreak in Nigeria and to two historic genomes, all belonging to the West-African clade. We report differences that are consistently present in the 2022 isolates but not in the others. Although some variation in repeat units was observed, only two such variations were found consistently and exclusively in the 2022 genomes, and these were located in intergenic regions. A total of 55 single nucleotide polymorphisms were consistently present in the 2022 isolates compared to the 2017 isolates. Of these, 25 caused an amino acid substitution in a predicted protein. CONCLUSIONS: The nature of the substitution and the annotation of the affected protein identified potential candidates that might affect the virulence of the virus. These included the viral DNA helicase and transcription factors. SIGNIFICANCE: This bioinformatic analysis provides guidance for wet-lab research to identify changed properties of the MPX.
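
The idea of differences that are "consistently present in the 2022 isolates but not in the others" can be sketched in Python under simplifying assumptions. The sequences are assumed to be already aligned and of equal length, the function name is hypothetical, and the toy sequences are invented; the abstract does not describe the authors' actual pipeline.

```python
def consistent_differences(group_a, group_b):
    """Return (column, base) pairs where every sequence in group_a carries
    the same base and no sequence in group_b carries that base.

    Both groups are lists of pre-aligned, equal-length strings.
    Columns are 0-based.
    """
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    for col in range(length):
        bases_a = {seq[col] for seq in group_a}
        bases_b = {seq[col] for seq in group_b}
        # One shared base in group_a that is absent from group_b.
        if len(bases_a) == 1 and bases_a.isdisjoint(bases_b):
            hits.append((col, next(iter(bases_a))))
    return hits


# Invented toy alignment, not real Monkeypox virus data.
isolates_2022 = ["ACGTA", "ACGTA", "ACGTA"]
isolates_2017 = ["ACCTA", "ACCTG"]
snps = consistent_differences(isolates_2022, isolates_2017)
```

Column 2 is reported because all 2022 sequences carry G and no 2017 sequence does. Column 4 is not, since one 2017 sequence shares the A found in the 2022 group.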


Subjects
Disease Outbreaks , Monkeypox virus , Monkeypox virus/genetics , Nigeria/epidemiology , Viral Genome/genetics , Viral DNA
4.
Nucleic Acids Res ; 46(7): e38, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29346625

ABSTRACT

Completion of eukaryal genomes can be a difficult task because of the highly repetitive sequences along the chromosomes and the short read lengths of second-generation sequencing. Saccharomyces cerevisiae strain CEN.PK113-7D, widely used as a model organism and a cell factory, was selected for this study to demonstrate the superior capability of very long sequence reads for de novo genome assembly. We generated long reads using two common third-generation sequencing technologies (Oxford Nanopore Technology (ONT) and Pacific Biosciences (PacBio)) and used short reads obtained using Illumina sequencing for error correction. Assembly of the reads derived from all three technologies resulted in complete sequences for all 16 yeast chromosomes, as well as the mitochondrial chromosome, in one step. Further, we identified three types of DNA methylation (5mC, 4mC and 6mA). Comparison between the reference strain S288C and strain CEN.PK113-7D identified chromosomal rearrangements against a background of similar gene content between the two strains. We identified full-length transcripts through ONT direct RNA sequencing technology. This allows for the identification of transcriptional landscapes, including untranslated regions (UTRs) (5' UTR and 3' UTR) as well as differential gene expression quantification. About 91% of the predicted transcripts could be consistently detected across biological replicates grown either on glucose or ethanol. Direct RNA sequencing identified many polyadenylated non-coding RNAs, rRNAs, telomere-RNA, long non-coding RNA and antisense RNA. This work demonstrates a strategy to obtain complete genome sequences and transcriptional landscapes that can be applied to other eukaryal organisms.


Subjects
Fungal Genome/genetics , High-Throughput Nucleotide Sequencing/methods , Fungal RNA/genetics , Saccharomyces cerevisiae/genetics , 3' Untranslated Regions/genetics , 5' Untranslated Regions/genetics , DNA Methylation/genetics , Genomics , Nanopores , Long Noncoding RNA/genetics , Repetitive Nucleic Acid Sequences/genetics , DNA Sequence Analysis
5.
Eur Heart J ; 40(14): 1107-1112, 2019 04 07.
Article in English | MEDLINE | ID: mdl-30753448

ABSTRACT

Cardiovascular disease (CVD) rates in adulthood are high in premature infants; unfortunately, the underlying mechanisms are not well defined. In this review, we discuss potential pathways that could lead to CVD in premature babies. Studies show intense oxidant stress and inflammation at tissue levels in these neonates. Alterations in lipid profile, foetal epigenomics, and gut microbiota in these infants may also underlie the development of CVD. Recently, probiotic bacteria, such as the mucin-degrading bacterium Akkermansia muciniphila, have been shown to reduce inflammation and prevent heart disease in animal models. All this information might enable scientists and clinicians to target pathways to act early to curtail the adverse effects of prematurity on the cardiovascular system. This could lead to primary and secondary prevention of CVD and improve survival among preterm neonates later in adult life.


Subjects
Cardiovascular Diseases/physiopathology , Premature Birth/physiopathology , Atherosclerosis/physiopathology , Cytokines/metabolism , Dyslipidemias/physiopathology , Vascular Endothelium/physiopathology , Genetic Epigenesis/physiology , Gastrointestinal Microbiome/physiology , Humans , Inflammation/metabolism , Inflammation/physiopathology , Metabolic Syndrome/physiopathology , Nitric Oxide/metabolism , Oxidative Stress/physiology , Reactive Oxygen Species/metabolism , Renin-Angiotensin System/physiology
6.
Molecules ; 25(22), 2020 Nov 12.
Article in English | MEDLINE | ID: mdl-33198233

ABSTRACT

The advancements of information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In the fields of drug discovery and development, machine learning techniques have been used for the development of novel drug candidates. The methods for designing drug targets and novel drug discovery now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of developed outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high-throughput computational analysis of databases used for both lead and target discovery, has increased the reliability of the machine learning and deep learning incorporated techniques. The use of virtual screening and of the encompassing online information has also been highlighted in developing lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques will be discussed. The applications that produce promising results and methods will be reviewed.


Subjects
Computational Biology/methods , Drug Discovery/methods , Machine Learning , Algorithms , Bayes Theorem , Factual Databases , Deep Learning , Humans , Internet , Monte Carlo Method , Reproducibility of Results , Software , Support Vector Machine
7.
BMC Cancer ; 19(1): 827, 2019 Aug 22.
Article in English | MEDLINE | ID: mdl-31438887

ABSTRACT

BACKGROUND: SMARCB1-deficient sinonasal carcinoma (SDSC) is an aggressive subtype of head and neck cancers that has a poor prognosis despite multimodal therapy. We present a unique case with next generation sequencing data of a patient who had SDSC with perineural invasion to the trigeminal nerve that progressed to a brain metastasis and eventually leptomeningeal spread. CASE PRESENTATION: A 42-year-old female presented with facial pain and had resection of a tumor along the V2 division of the trigeminal nerve on the right. She underwent adjuvant stereotactic radiation. She developed further neurological symptoms and imaging demonstrated the tumor had infiltrated into the cavernous sinus as well as intradurally. She had surgical resection for removal of her brain metastasis and decompression of the cavernous sinus. Following her second surgery, she had adjuvant radiation and chemotherapy. Several months later she had quadriparesis and imaging was consistent with leptomeningeal spread. She underwent palliative radiation and ultimately transitioned quickly to comfort care and expired. Overall survival from time of diagnosis was 13 months. Next generation sequencing was carried out on her primary tumor and brain metastasis. The brain metastatic tissue had an increased tumor mutational burden in comparison to the primary. CONCLUSIONS: This is the first report of SDSC with perineural invasion progressing to leptomeningeal carcinomatosis. Continued next generation sequencing of the primary and metastatic tissue by clinicians is encouraged to provide further insights into metastatic progression of rare solid tumors.


Subjects
Carcinoma/etiology , Carcinoma/pathology , Paranasal Sinus Neoplasms/etiology , Paranasal Sinus Neoplasms/pathology , SMARCB1 Protein/deficiency , Adult , Tumor Biomarkers , Carcinoma/diagnostic imaging , Disease Progression , Female , High-Throughput Nucleotide Sequencing , Humans , Immunohistochemistry , Meningeal Carcinomatosis/diagnosis , Meningeal Carcinomatosis/secondary , Neoplasm Metastasis , Neoplasm Staging , Paranasal Sinus Neoplasms/diagnostic imaging , Single Nucleotide Polymorphism , X-Ray Computed Tomography
8.
Emerg Infect Dis ; 24(9), 2018 09.
Article in English | MEDLINE | ID: mdl-29985788

ABSTRACT

We sequenced the virus genomes from 3 pregnant women in Thailand with Zika virus diagnoses. All had infections with the Asian lineage. The woman infected at gestational week 9, and not those infected at weeks 20 and 24, had a fetus with microcephaly. Asian lineage Zika viruses can cause microcephaly.


Subjects
Microcephaly/diagnosis , Infectious Pregnancy Complications , Zika Virus Infection , Zika virus/isolation & purification , Female , Humans , Newborn Infant , Microcephaly/etiology , Pregnancy , First Pregnancy Trimester , Thailand , Zika virus/genetics
9.
Microb Ecol ; 76(3): 801-813, 2018 Oct.
Article in English | MEDLINE | ID: mdl-29445826

ABSTRACT

Infections due to Clostridioides difficile (previously known as Clostridium difficile) are a major problem in hospitals, where cases can be caused by community-acquired strains as well as by nosocomial spread. Whole genome sequences from clinical samples contain a lot of information, but this needs to be analyzed and compared in such a way that the outcome is useful for clinicians or epidemiologists. Here, we compare 663 publicly available complete genome sequences of C. difficile using average amino acid identity (AAI) scores. This analysis revealed that most of these genomes (640, 96.5%) clearly belong to the same species, while the remaining 23 genomes produce four distinct clusters within the Clostridioides genus. The main C. difficile cluster can be further divided into sub-clusters, depending on the chosen cutoff. We demonstrate that MLST, either based on partial or full gene-length, results in biased estimates of genetic differences and does not capture the true degree of similarity or differences of complete genomes. Presence of genes coding for C. difficile toxins A and B (ToxA/B), as well as the binary C. difficile toxin (CDT), was deduced from their unique PfamA domain architectures. Out of the 663 C. difficile genomes, 535 (80.7%) contained at least one copy of ToxA or ToxB, while these genes were missing from 128 genomes. Although some clusters were enriched for toxin presence, these genes are variably present in a given genetic background. The CDT genes were found in 191 genomes, which were restricted to a few clusters only, and only one cluster lacked the toxin A/B genes consistently. A total of 310 genomes contained ToxA/B without CDT (47%). Further, published metagenomic data from stools were used to assess the presence of C. difficile sequences in blinded cases of C. difficile infection (CDI) and controls, to test if metagenomic analysis is sensitive enough to detect the pathogen, and to establish strain relationships between cases from the same hospital. We conclude that metagenomics can contribute to the identification of CDI and can assist in characterization of the most probable causative strain in CDI patients.
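
The remark that clusters split into sub-clusters "depending on the chosen cutoff" can be illustrated with a small Python sketch. It assumes single-linkage grouping over pairwise AAI scores, which is a choice made here for simplicity and not necessarily the clustering method used in the study. The genome names and AAI values are invented.

```python
def cluster_by_cutoff(names, aai, cutoff):
    """Group genomes whose pairwise AAI is at or above `cutoff`.

    `aai` maps (genome_a, genome_b) pairs to an AAI percentage.
    Single linkage is assumed: one qualifying pair merges two groups.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in aai.items():
        if score >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(group) for group in groups.values())


# Invented genomes and scores, for illustration only.
names = ["g1", "g2", "g3", "g4"]
aai = {
    ("g1", "g2"): 99.1, ("g1", "g3"): 97.0, ("g2", "g3"): 96.8,
    ("g1", "g4"): 88.0, ("g2", "g4"): 87.5, ("g3", "g4"): 88.2,
}
loose = cluster_by_cutoff(names, aai, 95.0)
strict = cluster_by_cutoff(names, aai, 98.0)
```

With the looser cutoff, g1, g2 and g3 fall into one cluster and g4 stands alone. Raising the cutoff to 98 separates g3 from the g1/g2 pair, which mirrors how a main cluster divides into sub-clusters.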


Subjects
Clostridioides difficile/genetics , Clostridioides difficile/isolation & purification , Bacterial Genome , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Toxins/metabolism , Clostridioides difficile/chemistry , Clostridioides difficile/classification , Clostridium Infections/microbiology , Gene Dosage , Humans , Molecular Sequence Data , Multilocus Sequence Typing , Phylogeny , Amino Acid Sequence Homology
10.
BMC Bioinformatics ; 18(Suppl 14): 471, 2017 12 28.
Article in English | MEDLINE | ID: mdl-29297281

ABSTRACT

BACKGROUND: Zika virus (ZIKV) is an emerging human pathogen. Since its arrival in the Western hemisphere, from Africa via Asia, it has become a serious threat to pregnant women, causing microcephaly and other neuropathies in developing fetuses. The mechanisms behind these teratogenic effects are unknown, although epidemiological evidence suggests that microcephaly is not associated with the original, African lineage of ZIKV. The sequences of 196 published ZIKV genomes were used to assess whether recently proposed mechanistic explanations for microcephaly are supported by molecular level changes that may have increased its virulence since the virus left Africa. For this we performed phylogenetic, recombination, adaptive evolution and tetramer frequency analyses, and compared protein sequences for the presence of protease cleavage sites, Pfam domains, glycosylation sites, signal peptides, trans-membrane protein domains, and phosphorylation sites. RESULTS: Recombination events within or between Asian and Brazilian lineages were not observed, and likewise there were no differences in protease cleavage, glycosylation sites, signal peptides or trans-membrane domains between African and Brazilian strains. The frequency of Retinoic Acid Response Element (RARE) sequences was increased in Brazilian strains. Genetic adaptation was also apparent by tetramer signatures that had undergone major changes in the past but have stabilized in the Brazilian lineage despite subsequent geographic spread, suggesting the viral population presently propagates in the same host species in various regions. Evidence for selection pressure was recognized for several amino acid sites in the Brazilian lineage compared to the African lineage, mainly in nonstructural proteins, especially protein NS4B. A number of these positively selected mutations resulted in an increased potential to be phosphorylated in the Brazilian lineage compared to the African lineage, which may have increased their potential to interfere with neural fetal development. CONCLUSIONS: ZIKV seems to have adapted to a limited number of hosts, including humans, during which its virulence increased. Its protein NS4B, together with NS4A, has recently been shown to inhibit Akt-mTOR signaling in human fetal neural stem cells, a key pathway for brain development. We hypothesize that positive selection of novel phosphorylation sites in the protein NS4B of the Brazilian lineage could interfere with phosphorylation of Akt and mTOR, impairing Akt-mTOR signaling and this may result in an increased risk for developmental neuropathies.
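
A tetramer frequency profile of the kind mentioned in this abstract can be sketched in Python as follows. This only shows the counting step. The abstract does not say how the authors normalized or compared their tetramer signatures, so the function name, the choice to skip windows with ambiguous bases, and the toy sequence are all assumptions.

```python
from collections import Counter


def tetramer_frequencies(sequence):
    """Relative frequency of each overlapping 4-mer in a nucleotide sequence.

    Windows containing characters other than A, C, G or T are skipped
    (an assumption made for this sketch).
    """
    seq = sequence.upper()
    valid = set("ACGT")
    counts = Counter(
        seq[i:i + 4]
        for i in range(len(seq) - 3)
        if set(seq[i:i + 4]) <= valid
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid tetramers")
    return {kmer: n / total for kmer, n in counts.items()}


# Invented toy sequence, not a ZIKV genome.
freqs = tetramer_frequencies("ACGTACGTAC")
```

The ten-base toy sequence yields seven windows and four distinct tetramers. Profiles computed this way for two genomes could then be compared with a distance measure of the reader's choice.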


Subjects
Viral Genome , Microcephaly/virology , Zika virus/genetics , Zika virus/physiology , Physiological Adaptation/genetics , Africa , Asia , Base Sequence , Brazil , Cell Line , Codon/genetics , Female , Genetic Variation , Host-Pathogen Interactions/genetics , Humans , Microcephaly/immunology , Phosphorylation , Phylogeny , Pregnancy , RNA Stability/genetics , Genetic Recombination/genetics , Genetic Selection , Virulence/genetics , Zika virus/pathogenicity , Zika Virus Infection/immunology , Zika Virus Infection/virology
11.
Trends Genet ; 29(5): 273-9, 2013 May.
Article in English | MEDLINE | ID: mdl-23219343

ABSTRACT

A central undertaking in synthetic biology (SB) is the quest for the 'minimal genome'. However, 'minimal sets' of essential genes are strongly context-dependent and, in all prokaryotic genomes sequenced to date, not a single protein-coding gene is entirely conserved. Furthermore, a lack of consensus in the field as to what attributes make a gene truly essential adds another aspect of variation. Thus, a universal minimal genome remains elusive. Here, as an alternative to defining a minimal genome, we propose that the concept of gene persistence can be used to classify genes needed for robust long-term survival. Persistent genes, although not ubiquitous, are conserved in a majority of genomes, tend to be expressed at high levels, and are frequently located on the leading DNA strand. These criteria impose constraints on genome organization, and these are important considerations for engineering cells and for creating cellular life-like forms in SB.


Subjects
Essential Genes/genetics , Bacterial Genome , Synthetic Biology , Molecular Evolution , Genes , Genetic Engineering , Mycoplasma/genetics
12.
Appl Environ Microbiol ; 82(1): 375-83, 2016 01 01.
Article in English | MEDLINE | ID: mdl-26519390

ABSTRACT

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.


Subjects
Genetic Variation , Bacterial Genome , Populus/microbiology , Pseudomonas/classification , Pseudomonas/genetics , Comparative Genomic Hybridization , Phylogeny , Plant Roots/microbiology , Pseudomonas/isolation & purification , Pseudomonas aeruginosa/genetics , Pseudomonas aeruginosa/isolation & purification , Pseudomonas fluorescens/classification , Pseudomonas fluorescens/genetics , Pseudomonas fluorescens/isolation & purification , Pseudomonas putida/genetics , Pseudomonas putida/isolation & purification , Rhizosphere , DNA Sequence Analysis
13.
Appl Environ Microbiol ; 82(8): 2516-26, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26944846

ABSTRACT

It has been 30 years since the initial emergence and subsequent rapid global spread of multidrug-resistant Salmonella enterica serovar Typhimurium DT104 (MDR DT104). Nonetheless, its origin and transmission route have never been revealed. We used whole-genome sequencing (WGS) and temporally structured sequence analysis within a Bayesian framework to reconstruct temporal and spatial phylogenetic trees and estimate the rates of mutation and divergence times of 315 S. Typhimurium DT104 isolates sampled from 1969 to 2012 from 21 countries on six continents. DT104 was estimated to have emerged initially as antimicrobial susceptible in ∼1948 (95% credible interval [CI], 1934 to 1962) and later became MDR DT104 in ∼1972 (95% CI, 1972 to 1988) through horizontal transfer of the 13-kb Salmonella genomic island 1 (SGI1) MDR region into susceptible strains already containing SGI1. This was followed by multiple transmission events, initially from central Europe and later between several European countries. An independent transmission to the United States and another to Japan occurred, and from there MDR DT104 was probably transmitted to Taiwan and Canada. An independent acquisition of resistance genes took place in Thailand in ∼1975 (95% CI, 1975 to 1990). In Denmark, WGS analysis provided evidence for transmission of the organism between herds of animals. Interestingly, the demographic history of Danish MDR DT104 provided evidence for the success of the program to eradicate Salmonella from pig herds in Denmark from 1996 to 2000. The results from this study refute several hypotheses on the evolution of DT104 and suggest that WGS may be useful in monitoring emerging clones and devising strategies for prevention of Salmonella infections.


Subjects
Phylogeography , Animal Salmonellosis/epidemiology , Salmonella Infections/epidemiology , Salmonella typhimurium/isolation & purification , Animals , Multiple Bacterial Drug Resistance , Molecular Evolution , Bacterial Genome , Genotype , Global Health , Humans , Molecular Epidemiology , Molecular Typing , Single Nucleotide Polymorphism , Salmonella Infections/microbiology , Salmonella Infections/transmission , Animal Salmonellosis/microbiology , Animal Salmonellosis/transmission , Salmonella typhimurium/classification , Salmonella typhimurium/genetics , DNA Sequence Analysis , Spatio-Temporal Analysis , Swine , Swine Diseases/epidemiology , Swine Diseases/microbiology , Swine Diseases/transmission
14.
Infect Immun ; 83(5): 1749-64, 2015 May.
Article in English | MEDLINE | ID: mdl-25667270

ABSTRACT

Urinary tract infections (UTIs) are among the most common infectious diseases of humans, with Escherichia coli responsible for >80% of all cases. One extreme of UTI is asymptomatic bacteriuria (ABU), which occurs as an asymptomatic carrier state that resembles commensalism. To understand the evolution and molecular mechanisms that underpin ABU, the genome of the ABU E. coli strain VR50 was sequenced. Analysis of the complete genome indicated that it most resembles E. coli K-12, with the addition of a 94-kb genomic island (GI-VR50-pheV), eight prophages, and multiple plasmids. GI-VR50-pheV has a mosaic structure and contains genes encoding a number of UTI-associated virulence factors, namely, Afa (afimbrial adhesin), two autotransporter proteins (Ag43 and Sat), and aerobactin. We demonstrated that the presence of this island in VR50 confers its ability to colonize the murine bladder, as a VR50 mutant with GI-VR50-pheV deleted was attenuated in a mouse model of UTI in vivo. We established that Afa is the island-encoded factor responsible for this phenotype using two independent deletion (Afa operon and AfaE adhesin) mutants. E. coli VR50afa and VR50afaE displayed significantly decreased ability to adhere to human bladder epithelial cells. In the mouse model of UTI, VR50afa and VR50afaE displayed reduced bladder colonization compared to wild-type VR50, similar to the colonization level of the GI-VR50-pheV mutant. Our study suggests that E. coli VR50 is a commensal-like strain that has acquired fitness factors that facilitate colonization of the human bladder.


Subjects
Biological Adaptation , Bacteriuria/microbiology , Carrier State/microbiology , Escherichia coli Infections/microbiology , Escherichia coli/genetics , Molecular Evolution , Urinary Tract/microbiology , Adult , Animals , Bacterial Adhesion , Cell Line , Bacterial DNA/chemistry , Bacterial DNA/genetics , Epithelial Cells/microbiology , Escherichia coli/isolation & purification , Female , Bacterial Genome , Humans , Inbred C57BL Mice , Animal Models , Molecular Sequence Data , DNA Sequence Analysis
15.
Funct Integr Genomics ; 15(2): 141-61, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25722247

ABSTRACT

Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
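
The pan- and core-genome calculation mentioned in this abstract reduces, in its simplest form, to a set union and a set intersection. The Python sketch below assumes each genome has already been reduced to a set of gene family identifiers, and it uses a strict definition of "core" (present in every genome). Real analyses often relax this, and the abstract does not state which definition gave the figures quoted above. The genome names and family identifiers are invented.

```python
def pan_and_core(genomes):
    """Return (pan_genome, core_genome) as sets of gene family identifiers.

    `genomes` maps a genome name to the gene families found in it.
    Pan genome: families seen in at least one genome.
    Core genome: families seen in every genome (strict definition).
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    families = [set(members) for members in genomes.values()]
    pan = set().union(*families)
    core = set.intersection(*families)
    return pan, core


# Invented toy data, not real E. coli gene families.
toy_genomes = {
    "strain_A": {"f1", "f2", "f3"},
    "strain_B": {"f1", "f2", "f4"},
    "strain_C": {"f1", "f3", "f5"},
}
pan, core = pan_and_core(toy_genomes)
```

In the toy data the pan genome holds five families and the core genome a single one. The same pattern appears at scale in the abstract, where the pan genome is far larger than the core.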


Subjects
Bacterial Genome , Bacteria/classification , Bacterial Proteins/genetics , Codon , Genetic Variation , Genome Size , Genomics , Metagenomics , Molecular Sequence Annotation , Phylogeny , DNA Sequence Analysis
16.
Arch Microbiol ; 197(3): 359-70, 2015 Apr.
Artigo em Inglês | MEDLINE | ID: mdl-25533848

RESUMO

Microbial taxonomy should provide adequate descriptions of bacterial, archaeal, and eukaryotic microbial diversity in ecological, clinical, and industrial environments. Its cornerstone, the prokaryote species, has been re-evaluated twice. It is time to revisit polyphasic taxonomy, its principles, and its practice, including its underlying pragmatic species concept. Ultimately, we will be able to realize an old dream of our predecessor taxonomists and build a genomic-based microbial taxonomy, using standardized and automated curation of high-quality complete genome sequences as the new gold standard.


Subjects
Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Classification/methods , Genomics , Microbiology/trends , Computer Simulation
17.
Proc Natl Acad Sci U S A ; 109(20): E1277-86, 2012 May 15.
Article in English | MEDLINE | ID: mdl-22538806

ABSTRACT

More than 50 y of research have provided great insight into the physiology, metabolism, and molecular biology of Salmonella enterica serovar Typhimurium (S. Typhimurium), but important gaps in our knowledge remain. It is clear that a precise choreography of gene expression is required for Salmonella infection, but basic genetic information such as the global locations of transcription start sites (TSSs) has been lacking. We combined three RNA-sequencing techniques and two sequencing platforms to generate a robust picture of transcription in S. Typhimurium. Differential RNA sequencing identified 1,873 TSSs on the chromosome of S. Typhimurium SL1344 and 13% of these TSSs initiated antisense transcripts. Unique findings include the TSSs of the virulence regulators phoP, slyA, and invF. Chromatin immunoprecipitation revealed that RNA polymerase was bound to 70% of the TSSs, and two-thirds of these TSSs were associated with σ(70) (including phoP, slyA, and invF) from which we identified the -10 and -35 motifs of σ(70)-dependent S. Typhimurium gene promoters. Overall, we corrected the location of important genes and discovered 18 times more promoters than identified previously. S. Typhimurium expresses 140 small regulatory RNAs (sRNAs) at early stationary phase, including 60 newly identified sRNAs. Almost half of the experimentally verified sRNAs were found to be unique to the Salmonella genus, and <20% were found throughout the Enterobacteriaceae. This description of the transcriptional map of SL1344 advances our understanding of S. Typhimurium, arguably the most important bacterial infection model.


Subjects
Bacterial Gene Expression Regulation/genetics , Small Untranslated RNA/genetics , Ribonucleic Acid Regulatory Sequences/genetics , Salmonella typhimurium/genetics , Genetic Transcription/genetics , Base Sequence , Northern Blotting , Chromatin Immunoprecipitation , Gene Library , Microarray Analysis , Molecular Sequence Data , Oligonucleotides/genetics , Genetic Promoter Regions/genetics , RNA Sequence Analysis/methods , Transcription Initiation Site
18.
J Clin Microbiol ; 52(5): 1529-39, 2014 May.
Article in English | MEDLINE | ID: mdl-24574292

ABSTRACT

One of the first issues that emerges when a prokaryotic organism of interest is encountered is the question of what it is, that is, which species it is. The 16S rRNA gene formed the basis of the first method for sequence-based taxonomy and has had a tremendous impact on the field of microbiology. Nevertheless, the method has been found to have a number of shortcomings. In the current study, we trained and benchmarked five methods for whole-genome sequence-based prokaryotic species identification on a common data set of complete genomes: (i) SpeciesFinder, which is based on the complete 16S rRNA gene; (ii) Reads2Type, which searches for species-specific 50-mers in either the 16S rRNA gene or the gyrB gene (for the Enterobacteriaceae family); (iii) the ribosomal multilocus sequence typing (rMLST) method, which samples up to 53 ribosomal genes; (iv) TaxonomyFinder, which is based on species-specific functional protein domain profiles; and finally (v) KmerFinder, which examines the number of co-occurring k-mers (substrings of k nucleotides in DNA sequence data). The performances of the methods were subsequently evaluated on three data sets of short sequence reads or draft genomes from public databases. In total, the evaluation sets constituted sequence data from more than 11,000 isolates covering 159 genera and 243 species. Our results indicate that methods that sample only chromosomal, core genes have difficulties in distinguishing closely related species which only recently diverged. The KmerFinder method had the overall highest accuracy and correctly identified from 93% to 97% of the isolates in the evaluation sets.
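
The idea of counting co-occurring k-mers can be sketched in a few lines. This is an illustration of the general principle only, not KmerFinder's actual scoring scheme, which the abstract does not specify; the function names, the value of k, and the toy sequences are all assumptions made for the example.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(query, references, k=4):
    """Score each reference by the number of k-mers it shares with the query.

    `references` maps a label to a reference sequence. Returns the label
    with the highest shared k-mer count together with all scores.
    """
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(ref, k))
        for name, ref in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy sequences (hypothetical, far shorter than real genomes).
toy_refs = {"species_a": "ACGTACGTGA", "species_b": "TTTTGGGGCC"}
label, scores = best_match("CGTACGTG", toy_refs, k=4)
```

Real tools work on whole genomes or read sets with much larger k, so that a shared k-mer is unlikely to occur by chance; the small k here only keeps the toy example readable.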


Subjects
Benchmarking/methods , Classification/methods , Genomics/methods , Archaea/genetics , Bacteria/genetics , Bacterial Proteins/genetics , Bacterial DNA/genetics , Multilocus Sequence Typing/methods , 16S Ribosomal RNA/genetics
19.
PLoS Biol ; 9(6): e1001088, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21713030

ABSTRACT

A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences.


Subjects
Genetic Databases , Genomics/standards , International Cooperation , Metagenome
20.
Front Microbiol ; 15: 1272972, 2024.
Article in English | MEDLINE | ID: mdl-38440140

ABSTRACT

Introduction: Whole Genome Sequencing (WGS) of the SARS-CoV-2 virus is crucial in the surveillance of the COVID-19 pandemic. Several primer schemes have been developed to sequence nearly all of the ~30,000 nucleotide SARS-CoV-2 genome, using a multiplex PCR approach to amplify cDNA copies of the viral genomic RNA. ARTIC V4.1 primers and Midnight primers are the most popular primer schemes that can amplify segments of SARS-CoV-2 (400 bp and 1200 bp, respectively) tiled across the viral RNA genome. Mutations within primer binding sites and primer-primer interactions can result in amplicon dropouts and coverage bias, yielding low-quality genomes with 'Ns' inserted in the missing amplicon regions, causing inaccurate lineage assignments, and making it challenging to monitor lineage-specific mutations in Variants of Concern (VoCs). Methods: In this study we used a set of seven long-range PCR primer pairs to sequence clinical isolates of SARS-CoV-2 on an Oxford Nanopore sequencer. These long-range primers generate seven amplicons of approximately 4500 bp each that cover the whole genome of SARS-CoV-2. One of these regions includes the full-length S-gene, amplified using a set of flanking primers. We also compared the performance of these long-range primers with that of Midnight primers by sequencing 94 clinical isolates in a Nanopore flow cell. Results and discussion: Using a small set of long-range primers to sequence SARS-CoV-2 genomes reduces the possibility of amplicon dropout and coverage bias. The key finding of this study is that long-range primers can be used in single-molecule sequencing of RNA viruses in surveillance of emerging variants. We also show that by designing primers flanking the S-gene, we can obtain reliable identification of SARS-CoV-2 variants.
