Results 1 - 20 of 129
1.
J Appl Microbiol ; 134(1), 2023 Jan 23.
Article in English | MEDLINE | ID: mdl-36626787

ABSTRACT

Omics research inevitably involves the collection and analysis of big data, which can only be handled by automated approaches. Here we point out that the analysis of big data in the field of genomics dictates certain requirements, such as specialized software, quality control of input data, and simplification for visualization of the results. The latter results in a loss of information, as is exemplified for phylogenetic trees. Clear communication of big data analyses can be enhanced by novel visualization strategies. The interpretation of findings is sometimes hampered when dedicated analytical tools are not fully understood by microbiologists, while the researchers performing these analyses may not have a full overview of the biology of the microbes under study. These issues are illustrated here, using SARS-CoV-2 and Salmonella enterica as zoonotic examples. Whereas in scientific communications jargon should be avoided or explained, nomenclature to group similar organisms and distinguish these from more distant relatives is not only essential, but also influences the interpretation of results. Unfortunately, changes in taxonomically accepted names are now so frequent that they hamper rather than assist research, as is illustrated with difficulties of microbiome studies. Nomenclature to group viral isolates, as is done for SARS-CoV-2, is also not without difficulties. Some weaknesses in current omics research stem from poor quality of data or biased databases, and problems can be magnified by machine learning approaches. Moreover, the overall opus of scientific publications can now be considered "big data", as is illustrated by the avalanche of COVID-19-related publications. The peer-review model of scientific publishing is only barely coping with this novel situation, resulting in retractions and the publication of bogus works. The avalanche of scientific publications that originated from the current pandemic can obstruct literature searches, and this will unfortunately continue over time.


Subject(s)
COVID-19 , Animals , Humans , SARS-CoV-2/genetics , Phylogeny , Viral RNA , Genomics , Zoonoses
2.
Nucleic Acids Res ; 49(2): e7, 2021 01 25.
Article in English | MEDLINE | ID: mdl-32710622

ABSTRACT

Traditional epitranscriptomics relies on capturing a single RNA modification by antibody or chemical treatment, combined with short-read sequencing to identify its transcriptomic location. This approach is labor-intensive and may introduce experimental artifacts. Direct sequencing of native RNA using Oxford Nanopore Technologies (ONT) can allow for directly detecting the RNA base modifications, although these modifications might appear as sequencing errors. The percent Error of Specific Bases (%ESB) was higher for native RNA than unmodified RNA, which enabled the detection of ribonucleotide modification sites. Based on the %ESB differences, we developed a bioinformatic tool, epitranscriptional landscape inferring from glitches of ONT signals (ELIGOS), that is based on various types of synthetic modified RNA and applied to rRNA and mRNA. ELIGOS is able to accurately predict known classes of RNA methylation sites (AUC > 0.93) in rRNAs from Escherichia coli, yeast, and human cells, using either unmodified in vitro transcription RNA or a background error model, which mimics the systematic error of direct RNA sequencing as the reference. The well-known DRACH/RRACH motif was localized and identified, consistent with previous studies, using differential analysis of ELIGOS to study the impact of RNA m6A methyltransferase by comparing wild type and knockouts in yeast and mouse cells. Lastly, the DRACH motif could also be identified in the mRNA of three human cell lines. The mRNA modification identified by ELIGOS is at the level of individual base resolution. In summary, we have developed a bioinformatic software package to uncover native RNA modifications.
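
The %ESB comparison described in this abstract can be illustrated with a minimal sketch. It assumes that per-position counts of erroneous and total reads are already available for a native sample and an unmodified control. The function names and the read counts are hypothetical, and the simple subtraction is an illustrative assumption rather than the statistic ELIGOS itself computes.

```python
def percent_esb(errors: int, total: int) -> float:
    """Percent error of a specific base: erroneous reads / total reads * 100."""
    if total == 0:
        raise ValueError("no reads cover this position")
    return 100.0 * errors / total


def esb_difference(native: tuple[int, int], control: tuple[int, int]) -> float:
    """Difference in %ESB (native minus control) at one position.

    Each argument is a hypothetical (error_reads, total_reads) pair.
    """
    return percent_esb(*native) - percent_esb(*control)


# Hypothetical counts for one position: 40 of 100 native reads are
# erroneous versus 5 of 100 reads from the unmodified control.
diff = esb_difference((40, 100), (5, 100))
print(f"%ESB difference: {diff:.1f}")
```

A large positive difference at a position would flag it as a candidate modification site; any real threshold would have to come from the tool's own error model.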


Subject(s)
Computational Biology/methods , High-Throughput Nucleotide Sequencing , Post-Transcriptional RNA Processing , RNA-Seq , Scientific Experimental Error , Software , Adenine/analogs & derivatives , Adenine/analysis , Animals , Cell Line , Escherichia coli/genetics , Humans , Meiosis , Methyltransferases/deficiency , Methyltransferases/metabolism , Mice , Knockout Mice , Nucleotide Motifs , Bacterial RNA/genetics , Fungal RNA/genetics , Messenger RNA/genetics , Ribosomal RNA/genetics , ROC Curve , Saccharomyces cerevisiae/genetics , DNA Sequence Analysis , Genetic Templates , Genetic Transcription
3.
J Appl Microbiol ; 133(6): 3690-3698, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36074056

ABSTRACT

AIMS: The current Monkeypox virus (MPX) outbreak is not only the largest known outbreak to date caused by a strain belonging to the West-African clade, but also results in remarkably different clinical and epidemiological features compared to previous outbreaks of this virus. Here, we consider the possibility that mutations in the viral genome may be responsible for its changed characteristics. METHODS AND RESULTS: Six genome sequences of isolates from the current outbreak were compared to five genomes of isolates from the 2017 outbreak in Nigeria and to two historic genomes, all belonging to the West-African clade. We report differences that are consistently present in the 2022 isolates but not in the others. Although some variation in repeat units was observed, only two were consistently found in the 2022 genomes only, and these were located in intergenic regions. A total of 55 single nucleotide polymorphisms were consistently present in the 2022 isolates compared to the 2017 isolates. Of these, 25 caused an amino acid substitution in a predicted protein. CONCLUSIONS: The nature of the substitution and the annotation of the affected protein identified potential candidates that might affect the virulence of the virus. These included the viral DNA helicase and transcription factors. SIGNIFICANCE: This bioinformatic analysis provides guidance for wet-lab research to identify changed properties of the MPX.
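
The search for differences that are consistently present in one group of isolates but absent from another can be sketched as follows. This is a minimal illustration that assumes the genomes have already been aligned to equal length. The function name and the toy sequences are hypothetical and do not reproduce the authors' pipeline.

```python
def consistent_differences(group_a: list[str], group_b: list[str]) -> list[tuple[int, str, str]]:
    """Return (position, base_in_a, bases_in_b) for alignment columns where
    every sequence in group_a carries the same base and no sequence in
    group_b carries that base. Positions are 1-based."""
    length = len(group_a[0])
    if any(len(s) != length for s in group_a + group_b):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    for i in range(length):
        bases_a = {s[i] for s in group_a}
        bases_b = {s[i] for s in group_b}
        # Keep the column only if group_a is uniform and its base is
        # completely absent from group_b.
        if len(bases_a) == 1 and bases_a.isdisjoint(bases_b):
            hits.append((i + 1, next(iter(bases_a)), "".join(sorted(bases_b))))
    return hits


# Toy alignment (hypothetical data, not real Monkeypox virus sequences).
isolates_2022 = ["ACGTTA", "ACGTTA", "ACGTTA"]
isolates_2017 = ["ACGCTA", "ACGCTG"]
print(consistent_differences(isolates_2022, isolates_2017))
```

In the toy data only position 4 qualifies; position 6 varies within the older group but still shares a base with the 2022 group, so it is not reported.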


Subject(s)
Disease Outbreaks , Monkeypox virus , Monkeypox virus/genetics , Nigeria/epidemiology , Viral Genome/genetics , Viral DNA
4.
Nucleic Acids Res ; 46(7): e38, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29346625

ABSTRACT

Completion of eukaryal genomes can be a difficult task because of the highly repetitive sequences along the chromosomes and the short read lengths of second-generation sequencing. Saccharomyces cerevisiae strain CEN.PK113-7D, widely used as a model organism and a cell factory, was selected for this study to demonstrate the superior capability of very long sequence reads for de novo genome assembly. We generated long reads using two common third-generation sequencing technologies (Oxford Nanopore Technology (ONT) and Pacific Biosciences (PacBio)) and used short reads obtained using Illumina sequencing for error correction. Assembly of the reads derived from all three technologies resulted in complete sequences for all 16 yeast chromosomes, as well as the mitochondrial chromosome, in one step. Further, we identified three types of DNA methylation (5mC, 4mC and 6mA). Comparison between the reference strain S288C and strain CEN.PK113-7D identified chromosomal rearrangements against a background of similar gene content between the two strains. We identified full-length transcripts through ONT direct RNA sequencing technology. This allows for the identification of transcriptional landscapes, including untranslated regions (UTRs) (5' UTR and 3' UTR) as well as differential gene expression quantification. About 91% of the predicted transcripts could be consistently detected across biological replicates grown either on glucose or ethanol. Direct RNA sequencing identified many polyadenylated non-coding RNAs, rRNAs, telomere-RNA, long non-coding RNA and antisense RNA. This work demonstrates a strategy to obtain complete genome sequences and transcriptional landscapes that can be applied to other eukaryal organisms.


Subject(s)
Fungal Genome/genetics , High-Throughput Nucleotide Sequencing/methods , Fungal RNA/genetics , Saccharomyces cerevisiae/genetics , 3' Untranslated Regions/genetics , 5' Untranslated Regions/genetics , DNA Methylation/genetics , Genomics , Nanopores , Long Noncoding RNA/genetics , Repetitive Nucleic Acid Sequences/genetics , DNA Sequence Analysis
5.
Eur Heart J ; 40(14): 1107-1112, 2019 04 07.
Article in English | MEDLINE | ID: mdl-30753448

ABSTRACT

Cardiovascular disease (CVD) rates in adulthood are high in premature infants; unfortunately, the underlying mechanisms are not well defined. In this review, we discuss potential pathways that could lead to CVD in premature babies. Studies show intense oxidant stress and inflammation at tissue levels in these neonates. Alterations in lipid profile, foetal epigenomics, and gut microbiota in these infants may also underlie the development of CVD. Recently, probiotic bacteria, such as the mucin-degrading bacterium Akkermansia muciniphila have been shown to reduce inflammation and prevent heart disease in animal models. All this information might enable scientists and clinicians to target pathways to act early to curtail the adverse effects of prematurity on the cardiovascular system. This could lead to primary and secondary prevention of CVD and improve survival among preterm neonates later in adult life.


Subject(s)
Cardiovascular Diseases/physiopathology , Premature Birth/physiopathology , Atherosclerosis/physiopathology , Cytokines/metabolism , Dyslipidemias/physiopathology , Vascular Endothelium/physiopathology , Genetic Epigenesis/physiology , Gastrointestinal Microbiome/physiology , Humans , Inflammation/metabolism , Inflammation/physiopathology , Metabolic Syndrome/physiopathology , Nitric Oxide/metabolism , Oxidative Stress/physiology , Reactive Oxygen Species/metabolism , Renin-Angiotensin System/physiology
6.
Molecules ; 25(22), 2020 Nov 12.
Article in English | MEDLINE | ID: mdl-33198233

ABSTRACT

The advancements of information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In the fields of drug discovery and development, machine learning techniques have been used for the development of novel drug candidates. The methods for designing drug targets and novel drug discovery now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of developed outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high-throughput computational analysis of databases used for both lead and target discovery, has increased the reliability of the machine learning and deep learning incorporated techniques. The use of virtual screening and of the encompassing online information has also been highlighted in developing lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques will be discussed. The applications that produce promising results and methods will be reviewed.


Subject(s)
Computational Biology/methods , Drug Discovery/methods , Machine Learning , Algorithms , Bayes Theorem , Factual Databases , Deep Learning , Humans , Internet , Monte Carlo Method , Reproducibility of Results , Software , Support Vector Machine
7.
BMC Cancer ; 19(1): 827, 2019 Aug 22.
Article in English | MEDLINE | ID: mdl-31438887

ABSTRACT

BACKGROUND: SMARCB1-deficient sinonasal carcinoma (SDSC) is an aggressive subtype of head and neck cancers that has a poor prognosis despite multimodal therapy. We present a unique case with next generation sequencing data of a patient who had SDSC with perineural invasion to the trigeminal nerve that progressed to a brain metastasis and eventually leptomeningeal spread. CASE PRESENTATION: A 42-year-old female presented with facial pain and had resection of a tumor along the V2 division of the trigeminal nerve on the right. She underwent adjuvant stereotactic radiation. She developed further neurological symptoms and imaging demonstrated the tumor had infiltrated into the cavernous sinus as well as intradurally. She had surgical resection for removal of her brain metastasis and decompression of the cavernous sinus. Following her second surgery, she had adjuvant radiation and chemotherapy. Several months later she had quadriparesis and imaging was consistent with leptomeningeal spread. She underwent palliative radiation and ultimately transitioned quickly to comfort care and expired. Overall survival from time of diagnosis was 13 months. Next generation sequencing was carried out on her primary tumor and brain metastasis. The brain metastatic tissue had an increased tumor mutational burden in comparison to the primary. CONCLUSIONS: This is the first report of SDSC with perineural invasion progressing to leptomeningeal carcinomatosis. Continued next generation sequencing of the primary and metastatic tissue by clinicians is encouraged to provide further insights into metastatic progression of rare solid tumors.


Subject(s)
Carcinoma/etiology , Carcinoma/pathology , Paranasal Sinus Neoplasms/etiology , Paranasal Sinus Neoplasms/pathology , SMARCB1 Protein/deficiency , Adult , Tumor Biomarkers , Carcinoma/diagnostic imaging , Disease Progression , Female , High-Throughput Nucleotide Sequencing , Humans , Immunohistochemistry , Meningeal Carcinomatosis/diagnosis , Meningeal Carcinomatosis/secondary , Neoplasm Metastasis , Neoplasm Staging , Paranasal Sinus Neoplasms/diagnostic imaging , Single Nucleotide Polymorphism , X-Ray Computed Tomography
8.
Emerg Infect Dis ; 24(9), 2018 09.
Article in English | MEDLINE | ID: mdl-29985788

ABSTRACT

We sequenced the virus genomes from 3 pregnant women in Thailand with Zika virus diagnoses. All had infections with the Asian lineage. The woman infected at gestational week 9, and not those infected at weeks 20 and 24, had a fetus with microcephaly. Asian lineage Zika viruses can cause microcephaly.


Subject(s)
Microcephaly/diagnosis , Infectious Pregnancy Complications , Zika Virus Infection , Zika Virus/isolation & purification , Female , Humans , Newborn Infant , Microcephaly/etiology , Pregnancy , First Pregnancy Trimester , Thailand , Zika Virus/genetics
9.
Microb Ecol ; 76(3): 801-813, 2018 Oct.
Article in English | MEDLINE | ID: mdl-29445826

ABSTRACT

Infections due to Clostridioides difficile (previously known as Clostridium difficile) are a major problem in hospitals, where cases can be caused by community-acquired strains as well as by nosocomial spread. Whole genome sequences from clinical samples contain a lot of information, but it needs to be analyzed and compared in such a way that the outcome is useful for clinicians or epidemiologists. Here, we compare 663 publicly available complete genome sequences of C. difficile using average amino acid identity (AAI) scores. This analysis revealed that most of these genomes (640, 96.5%) clearly belong to the same species, while the remaining 23 genomes produce four distinct clusters within the Clostridioides genus. The main C. difficile cluster can be further divided into sub-clusters, depending on the chosen cutoff. We demonstrate that MLST, either based on partial or full gene-length, results in biased estimates of genetic differences and does not capture the true degree of similarity or differences of complete genomes. Presence of genes coding for C. difficile toxins A and B (ToxA/B), as well as the binary C. difficile toxin (CDT), was deduced from their unique PfamA domain architectures. Out of the 663 C. difficile genomes, 535 (80.7%) contained at least one copy of ToxA or ToxB, while these genes were missing from 128 genomes. Although some clusters were enriched for toxin presence, these genes are variably present in a given genetic background. The CDT genes were found in 191 genomes, which were restricted to a few clusters only, and only one cluster lacked the toxin A/B genes consistently. A total of 310 genomes contained ToxA/B without CDT (47%). Further, published metagenomic data from stools were used to assess the presence of C. difficile sequences in blinded cases of C. difficile infection (CDI) and controls, to test if metagenomic analysis is sensitive enough to detect the pathogen, and to establish strain relationships between cases from the same hospital. We conclude that metagenomics can contribute to the identification of CDI and can assist in characterization of the most probable causative strain in CDI patients.
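
How the number of clusters depends on the chosen AAI cutoff can be shown with a small sketch. The abstract does not name the clustering method, so single-linkage grouping is an assumption made here for illustration. The genome names and AAI scores are hypothetical.

```python
def cluster_by_cutoff(names: list[str], aai: dict[tuple[str, str], float], cutoff: float) -> list[list[str]]:
    """Single-linkage clustering: two genomes share a cluster when they are
    connected by a chain of pairs whose AAI is at least `cutoff`."""
    parent = {n: n for n in names}

    def find(n: str) -> str:
        # Follow parent links to the cluster root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), score in aai.items():
        if score >= cutoff:
            parent[find(a)] = find(b)

    clusters: dict[str, list[str]] = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical pairwise AAI scores (percent) for four genomes.
names = ["g1", "g2", "g3", "g4"]
aai = {
    ("g1", "g2"): 99.0, ("g1", "g3"): 96.5, ("g2", "g3"): 96.0,
    ("g1", "g4"): 80.0, ("g2", "g4"): 81.0, ("g3", "g4"): 79.5,
}
print(cluster_by_cutoff(names, aai, 95.0))
print(cluster_by_cutoff(names, aai, 98.0))
```

Raising the cutoff from 95 to 98 splits the three-genome cluster into a pair and a singleton, which mirrors the cutoff dependence of sub-clusters noted in the abstract.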


Subject(s)
Clostridioides difficile/genetics , Clostridioides difficile/isolation & purification , Bacterial Genome , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Toxins/metabolism , Clostridioides difficile/chemistry , Clostridioides difficile/classification , Clostridium Infections/microbiology , Gene Dosage , Humans , Molecular Sequence Data , Multilocus Sequence Typing , Phylogeny , Amino Acid Sequence Homology
10.
BMC Bioinformatics ; 18(Suppl 14): 471, 2017 12 28.
Article in English | MEDLINE | ID: mdl-29297281

ABSTRACT

BACKGROUND: Zika virus (ZIKV) is an emerging human pathogen. Since its arrival in the Western hemisphere, from Africa via Asia, it has become a serious threat to pregnant women, causing microcephaly and other neuropathies in developing fetuses. The mechanisms behind these teratogenic effects are unknown, although epidemiological evidence suggests that microcephaly is not associated with the original, African lineage of ZIKV. The sequences of 196 published ZIKV genomes were used to assess whether recently proposed mechanistic explanations for microcephaly are supported by molecular level changes that may have increased its virulence since the virus left Africa. For this we performed phylogenetic, recombination, adaptive evolution and tetramer frequency analyses, and compared protein sequences for the presence of protease cleavage sites, Pfam domains, glycosylation sites, signal peptides, trans-membrane protein domains, and phosphorylation sites. RESULTS: Recombination events within or between Asian and Brazilian lineages were not observed, and likewise there were no differences in protease cleavage, glycosylation sites, signal peptides or trans-membrane domains between African and Brazilian strains. The frequency of Retinoic Acid Response Element (RARE) sequences was increased in Brazilian strains. Genetic adaptation was also apparent by tetramer signatures that had undergone major changes in the past but have stabilized in the Brazilian lineage despite subsequent geographic spread, suggesting the viral population presently propagates in the same host species in various regions. Evidence for selection pressure was recognized for several amino acid sites in the Brazilian lineage compared to the African lineage, mainly in nonstructural proteins, especially protein NS4B. A number of these positively selected mutations resulted in an increased potential to be phosphorylated in the Brazilian lineage compared to the African lineage, which may have increased their potential to interfere with neural fetal development. CONCLUSIONS: ZIKV seems to have adapted to a limited number of hosts, including humans, during which its virulence increased. Its protein NS4B, together with NS4A, has recently been shown to inhibit Akt-mTOR signaling in human fetal neural stem cells, a key pathway for brain development. We hypothesize that positive selection of novel phosphorylation sites in the protein NS4B of the Brazilian lineage could interfere with phosphorylation of Akt and mTOR, impairing Akt-mTOR signaling and this may result in an increased risk for developmental neuropathies.
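
A tetramer frequency profile of the kind mentioned in this abstract can be computed with a short sketch. It assumes genome sequences are supplied as DNA-alphabet strings (A, C, G, T). The function name and the toy sequence are hypothetical, and the authors' actual procedure, for example any normalization of the signatures, is not described in the abstract.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def tetramer_frequencies(sequence: str) -> dict[str, float]:
    """Relative frequency of every overlapping 4-mer in a sequence."""
    seq = sequence.upper()
    counts: Counter[str] = Counter()
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if set(word) <= VALID_BASES:  # skip windows with N or other ambiguity codes
            counts[word] += 1
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


# Toy sequence (hypothetical, not a Zika virus genome).
freqs = tetramer_frequencies("ACGTACGT")
print(freqs)
```

Profiles computed this way for different genomes can then be compared, for instance by a distance between the frequency vectors, to see whether a lineage's signature has shifted or stabilized.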


Subject(s)
Viral Genome , Microcephaly/virology , Zika Virus/genetics , Zika Virus/physiology , Physiological Adaptation/genetics , Africa , Asia , Base Sequence , Brazil , Cell Line , Codon/genetics , Female , Genetic Variation , Host-Pathogen Interactions/genetics , Humans , Microcephaly/immunology , Phosphorylation , Phylogeny , Pregnancy , RNA Stability/genetics , Genetic Recombination/genetics , Genetic Selection , Virulence/genetics , Zika Virus/pathogenicity , Zika Virus Infection/immunology , Zika Virus Infection/virology
11.
Trends Genet ; 29(5): 273-9, 2013 May.
Article in English | MEDLINE | ID: mdl-23219343

ABSTRACT

A central undertaking in synthetic biology (SB) is the quest for the 'minimal genome'. However, 'minimal sets' of essential genes are strongly context-dependent and, in all prokaryotic genomes sequenced to date, not a single protein-coding gene is entirely conserved. Furthermore, a lack of consensus in the field as to what attributes make a gene truly essential adds another aspect of variation. Thus, a universal minimal genome remains elusive. Here, as an alternative to defining a minimal genome, we propose that the concept of gene persistence can be used to classify genes needed for robust long-term survival. Persistent genes, although not ubiquitous, are conserved in a majority of genomes, tend to be expressed at high levels, and are frequently located on the leading DNA strand. These criteria impose constraints on genome organization, and these are important considerations for engineering cells and for creating cellular life-like forms in SB.
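
The conservation criterion for persistent genes (present in a majority of genomes, though not necessarily all) can be sketched as below. Only that one criterion is modelled; expression level and strand location are ignored. The function name, the 0.5 threshold, and the toy gene sets are illustrative assumptions.

```python
from collections import Counter


def persistent_genes(genomes: dict[str, set[str]], min_fraction: float = 0.5) -> set[str]:
    """Gene families present in more than `min_fraction` of the genomes."""
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    n_genomes = len(genomes)
    return {gene for gene, n in counts.items() if n / n_genomes > min_fraction}


# Hypothetical gene-family content of four genomes.
genomes = {
    "genomeA": {"fam1", "fam2", "fam3"},
    "genomeB": {"fam1", "fam2"},
    "genomeC": {"fam1", "fam3", "fam4"},
    "genomeD": {"fam1", "fam2"},
}
print(sorted(persistent_genes(genomes)))
```

Here fam3 occurs in exactly half of the genomes and is therefore excluded by the strict majority test, while fam2 is retained although it is not ubiquitous.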


Subject(s)
Essential Genes/genetics , Bacterial Genome , Synthetic Biology , Molecular Evolution , Genes , Genetic Engineering , Mycoplasma/genetics
12.
Appl Environ Microbiol ; 82(8): 2516-26, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26944846

ABSTRACT

It has been 30 years since the initial emergence and subsequent rapid global spread of multidrug-resistant Salmonella enterica serovar Typhimurium DT104 (MDR DT104). Nonetheless, its origin and transmission route have never been revealed. We used whole-genome sequencing (WGS) and temporally structured sequence analysis within a Bayesian framework to reconstruct temporal and spatial phylogenetic trees and estimate the rates of mutation and divergence times of 315 S. Typhimurium DT104 isolates sampled from 1969 to 2012 from 21 countries on six continents. DT104 was estimated to have emerged initially as antimicrobial susceptible in ∼1948 (95% credible interval [CI], 1934 to 1962) and later became MDR DT104 in ∼1972 (95% CI, 1972 to 1988) through horizontal transfer of the 13-kb Salmonella genomic island 1 (SGI1) MDR region into susceptible strains already containing SGI1. This was followed by multiple transmission events, initially from central Europe and later between several European countries. An independent transmission to the United States and another to Japan occurred, and from there MDR DT104 was probably transmitted to Taiwan and Canada. An independent acquisition of resistance genes took place in Thailand in ∼1975 (95% CI, 1975 to 1990). In Denmark, WGS analysis provided evidence for transmission of the organism between herds of animals. Interestingly, the demographic history of Danish MDR DT104 provided evidence for the success of the program to eradicate Salmonella from pig herds in Denmark from 1996 to 2000. The results from this study refute several hypotheses on the evolution of DT104 and suggest that WGS may be useful in monitoring emerging clones and devising strategies for prevention of Salmonella infections.


Subject(s)
Phylogeography , Animal Salmonella Infections/epidemiology , Salmonella Infections/epidemiology , Salmonella typhimurium/isolation & purification , Animals , Multiple Bacterial Drug Resistance , Molecular Evolution , Bacterial Genome , Genotype , Global Health , Humans , Molecular Epidemiology , Molecular Typing , Single Nucleotide Polymorphism , Salmonella Infections/microbiology , Salmonella Infections/transmission , Animal Salmonella Infections/microbiology , Animal Salmonella Infections/transmission , Salmonella typhimurium/classification , Salmonella typhimurium/genetics , DNA Sequence Analysis , Spatio-Temporal Analysis , Swine , Swine Diseases/epidemiology , Swine Diseases/microbiology , Swine Diseases/transmission
13.
Appl Environ Microbiol ; 82(1): 375-83, 2016 01 01.
Article in English | MEDLINE | ID: mdl-26519390

ABSTRACT

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.


Subject(s)
Genetic Variation , Bacterial Genome , Populus/microbiology , Pseudomonas/classification , Pseudomonas/genetics , Comparative Genomic Hybridization , Phylogeny , Plant Roots/microbiology , Pseudomonas/isolation & purification , Pseudomonas aeruginosa/genetics , Pseudomonas aeruginosa/isolation & purification , Pseudomonas fluorescens/classification , Pseudomonas fluorescens/genetics , Pseudomonas fluorescens/isolation & purification , Pseudomonas putida/genetics , Pseudomonas putida/isolation & purification , Rhizosphere , DNA Sequence Analysis
14.
Infect Immun ; 83(5): 1749-64, 2015 May.
Article in English | MEDLINE | ID: mdl-25667270

ABSTRACT

Urinary tract infections (UTIs) are among the most common infectious diseases of humans, with Escherichia coli responsible for >80% of all cases. One extreme of UTI is asymptomatic bacteriuria (ABU), which occurs as an asymptomatic carrier state that resembles commensalism. To understand the evolution and molecular mechanisms that underpin ABU, the genome of the ABU E. coli strain VR50 was sequenced. Analysis of the complete genome indicated that it most resembles E. coli K-12, with the addition of a 94-kb genomic island (GI-VR50-pheV), eight prophages, and multiple plasmids. GI-VR50-pheV has a mosaic structure and contains genes encoding a number of UTI-associated virulence factors, namely, Afa (afimbrial adhesin), two autotransporter proteins (Ag43 and Sat), and aerobactin. We demonstrated that the presence of this island in VR50 confers its ability to colonize the murine bladder, as a VR50 mutant with GI-VR50-pheV deleted was attenuated in a mouse model of UTI in vivo. We established that Afa is the island-encoded factor responsible for this phenotype using two independent deletion (Afa operon and AfaE adhesin) mutants. E. coli VR50afa and VR50afaE displayed significantly decreased ability to adhere to human bladder epithelial cells. In the mouse model of UTI, VR50afa and VR50afaE displayed reduced bladder colonization compared to wild-type VR50, similar to the colonization level of the GI-VR50-pheV mutant. Our study suggests that E. coli VR50 is a commensal-like strain that has acquired fitness factors that facilitate colonization of the human bladder.


Subject(s)
Biological Adaptation , Bacteriuria/microbiology , Carrier State/microbiology , Escherichia coli Infections/microbiology , Escherichia coli/genetics , Molecular Evolution , Urinary Tract/microbiology , Adult , Animals , Bacterial Adhesion , Cell Line , Bacterial DNA/chemistry , Bacterial DNA/genetics , Epithelial Cells/microbiology , Escherichia coli/isolation & purification , Female , Bacterial Genome , Humans , Inbred C57BL Mice , Animal Models , Molecular Sequence Data , DNA Sequence Analysis
15.
Funct Integr Genomics ; 15(2): 141-61, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25722247

ABSTRACT

Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
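
The pan- and core genome calculation mentioned in this abstract reduces to a set union and a set intersection once each genome is represented as a set of gene families. The sketch below assumes that representation; the function name, strain names, and family identifiers are hypothetical, and the hard part in practice (assigning genes to families) is not shown.

```python
def pan_and_core(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (pan genome, core genome) for a collection of genomes.

    Pan genome: gene families found in at least one genome.
    Core genome: gene families found in every genome.
    """
    family_sets = list(genomes.values())
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


# Hypothetical gene-family content of three strains.
strains = {
    "strain1": {"fam1", "fam2", "fam3"},
    "strain2": {"fam1", "fam2", "fam4"},
    "strain3": {"fam1", "fam3", "fam5"},
}
pan, core = pan_and_core(strains)
print(len(pan), len(core))
```

As more genomes are added the core can only shrink and the pan genome can only grow, which is consistent with the large gap between the two figures reported for E. coli.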


Subject(s)
Genome, Bacterial , Bacteria/classification , Bacterial Proteins/genetics , Codon , Genetic Variation , Genome Size , Genomics , Metagenomics , Molecular Sequence Annotation , Phylogeny , Sequence Analysis, DNA
16.
Arch Microbiol ; 197(3): 359-70, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25533848

ABSTRACT

Microbial taxonomy should provide adequate descriptions of bacterial, archaeal, and eukaryotic microbial diversity in ecological, clinical, and industrial environments. Its cornerstone, the prokaryote species, has been re-evaluated twice. It is time to revisit polyphasic taxonomy, its principles, and its practice, including its underlying pragmatic species concept. Ultimately, we will be able to realize an old dream of our predecessor taxonomists and build a genomic-based microbial taxonomy, using standardized and automated curation of high-quality complete genome sequences as the new gold standard.


Subject(s)
Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Classification/methods , Genomics , Microbiology/trends , Computer Simulation
17.
Proc Natl Acad Sci U S A ; 109(20): E1277-86, 2012 May 15.
Article in English | MEDLINE | ID: mdl-22538806

ABSTRACT

More than 50 years of research have provided great insight into the physiology, metabolism, and molecular biology of Salmonella enterica serovar Typhimurium (S. Typhimurium), but important gaps in our knowledge remain. It is clear that a precise choreography of gene expression is required for Salmonella infection, but basic genetic information such as the global locations of transcription start sites (TSSs) has been lacking. We combined three RNA-sequencing techniques and two sequencing platforms to generate a robust picture of transcription in S. Typhimurium. Differential RNA sequencing identified 1,873 TSSs on the chromosome of S. Typhimurium SL1344 and 13% of these TSSs initiated antisense transcripts. Unique findings include the TSSs of the virulence regulators phoP, slyA, and invF. Chromatin immunoprecipitation revealed that RNA polymerase was bound to 70% of the TSSs, and two-thirds of these TSSs were associated with σ(70) (including phoP, slyA, and invF) from which we identified the -10 and -35 motifs of σ(70)-dependent S. Typhimurium gene promoters. Overall, we corrected the location of important genes and discovered 18 times more promoters than identified previously. S. Typhimurium expresses 140 small regulatory RNAs (sRNAs) at early stationary phase, including 60 newly identified sRNAs. Almost half of the experimentally verified sRNAs were found to be unique to the Salmonella genus, and <20% were found throughout the Enterobacteriaceae. This description of the transcriptional map of SL1344 advances our understanding of S. Typhimurium, arguably the most important bacterial infection model.


Subject(s)
Gene Expression Regulation, Bacterial/genetics , RNA, Small Untranslated/genetics , Regulatory Sequences, Ribonucleic Acid/genetics , Salmonella typhimurium/genetics , Transcription, Genetic/genetics , Base Sequence , Blotting, Northern , Chromatin Immunoprecipitation , Gene Library , Microarray Analysis , Molecular Sequence Data , Oligonucleotides/genetics , Promoter Regions, Genetic/genetics , Sequence Analysis, RNA/methods , Transcription Initiation Site
18.
J Clin Microbiol ; 52(5): 1529-39, 2014 May.
Article in English | MEDLINE | ID: mdl-24574292

ABSTRACT

One of the first issues that emerges when a prokaryotic organism of interest is encountered is the question of what it is: that is, which species it is. The 16S rRNA gene formed the basis of the first method for sequence-based taxonomy and has had a tremendous impact on the field of microbiology. Nevertheless, the method has been found to have a number of shortcomings. In the current study, we trained and benchmarked five methods for whole-genome sequence-based prokaryotic species identification on a common data set of complete genomes: (i) SpeciesFinder, which is based on the complete 16S rRNA gene; (ii) Reads2Type, which searches for species-specific 50-mers in either the 16S rRNA gene or the gyrB gene (for the Enterobacteriaceae family); (iii) the ribosomal multilocus sequence typing (rMLST) method, which samples up to 53 ribosomal genes; (iv) TaxonomyFinder, which is based on species-specific functional protein domain profiles; and finally (v) KmerFinder, which examines the number of co-occurring k-mers (substrings of k nucleotides in DNA sequence data). The performances of the methods were subsequently evaluated on three data sets of short sequence reads or draft genomes from public databases. In total, the evaluation sets constituted sequence data from more than 11,000 isolates covering 159 genera and 243 species. Our results indicate that methods that sample only chromosomal, core genes have difficulties in distinguishing closely related species which only recently diverged. The KmerFinder method had the overall highest accuracy and correctly identified from 93% to 97% of the isolates in the evaluation sets.
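
The idea of counting co-occurring k-mers can be illustrated with a minimal sketch. This is not the KmerFinder implementation; it is a toy version that assumes one reference sequence per species and scores a query by the number of distinct k-mers it shares with each reference. The function names `kmers` and `best_match`, the choice of k = 4, and the sequences are all hypothetical.

```python
from typing import Dict, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct substrings of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(query: str, references: Dict[str, str], k: int = 4) -> str:
    """Return the reference name sharing the most distinct k-mers with the query.

    Assumption: a simple shared-k-mer count is used as the score; real tools
    use much longer k-mers, large databases, and additional statistics.
    """
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(ref, k))  # number of co-occurring k-mers
        for name, ref in references.items()
    }
    return max(scores, key=scores.get)


# Toy, made-up sequences for illustration only.
references = {
    "species_X": "ATGCGTACGTTAGC",
    "species_Y": "TTTTGGGGCCCCAAAA",
}
print(best_match("GCGTACGTTA", references))  # species_X
```

Because the score draws on k-mers from anywhere in the sequence rather than from a fixed set of marker genes, an approach of this kind is not restricted to core genes, which the abstract identifies as a limitation of the gene-based methods.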


Subject(s)
Benchmarking/methods , Classification/methods , Genomics/methods , Archaea/genetics , Bacteria/genetics , Bacterial Proteins/genetics , DNA, Bacterial/genetics , Multilocus Sequence Typing/methods , RNA, Ribosomal, 16S/genetics
19.
PLoS Biol ; 9(6): e1001088, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21713030

ABSTRACT

A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences.


Subject(s)
Databases, Genetic , Genomics/standards , International Cooperation , Metagenome
20.
Front Bioinform ; 4: 1392613, 2024.
Article in English | MEDLINE | ID: mdl-39022183

ABSTRACT

The major histocompatibility complex (MHC) locus, also known as the Human Leukocyte Antigen (HLA) genes, is located on the short arm of chromosome 6, and contains three regions (Class I, Class II and Class III). This 5 Mbp locus is one of the most variable regions of the human genome, yet it also encodes a set of highly conserved and important proteins related to immunological response. Genetic variations in this region are responsible for more diseases than in the entire rest of the human genome. However, information on local structural features of the DNA is largely ignored. With recent advances in long-read sequencing technology, it is now becoming possible to sequence the entire 5 Mbp MHC locus, producing complete diploid haplotypes of the whole region. Here, we describe structural maps based on the complete sequences from six different homozygous HLA cell lines. We find long-range structural variability in the different sequences for DNA stacking energy, position preference and curvature, variation in repeats, as well as more local changes in regions forming open chromatin structures, likely to influence gene expression levels. These structural maps can be useful in visualizing large-scale structural variation across HLA types, in particular when this can be complemented with epigenetic signals.
