1.
Proc Natl Acad Sci U S A ; 114(52): 13762-13767, 2017 12 26.
Article in English | MEDLINE | ID: mdl-29229821

ABSTRACT

Vaccine refusal can lead to renewed outbreaks of previously eliminated diseases and even delay global eradication. Vaccinating decisions exemplify a complex, coupled system where vaccinating behavior and disease dynamics influence one another. Such systems often exhibit critical phenomena: special dynamics close to a tipping point leading to a new dynamical regime. For instance, critical slowing down (declining rate of recovery from small perturbations) may emerge as a tipping point is approached. Here, we collected and geocoded tweets about measles-mumps-rubella vaccine and classified their sentiment using machine-learning algorithms. We also extracted data on measles-related Google searches. We find critical slowing down in the data at the level of California and the United States in the years before and after the 2014-2015 Disneyland, California measles outbreak. Critical slowing down starts growing appreciably several years before the Disneyland outbreak as vaccine uptake declines and the population approaches the tipping point. However, due to the adaptive nature of coupled behavior-disease systems, the population responds to the outbreak by moving away from the tipping point, causing "critical speeding up" whereby resilience to perturbations increases. A mathematical model of measles transmission and vaccine sentiment predicts the same qualitative patterns in the neighborhood of a tipping point to greatly reduced vaccine uptake and large epidemics. These results support the hypothesis that population vaccinating behavior near the disease elimination threshold is a critical phenomenon. Developing new analytical tools to detect these patterns in digital social data might help us identify populations at heightened risk of widespread vaccine refusal.
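
Critical slowing down is commonly tracked through rising variance and rising lag-1 autocorrelation in a sliding window over a time series. The abstract does not state which indicators the authors used, so the following is only a minimal sketch under that assumption. The function names and the simulated AR(1) series are hypothetical stand-ins for real tweet-sentiment or search-volume data.

```python
import random
from statistics import mean, pvariance


def lag1_autocorrelation(xs):
    """Lag-1 autocorrelation of a sequence (0.0 if the sequence has no variance)."""
    m = mean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    num = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(len(xs) - 1))
    return num / denom


def rolling_indicators(series, window):
    """Return a (variance, lag-1 autocorrelation) pair for each sliding window."""
    out = []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        out.append((pvariance(chunk), lag1_autocorrelation(chunk)))
    return out


def simulate_ar1(phi, n, seed):
    """Hypothetical test data: x[t] = phi * x[t-1] + noise.

    A phi close to 1 means slow recovery from perturbations.
    """
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out


fast = simulate_ar1(0.1, 500, seed=1)  # recovers quickly from shocks
slow = simulate_ar1(0.9, 500, seed=1)  # recovers slowly from shocks
print(lag1_autocorrelation(fast), lag1_autocorrelation(slow))
```

On real data, the series would normally be detrended before the indicators are computed, since a trend alone can inflate both quantities.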


Subject(s)
Databases, Factual , Machine Learning , Mass Vaccination , Measles-Mumps-Rubella Vaccine/administration & dosage , Social Media , California , Female , Humans , Male
2.
BMC Ecol ; 11: 18, 2011 Aug 01.
Article in English | MEDLINE | ID: mdl-21806794

ABSTRACT

BACKGROUND: When a specimen belongs to a species not yet represented in DNA barcode reference libraries, there is disagreement over the effectiveness of using sequence comparisons to assign the query accurately to a higher taxon. Library completeness and the assignment criteria used have been proposed as critical factors affecting the accuracy of such assignments but have not been thoroughly investigated. We explored the accuracy of assignments to genus, tribe and subfamily in the Sphingidae, using the almost complete global DNA barcode reference library (1095 species) available for this family. Costa Rican sphingids (118 species), a well-documented, diverse subset of the family, with each of the tribes and subfamilies represented, were used as queries. We simulated libraries with different levels of completeness (10-100% of the available species), and recorded assignments (positive or ambiguous) and their accuracy (true or false) under six criteria.
RESULTS: A liberal tree-based criterion assigned 83% of queries accurately to genus, 74% to tribe and 90% to subfamily, compared to a strict tree-based criterion, which assigned 75% of queries accurately to genus, 66% to tribe and 84% to subfamily, with a library containing 100% of available species (but excluding the species of the query). The greater number of true positives delivered by more relaxed criteria was offset by the occurrence of more false positives. This effect was most sharply observed with libraries of the lowest completeness where, for example at the genus level, 32% of assignments were false positives with the liberal criterion versus < 1% with the strict criterion. However, we observed little difference (< 8% using the liberal criterion) in the overall accuracy of the assignments between the lowest and highest levels of library completeness at the tribe and subfamily level.
CONCLUSIONS: Our results suggest that when using a strict tree-based criterion for higher taxon assignment with DNA barcodes, the likelihood of assigning a query a genus name incorrectly is very low, if a genus name is provided it has a high likelihood of being accurate, and if no genus match is available the query can nevertheless be assigned to a subfamily with high accuracy regardless of library completeness. DNA barcoding often correctly assigned sphingid moths to higher taxa when species matches were unavailable, suggesting that barcode reference libraries can be useful for higher taxon assignments long before they achieve complete species coverage.
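
The simulation design described above (subsample the reference library, exclude the query's own species, then attempt a higher-taxon assignment) can be sketched roughly as follows. This is not the study's method: the six criteria are tree-based and are not defined in the abstract, so a nearest-neighbour distance threshold is substituted here purely for illustration. All species names, sequences and thresholds are made up.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def subsample_library(library, query_species, completeness, rng):
    """Keep a random fraction of species, always excluding the query's own species."""
    candidates = sorted(sp for sp in library if sp != query_species)
    n_keep = max(1, round(completeness * len(candidates)))
    return {sp: library[sp] for sp in rng.sample(candidates, n_keep)}


def assign_genus(query_seq, library, max_distance):
    """Return the genus of the nearest reference, or None (ambiguous) if it is too far.

    A smaller max_distance plays the role of a stricter criterion (an assumption,
    standing in for the tree-based criteria of the study).
    """
    nearest = min(library, key=lambda sp: p_distance(query_seq, library[sp][1]))
    genus, seq = library[nearest]
    return genus if p_distance(query_seq, seq) <= max_distance else None


# Hypothetical toy library: species -> (genus, aligned barcode fragment)
library = {
    "Genus_a sp1": ("Genus_a", "ACGTACGTAC"),
    "Genus_a sp2": ("Genus_a", "ACGTACGTAA"),
    "Genus_b sp1": ("Genus_b", "TTGTACCTAC"),
    "Genus_b sp2": ("Genus_b", "TTGTACCTAA"),
}
query_species, query_seq = "Genus_a sp1", "ACGTACGTAC"

reference = subsample_library(library, query_species, 1.0, random.Random(0))
print(assign_genus(query_seq, reference, max_distance=0.15))  # liberal threshold
print(assign_genus(query_seq, reference, max_distance=0.05))  # strict threshold
```

Repeating the subsampling at several completeness levels and tallying true, false and ambiguous assignments would reproduce the general shape of the experiment, though not its tree-based criteria.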


Subject(s)
DNA Barcoding, Taxonomic/methods , Moths/classification , Moths/genetics , Animals , Base Sequence , DNA/genetics , DNA Barcoding, Taxonomic/instrumentation , Gene Library , Molecular Sequence Data , Phylogeny
3.
Microbiol Resour Announc ; 10(1), 2021 Jan 07.
Article in English | MEDLINE | ID: mdl-33414281

ABSTRACT

Here, we report the complete genome sequences for 36 Canadian isolates of Salmonella enterica subsp. enterica serovar Typhimurium and its monophasic variant I 1,4,[5]:12:i:- from both clinical and animal sources. These genome sequences will provide useful references for understanding the genetic variation within this prominent serotype.

4.
Microb Genom ; 7(9), 2021 09.
Article in English | MEDLINE | ID: mdl-34554082

ABSTRACT

Hierarchical genotyping approaches can provide insights into the source, geography and temporal distribution of bacterial pathogens. Multiple hierarchical SNP genotyping schemes have previously been developed so that new isolates can rapidly be placed within pre-computed population structures, without the need to rebuild phylogenetic trees for the entire dataset. This classification approach has, however, seen limited uptake in routine public health settings due to analytical complexity and the lack of standardized tools that provide clear and easy ways to interpret results. The BioHansel tool was developed to provide an organism-agnostic tool for hierarchical SNP-based genotyping. The tool identifies split k-mers that distinguish predefined lineages in whole genome sequencing (WGS) data using SNP-based genotyping schemes. BioHansel uses the Aho-Corasick algorithm to type isolates from assembled genomes or raw read sequence data in a matter of seconds, with limited computational resources. This makes BioHansel ideal for use by public health agencies that rely on WGS methods for surveillance of bacterial pathogens. Genotyping results are evaluated using a quality assurance module which identifies problematic samples, such as low-quality or contaminated datasets. Using existing hierarchical SNP schemes for Mycobacterium tuberculosis and Salmonella Typhi, we compare the genotyping results obtained with the k-mer-based tools BioHansel and SKA, with those of the organism-specific tools TBProfiler and genotyphi, which use gold-standard reference-mapping approaches. We show that the genotyping results are fully concordant across these different methods, and that the k-mer-based tools are significantly faster. We also test the ability of the BioHansel quality assurance module to detect intra-lineage contamination and demonstrate that it is effective, even in populations with low genetic diversity. We demonstrate the scalability of the tool using a dataset of ~8100 S. Typhi public genomes and provide the aggregated results of geographical distributions as part of the tool's output. BioHansel is an open source Python 3 application available on PyPI and Conda repositories and as a Galaxy tool from the public Galaxy Toolshed. In a public health context, BioHansel enables rapid and high-resolution classification of bacterial pathogens with low genetic diversity.
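
To illustrate the idea of typing a genome by matching many lineage-specific k-mers in a single pass, here is a small, self-contained Aho-Corasick matcher. It is a sketch only and does not reproduce BioHansel's code, scheme format or quality checks. The k-mers and lineage labels are invented, and the sketch ignores reverse-complement strands and read depth.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton from a {kmer: label} mapping."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # labels emitted when a state is reached
    for pattern, label in patterns.items():
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(label)
    # Breadth-first pass to fill in failure links and inherited outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def scan(sequence, automaton):
    """Return (end_position, label) for every pattern occurrence in the sequence."""
    goto, fail, out = automaton
    state, hits = 0, []
    for pos, ch in enumerate(sequence):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for label in out[state]:
            hits.append((pos, label))
    return hits


# Hypothetical scheme: each pair of k-mers differs only at the middle (SNP) base.
scheme = {
    "ACGATCA": "1",   "ACGGTCA": "2",
    "TTGCAAC": "1.1", "TTGTAAC": "1.2",
}
genome = "GGACGATCATTTGCAACGG"
hits = scan(genome, build_automaton(scheme))
print(hits)  # [(8, '1'), (16, '1.1')]
```

Because the automaton visits each base once regardless of how many k-mers the scheme holds, scan time grows with genome length rather than with scheme size, which is consistent with the speed claims in the abstract.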


Subject(s)
Bacteria/genetics , Bacterial Typing Techniques/methods , Genotyping Techniques/methods , Polymorphism, Single Nucleotide , Bacteria/classification , Bacteria/isolation & purification , Genetic Variation , Genome, Bacterial , Genotype , Molecular Epidemiology/methods , Mycobacterium tuberculosis/genetics , Phylogeny , Salmonella/genetics , Software , Whole Genome Sequencing
5.
Microb Genom ; 6(10), 2020 10.
Article in English | MEDLINE | ID: mdl-32969786

ABSTRACT

Bacterial plasmids play a large role in allowing bacteria to adapt to changing environments and can pose a significant risk to human health if they confer virulence and antimicrobial resistance (AMR). Plasmids differ significantly in the taxonomic breadth of host bacteria in which they can successfully replicate; this is commonly referred to as 'host range' and is usually described in qualitative terms of 'narrow' or 'broad'. Understanding the host range potential of plasmids is of great interest due to their ability to disseminate traits such as AMR through bacterial populations and into human pathogens. We developed the MOB-suite to facilitate characterization of plasmids and introduced a whole-sequence-based classification system based on clustering complete plasmid sequences using Mash distances (https://github.com/phac-nml/mob-suite). We updated the MOB-suite database from 12 091 to 23 671 complete sequences, representing 17 779 unique plasmids. With advances in new algorithms for rapidly calculating average nucleotide identity (ANI), we compared clustering characteristics using two different distance measures - Mash and ANI - and three clustering algorithms on the unique set of plasmids. The plasmid nomenclature is designed to group highly similar plasmids together that are unlikely to have multiple representatives within a single cell. Based on our results, we determined that clusters generated using Mash and complete-linkage clustering at a Mash distance of 0.06 resulted in highly homogeneous clusters while maintaining cluster size. The taxonomic distribution of plasmid biomarker sequences for replication and relaxase typing, in combination with MOB-suite whole-sequence-based clusters, has been examined in detail for all high-quality publicly available plasmid sequences.
We have incorporated prediction of plasmid replication host range into the MOB-suite based on observed distributions of these sequence features in combination with known plasmid hosts from the literature. Host range is reported as the highest taxonomic rank that covers all of the plasmids which share replicon or relaxase biomarkers or belong to the same MOB-suite cluster code. Reporting host range based on these criteria allows for comparisons of host range between studies and provides information for plasmid surveillance.
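
The two steps described above, complete-linkage clustering at a distance cutoff of 0.06 and reporting host range as the taxonomic rank that covers every plasmid in a group, can be sketched in plain Python. This is an illustrative reading of the abstract, not MOB-suite code. The plasmid names and pairwise distances are invented, and "covering rank" is interpreted here as the most specific rank shared by all observed hosts.

```python
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def complete_linkage(names, dist, threshold):
    """Agglomerative complete-linkage clustering, stopped at `threshold`.

    `dist` maps frozenset({a, b}) to the pairwise distance between a and b.
    """
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        # Complete linkage: distance between clusters is the largest pairwise distance.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def host_range(lineages):
    """Return (rank, taxon) for the most specific rank shared by all lineages, or None."""
    shared = None
    for depth, rank in enumerate(RANKS):
        taxa = {lineage[depth] for lineage in lineages}
        if len(taxa) != 1:
            break
        shared = (rank, taxa.pop())
    return shared


# Hypothetical plasmids with made-up pairwise distances.
dist = {
    frozenset(("pA", "pB")): 0.02,
    frozenset(("pA", "pC")): 0.30,
    frozenset(("pB", "pC")): 0.28,
}
hosts = {
    "pA": ["Pseudomonadota", "Gammaproteobacteria", "Enterobacterales",
           "Enterobacteriaceae", "Escherichia", "Escherichia coli"],
    "pB": ["Pseudomonadota", "Gammaproteobacteria", "Enterobacterales",
           "Enterobacteriaceae", "Salmonella", "Salmonella enterica"],
    "pC": ["Bacillota", "Bacilli", "Bacillales",
           "Staphylococcaceae", "Staphylococcus", "Staphylococcus aureus"],
}

clusters = complete_linkage(["pA", "pB", "pC"], dist, threshold=0.06)
for cluster in clusters:
    print(cluster, host_range([hosts[p] for p in cluster]))
```

Complete linkage guarantees that every pair within a cluster is within the cutoff, which fits the stated goal of highly homogeneous clusters.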


Subject(s)
Bacteria/genetics , Host Specificity/genetics , Plasmids/classification , Plasmids/genetics , Conjugation, Genetic/genetics , Databases, Genetic , Humans , Molecular Typing/methods
6.
Can Commun Dis Rep ; 46(6): 161-168, 2020 Jun 04.
Article in English | MEDLINE | ID: mdl-32673380

ABSTRACT

Natural language processing (NLP) is a subfield of artificial intelligence devoted to the understanding and generation of language. Recent advances in NLP technologies are enabling rapid analysis of vast amounts of text, thereby creating opportunities for health research and evidence-informed decision making. The analysis and data extraction from scientific literature, technical reports, health records, social media, surveys, registries and other documents can support core public health functions including the enhancement of existing surveillance systems (e.g. through faster identification of diseases and risk factors/at-risk populations), disease prevention strategies (e.g. through more efficient evaluation of the safety and effectiveness of interventions) and health promotion efforts (e.g. by providing the ability to obtain expert-level answers to any health-related question). NLP is emerging as an important tool that can assist public health authorities in decreasing the burden of health inequality/inequity in the population. The purpose of this paper is to provide some notable examples of both the potential applications and challenges of NLP use in public health.

8.
Biosystems ; 113(1): 9-27, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23603215

ABSTRACT

The explosion of available sequence data necessitates the development of sophisticated machine learning tools with which to analyze them. This study introduces a sequence-learning technology called side effect machines. It also applies a model of evolution which simulates the evolution of a ring species to the training of the side effect machines. A comparison is made between side effect machines evolved in the ring structure and side effect machines evolved using a standard evolutionary algorithm based on tournament selection. At the core of the training of side effect machines is a nearest neighbor classifier. A parameter study was performed to investigate the impact of the division of training data into examples for nearest neighbor assessment and training cases. The parameter study demonstrates that parameter setting was important in the baseline runs but had little impact in the ring-optimization runs. The ring optimization technique was also found to exhibit improved and more reliable training performance. Side effect machines are tested on two types of synthetic data, one based on GC-content and the other checking for the ability of side effect machines to recognize an embedded motif. Three types of biological data are used: a data set with different types of immune-system genes, a data set with normal and retro-virally derived human genomic sequence, and standard and nonstandard initiation regions from the cytochrome-oxidase subunit one in the mitochondrial genome.
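
The abstract does not define a side effect machine. The sketch below assumes one common formulation: a finite-state machine reads the sequence, a counter records how often each state is visited, and the normalized counts form a feature vector for a nearest neighbor classifier. In the study the machines are evolved, whereas the two-state machine here is written by hand so that it simply measures GC-content. All names and sequences are illustrative.

```python
def sem_features(sequence, transitions, start=0):
    """Run a finite-state machine over the sequence and count visits to each state."""
    counts = [0] * len(transitions)
    state = start
    for base in sequence:
        state = transitions[state][base]
        counts[state] += 1
    # Normalize by length so sequences of different lengths are comparable.
    return [c / len(sequence) for c in counts]


def nearest_neighbor(query, examples):
    """Return the label of the example closest to `query` (squared Euclidean distance)."""
    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return min(examples, key=lambda ex: squared_distance(query, ex[0]))[1]


# Hand-written machine (an assumption): state 1 after G or C, state 0 after A or T.
# The fraction of time spent in state 1 therefore equals the GC-content.
to_state = {"A": 0, "T": 0, "G": 1, "C": 1}
gc_machine = [dict(to_state), dict(to_state)]

examples = [
    (sem_features("GCGCGGCC", gc_machine), "GC-rich"),
    (sem_features("ATATTAAT", gc_machine), "AT-rich"),
]
query = sem_features("GCGCATGC", gc_machine)
print(query, nearest_neighbor(query, examples))
```

An evolutionary algorithm would search over the transition table, scoring each candidate machine by how well the nearest neighbor classifier separates the training cases.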


Subject(s)
Algorithms , Artificial Intelligence , DNA/genetics , Models, Genetic , Animals , Base Composition/genetics , Base Sequence , Computational Biology/methods , DNA/chemistry , DNA/classification , Evolution, Molecular , Genome, Mitochondrial/genetics , Humans , Major Histocompatibility Complex/genetics , Reproducibility of Results , Sequence Analysis, DNA/methods