Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 2 de 2
Filtrar
Más filtros

Banco de datos
Tipo de estudio
Tipo del documento
Intervalo de año de publicación
1.
Cell ; 184(13): 3376-3393.e17, 2021 06 24.
Artículo en Inglés | MEDLINE | ID: mdl-34043940

RESUMEN

We present a global atlas of 4,728 metagenomic samples from mass-transit systems in 60 cities over 3 years, representing the first systematic, worldwide catalog of the urban microbial ecosystem. This atlas provides an annotated, geospatial profile of microbial strains, functional characteristics, antimicrobial resistance (AMR) markers, and genetic elements, including 10,928 viruses, 1,302 bacteria, 2 archaea, and 838,532 CRISPR arrays not found in reference databases. We identified 4,246 known species of urban microorganisms and a consistent set of 31 species found in 97% of samples that were distinct from human commensal organisms. Profiles of AMR genes varied widely in type and density across cities. Cities showed distinct microbial taxonomic signatures that were driven by climate and geographic differences. These results constitute a high-resolution global metagenomic atlas that enables discovery of organisms and genes, highlights potential public health and forensic applications, and provides a culture-independent view of AMR burden in cities.


Asunto(s)
Farmacorresistencia Bacteriana/genética , Metagenómica , Microbiota/genética , Población Urbana , Biodiversidad , Bases de Datos Genéticas , Humanos
2.
Genome Biol ; 25(1): 83, 2024 04 02.
Artículo en Inglés | MEDLINE | ID: mdl-38566111

RESUMEN

BACKGROUND: The rise of large-scale multi-species genome sequencing projects promises to shed new light on how genomes encode gene regulatory instructions. To this end, new algorithms are needed that can leverage conservation to capture regulatory elements while accounting for their evolution. RESULTS: Here, we introduce species-aware DNA language models, which we trained on more than 800 species spanning over 500 million years of evolution. Investigating their ability to predict masked nucleotides from context, we show that DNA language models distinguish transcription factor and RNA-binding protein motifs from background non-coding sequence. Owing to their flexibility, DNA language models capture conserved regulatory elements over much further evolutionary distances than sequence alignment would allow. Remarkably, DNA language models reconstruct motif instances bound in vivo better than unbound ones and account for the evolution of motif sequences and their positional constraints, showing that these models capture functional high-order sequence and evolutionary context. We further show that species-aware training yields improved sequence representations for endogenous and MPRA-based gene expression prediction, as well as motif discovery. CONCLUSIONS: Collectively, these results demonstrate that species-aware DNA language models are a powerful, flexible, and scalable tool to integrate information from large compendia of highly diverged genomes.


Asunto(s)
ADN , Secuencias Reguladoras de Ácidos Nucleicos , Sitios de Unión , Alineación de Secuencia , Algoritmos , Secuencia Conservada/genética , Evolución Molecular
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA