ABSTRACT
Using multiple long-read sequencing technologies, we sequenced and assembled the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ~27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.
Subject(s)
Genome , Primates , Animals , Humans , Base Sequence , Primates/classification , Primates/genetics , Biological Evolution , DNA Sequence Analysis , Genomic Structural Variation
ABSTRACT
Antarctic krill (Euphausia superba) is Earth's most abundant wild animal, and its enormous biomass is vital to the Southern Ocean ecosystem. Here, we report a 48.01-Gb chromosome-level Antarctic krill genome, whose large genome size appears to have resulted from inter-genic transposable element expansions. Our assembly reveals the molecular architecture of the Antarctic krill circadian clock and uncovers expanded gene families associated with molting and energy metabolism, providing insights into adaptations to the cold and highly seasonal Antarctic environment. Population-level genome re-sequencing from four geographical sites around the Antarctic continent reveals no clear population structure but highlights natural selection associated with environmental variables. An apparent drastic reduction in krill population size 10 million years ago and a subsequent rebound 100 thousand years ago coincide with climate change events. Our findings uncover the genomic basis of Antarctic krill adaptations to the Southern Ocean and provide valuable resources for future Antarctic research.
Subject(s)
Euphausiacea , Genome , Animals , Circadian Clocks/genetics , Ecosystem , Euphausiacea/genetics , Euphausiacea/physiology , Genomics , DNA Sequence Analysis , DNA Transposable Elements , Biological Evolution , Physiological Adaptation
ABSTRACT
Electrically conductive appendages from the anaerobic bacterium Geobacter sulfurreducens, recently identified as extracellular cytochrome nanowires (ECNs), have received wide attention due to numerous potential applications. However, whether other organisms employ similar ECNs for electron transfer remains unknown. Here, using cryoelectron microscopy, we describe the atomic structures of two ECNs from two major orders of hyperthermophilic archaea present in deep-sea hydrothermal vents and terrestrial hot springs. Homologs of Archaeoglobus veneficus ECN are widespread among mesophilic methane-oxidizing Methanoperedenaceae, alkane-degrading Syntrophoarchaeales archaea, and in the recently described megaplasmids called Borgs. The ECN protein subunits lack similarities in their folds; however, they share a common heme arrangement, suggesting an evolutionarily optimized heme packing for efficient electron transfer. The detection of ECNs in archaea suggests that filaments containing closely stacked hemes may be a common and widespread mechanism for long-range electron transfer in both prokaryotic domains of life.
Subject(s)
Nanowires , Cryoelectron Microscopy , Base Composition , Phylogeny , 16S Ribosomal RNA , DNA Sequence Analysis , Electron Transport , Cytochromes , Archaea , Heme
ABSTRACT
Synthetic genomics is the construction of viruses, bacteria, and eukaryotic cells with synthetic genomes. It involves two basic processes: synthesis of complete genomes or chromosomes and booting up of those synthetic nucleic acids to make viruses or living cells. The first synthetic genomics efforts resulted in the construction of viruses. This led to a revolution in viral reverse genetics and improvements in vaccine design and manufacture. The first bacterium with a synthetic genome led to construction of a minimal bacterial cell and recoded Escherichia coli strains able to incorporate multiple non-standard amino acids in proteins and resistant to phage infection. Further advances led to a yeast strain with a synthetic genome and new approaches for animal and plant artificial chromosomes. On the horizon there are dramatic advances in DNA synthesis that will enable extraordinary new opportunities in medicine, industry, agriculture, and research.
Subject(s)
Bacteriophages , Chromosomes , Animals , Bacteriophages/genetics , Chromosomes/genetics , Escherichia coli/genetics , Viral Genome , Genomics/methods , Saccharomyces cerevisiae/genetics , DNA Sequence Analysis , Synthetic Biology/methods
ABSTRACT
The Middle East region is important to understand human evolution and migrations but is underrepresented in genomic studies. Here, we generated 137 high-coverage physically phased genome sequences from eight Middle Eastern populations using linked-read sequencing. We found no genetic traces of early expansions out-of-Africa in present-day populations but found Arabians have elevated Basal Eurasian ancestry that dilutes their Neanderthal ancestry. Population sizes within the region started diverging 15-20 kya, when Levantines expanded while Arabians maintained smaller populations that derived ancestry from local hunter-gatherers. Arabians suffered a population bottleneck around the aridification of Arabia 6 kya, while Levantines had a distinct bottleneck overlapping the 4.2 kya aridification event. We found an association between movement and admixture of populations in the region and the spread of Semitic languages. Finally, we identify variants that show evidence of selection, including polygenic selection. Our results provide detailed insights into the genomic and selective histories of the Middle East.
Subject(s)
Population Genetics/history , Human Genome , Animals , Human Y Chromosomes/genetics , Genetic Databases , Gene Pool , Genetic Introgression , Geography , Ancient History , Human Migration , Humans , Middle East , Genetic Models , Neanderthals/genetics , Phylogeny , Population Density , Genetic Selection , DNA Sequence Analysis
ABSTRACT
Industrialization has impacted the human gut ecosystem, resulting in altered microbiome composition and diversity. Whether bacterial genomes may also adapt to the industrialization of their host populations remains largely unexplored. Here, we investigate the extent to which the rates and targets of horizontal gene transfer (HGT) vary across thousands of bacterial strains from 15 human populations spanning a range of industrialization. We show that HGTs have accumulated in the microbiome over recent host generations and that HGT occurs at high frequency within individuals. Comparison across human populations reveals that industrialized lifestyles are associated with higher HGT rates and that the functions of HGTs are related to the level of host industrialization. Our results suggest that gut bacteria continuously acquire new functionality based on host lifestyle and that high rates of HGT may be a recent development in human history linked to industrialization.
Subject(s)
Bacteria/genetics , Gastrointestinal Microbiome , Horizontal Gene Transfer , Bacteria/classification , Bacteria/isolation & purification , Bacterial DNA/chemistry , Bacterial DNA/isolation & purification , Bacterial DNA/metabolism , Feces/microbiology , Bacterial Genome , Humans , Phylogeny , Rural Population , DNA Sequence Analysis , Urban Population , Whole Genome Sequencing
ABSTRACT
There are many unanswered questions about the population history of the Central and South Central Andes, particularly regarding the impact of large-scale societies, such as the Moche, Wari, Tiwanaku, and Inca. We assembled genome-wide data on 89 individuals dating from ~9,000-500 years ago (BP), with a particular focus on the period of the rise and fall of state societies. Today's genetic structure began to develop by 5,800 BP, followed by bi-directional gene flow between the North and South Highlands, and between the Highlands and Coast. We detect minimal admixture among neighboring groups between ~2,000 and 500 BP, although we do detect cosmopolitanism (people of diverse ancestries living side-by-side) in the heartlands of the Tiwanaku and Inca polities. We also highlight cases of long-range mobility connecting the Andes to Argentina and the Northwest Andes to the Amazon Basin.
Subject(s)
Anthropology/methods , Ancient DNA/analysis , Gene Flow/genetics , Central America , Mitochondrial DNA/genetics , Gene Flow/physiology , Population Genetics/methods , Haplotypes , Humans , DNA Sequence Analysis , South America
ABSTRACT
Diffuse gliomas inevitably progress, but our understanding of the molecular events associated with recurrence is limited. Recent work from the Glioma Longitudinal Analysis (GLASS) consortium (Barthel et al., 2019) reports temporal DNA sequencing on a large cohort of primary and recurrent glioma pairs, establishing the evolutionary molecular characteristics of adult diffuse gliomas.
Subject(s)
Brain Neoplasms , Glioma , Adult , Cohort Studies , Humans , Local Neoplasm Recurrence , DNA Sequence Analysis
ABSTRACT
Comprehensive analysis of neuronal networks requires brain-wide measurement of connectivity, activity, and gene expression. Although high-throughput methods are available for mapping brain-wide activity and transcriptomes, comparable methods for mapping region-to-region connectivity remain slow and expensive because they require averaging across hundreds of brains. Here we describe BRICseq (brain-wide individual animal connectome sequencing), which leverages DNA barcoding and sequencing to map connectivity from single individuals in a few weeks and at low cost. Applying BRICseq to the mouse neocortex, we find that region-to-region connectivity provides a simple bridge relating transcriptome to activity: the spatial expression patterns of a few genes predict region-to-region connectivity, and connectivity predicts activity correlations. We also exploited BRICseq to map the mutant BTBR mouse brain, which lacks a corpus callosum, and recapitulated its known connectopathies. BRICseq allows individual laboratories to compare how age, sex, environment, genetics, and species affect neuronal wiring and to integrate these with functional activity and gene expression.
Subject(s)
Connectome , Gene Expression Regulation , Nerve Net/physiology , Neurons/physiology , DNA Sequence Analysis , Animals , Brain Mapping , Decision Making , Male , Inbred C57BL Mice , Neurologic Mutant Mice , Reproducibility of Results , Task Performance and Analysis
ABSTRACT
We report genome-wide DNA data for 73 individuals from five archaeological sites across the Bronze and Iron Ages Southern Levant. These individuals, who share the "Canaanite" material culture, can be modeled as descending from two sources: (1) earlier local Neolithic populations and (2) populations related to the Chalcolithic Zagros or the Bronze Age Caucasus. The non-local contribution increased over time, as evinced by three outliers who can be modeled as descendants of recent migrants. We show evidence that different "Canaanite" groups genetically resemble each other more than other populations. We find that Levant-related modern populations typically have substantial ancestry coming from populations related to the Chalcolithic Zagros and the Bronze Age Southern Levant. These groups also harbor ancestry from sources we cannot fully model with the available data, highlighting the critical role of post-Bronze-Age migrations into the region over the past 3,000 years.
Subject(s)
Ancient DNA/analysis , Ethnicity/genetics , Gene Flow/genetics , Archaeology/methods , Mitochondrial DNA/genetics , Ethnicity/history , Gene Flow/physiology , Genetic Variation/genetics , Population Genetics/methods , Human Genome/genetics , Genomics/methods , Haplotypes , Ancient History , Human Migration/history , Humans , Mediterranean Region , Middle East , DNA Sequence Analysis
ABSTRACT
Here, we report genome-wide data analyses from 110 ancient Near Eastern individuals spanning the Late Neolithic to Late Bronze Age, a period characterized by intense interregional interactions for the Near East. We find that 6th millennium BCE populations of North/Central Anatolia and the Southern Caucasus shared mixed ancestry on a genetic cline that formed during the Neolithic between Western Anatolia and regions in today's Southern Caucasus/Zagros. During the Late Chalcolithic and/or the Early Bronze Age, more than half of the Northern Levantine gene pool was replaced, while in the rest of Anatolia and the Southern Caucasus, we document genetic continuity with only transient gene flow. Additionally, we reveal a genetically distinct individual within the Late Bronze Age Northern Levant. Overall, our study uncovers multiple scales of population dynamics through time, from extensive admixture during the Neolithic period to long-distance mobility within the globalized societies of the Late Bronze Age.
Subject(s)
Ancient DNA/analysis , Ethnicity/genetics , Gene Flow/genetics , Archaeology/methods , Mitochondrial DNA/genetics , Ethnicity/history , Gene Flow/physiology , Genetic Variation/genetics , Population Genetics/methods , Human Genome/genetics , Genomics/methods , Haplotypes , Ancient History , Human Migration/history , Humans , Mediterranean Region , Middle East , DNA Sequence Analysis
ABSTRACT
Although complex inflammatory-like alterations are observed around the amyloid plaques of Alzheimer's disease (AD), little is known about the molecular changes and cellular interactions that characterize this response. We investigate here, in an AD mouse model, the transcriptional changes occurring in tissue domains in a 100-µm diameter around amyloid plaques using spatial transcriptomics. We demonstrate early alterations in a gene co-expression network enriched for myelin and oligodendrocyte genes (OLIGs), whereas a multicellular gene co-expression network of plaque-induced genes (PIGs) involving the complement system, oxidative stress, lysosomes, and inflammation is prominent in the later phase of the disease. We confirm the majority of the observed alterations at the cellular level using in situ sequencing on mouse and human brain sections. Genome-wide spatial transcriptomics analysis provides an unprecedented approach to untangle the dysregulated cellular network in the vicinity of pathogenic hallmarks of AD and other brain diseases.
Subject(s)
Alzheimer Disease/pathology , DNA Sequence Analysis/methods , Transcriptome , Alzheimer Disease/genetics , Amyloid/metabolism , Amyloid beta-Peptides/genetics , Amyloid beta-Peptides/metabolism , Animals , Brain/metabolism , Brain/pathology , Complement System Proteins/genetics , Complement System Proteins/metabolism , Animal Disease Models , Gene Expression Profiling , Humans , Lysosomes/genetics , Lysosomes/metabolism , Male , Mice , Inbred C57BL Mice , Transgenic Mice , Myelin Sheath/genetics , Myelin Sheath/metabolism , Oxidative Stress/genetics
ABSTRACT
Pathogenic autoantibodies arise in many autoimmune diseases, but it is not understood how the cells making them evade immune checkpoints. Here, single-cell multi-omics analysis demonstrates a shared mechanism with lymphoid malignancy in the formation of public rheumatoid factor autoantibodies responsible for mixed cryoglobulinemic vasculitis. By combining single-cell DNA and RNA sequencing with serum antibody peptide sequencing and antibody synthesis, rare circulating B lymphocytes making pathogenic autoantibodies were found to comprise clonal trees accumulating mutations. Lymphoma driver mutations in genes regulating B cell proliferation and V(D)J mutation (CARD11, TNFAIP3, CCND3, ID3, BTG2, and KLHL6) were present in rogue B cells producing the pathogenic autoantibody. Antibody V(D)J mutations conferred pathogenicity by causing the antigen-bound autoantibodies to undergo phase transition to insoluble aggregates at lower temperatures. These results reveal a pre-neoplastic stage in human lymphomagenesis and a cascade of somatic mutations leading to an iconic pathogenic autoantibody.
Subject(s)
Autoantibodies/genetics , Autoimmune Diseases/genetics , B-Lymphocytes/immunology , Lymphoma/genetics , Animals , Autoantibodies/immunology , Autoimmune Diseases/immunology , Autoimmune Diseases/pathology , B-Lymphocytes/pathology , CARD Signaling Adaptor Proteins/genetics , Carrier Proteins/genetics , Clonal Evolution/genetics , Clonal Evolution/immunology , Cyclin D3/genetics , Guanylate Cyclase/genetics , Humans , Immediate-Early Proteins/genetics , Immunoglobulin Variable Region/genetics , Immunoglobulin Variable Region/immunology , Inhibitor of Differentiation Proteins/genetics , Lymphoma/immunology , Lymphoma/pathology , Mice , Mutation/genetics , Mutation/immunology , Neoplasm Proteins/genetics , DNA Sequence Analysis/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Tumor Necrosis Factor alpha-Induced Protein 3/genetics , Tumor Suppressor Proteins/genetics , V(D)J Recombination/genetics
ABSTRACT
Affordable genome sequencing technologies promise to revolutionize the field of human genetics by enabling comprehensive studies that interrogate all classes of genome variation, genome-wide, across the entire allele frequency spectrum. Ongoing projects worldwide are sequencing many thousands, and soon millions, of human genomes as part of various gene mapping studies, biobanking efforts, and clinical programs. However, while genome sequencing data production has become routine, genome analysis and interpretation remain challenging endeavors with many limitations and caveats. Here, we review the current state of technologies for genetic variant discovery, genotyping, and functional interpretation and discuss the prospects for future advances. We focus on germline variants discovered by whole-genome sequencing, genome-wide functional genomic approaches for predicting and measuring variant functional effects, and implications for studies of common and rare human disease.
Subject(s)
Genetic Variation/genetics , Human Genome/genetics , DNA Sequence Analysis/trends , Biological Specimen Banks , Chromosome Mapping/methods , Genetic Predisposition to Disease/genetics , Genetic Testing/trends , Genome-Wide Association Study , Genomics/methods , Genomics/trends , High-Throughput Nucleotide Sequencing/methods , Human Genome Project , Humans , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis/methods , Whole Genome Sequencing/methods , Whole Genome Sequencing/trends
ABSTRACT
The body-wide human microbiome plays a role in health, but its full diversity remains uncharacterized, particularly outside of the gut and in international populations. We leveraged 9,428 metagenomes to reconstruct 154,723 microbial genomes (45% of high quality) spanning body sites, ages, countries, and lifestyles. We recapitulated 4,930 species-level genome bins (SGBs), 77% without genomes in public repositories (unknown SGBs [uSGBs]). uSGBs are prevalent (in 93% of well-assembled samples), expand underrepresented phyla, and are enriched in non-Westernized populations (40% of the total SGBs). We annotated 2.85 M genes in SGBs, many associated with conditions including infant development (94,000) or Westernization (106,000). SGBs and uSGBs permit deeper microbiome analyses and increase the average mappability of metagenomic reads from 67.76% to 87.51% in the gut (median 94.26%) and 65.14% to 82.34% in the mouth. We thus identify thousands of microbial genomes from yet-to-be-named species, expand the pangenomes of human-associated microbes, and allow better exploitation of metagenomic technologies.
Subject(s)
Metagenome/genetics , Metagenomics/methods , Microbiota/genetics , Big Data , Genetic Variation/genetics , Geography , Humans , Life Style , Phylogeny , DNA Sequence Analysis/methods
ABSTRACT
In order to provide a comprehensive resource for human structural variants (SVs), we generated long-read sequence data and analyzed SVs for fifteen human genomes. We sequence resolved 99,604 insertions, deletions, and inversions including 2,238 (1.6 Mbp) that are shared among all discovery genomes with an additional 13,053 (6.9 Mbp) present in the majority, indicating minor alleles or errors in the reference. Genotyping in 440 additional genomes confirms the most common SVs in unique euchromatin are now sequence resolved. We report a ninefold SV bias toward the last 5 Mbp of human chromosomes with nearly 55% of all VNTRs (variable number of tandem repeats) mapping to this portion of the genome. We identify SVs affecting coding and noncoding regulatory loci improving annotation and interpretation of functional variation. These data provide the framework to construct a canonical human reference and a resource for developing advanced representations capable of capturing allelic diversity.
Subject(s)
Gene Frequency/genetics , Human Genome/genetics , Genomic Structural Variation/genetics , Alleles , Euchromatin/genetics , Genomics/methods , Humans , Minisatellite Repeats/genetics , DNA Sequence Analysis/methods
ABSTRACT
DNA rearrangements resulting in human genome structural variants (SVs) are caused by diverse mutational mechanisms. We used long- and short-read sequencing technologies to investigate end products of de novo chromosome 17p11.2 rearrangements and query the molecular mechanisms underlying both recurrent and non-recurrent events. Evidence for an increased rate of clustered single-nucleotide variant (SNV) mutation in cis with non-recurrent rearrangements was found. Indel and SNV formation are associated with both copy-number gains and losses of 17p11.2, occur up to ~1 Mb away from the breakpoint junctions, and favor C > G transversion substitutions; results suggest that single-stranded DNA is formed during the genesis of the SV and provide compelling support for a microhomology-mediated break-induced replication (MMBIR) mechanism for SV formation. Our data show an additional mutational burden of MMBIR consisting of hypermutation confined to the locus and manifesting as SNVs and indels predominantly within genes.
Subject(s)
Human Chromosome Pair 17 , Mutation , Multiple Abnormalities/genetics , Chromosome Breakpoints , Chromosome Disorders/genetics , Chromosome Duplication/genetics , DNA Copy Number Variations , DNA Repair/genetics , DNA Replication , Gene Rearrangement , Human Genome , Genomic Structural Variation , Humans , INDEL Mutation , Genetic Models , Single Nucleotide Polymorphism , Genetic Recombination , DNA Sequence Analysis/methods , Smith-Magenis Syndrome/genetics
ABSTRACT
The introduction of exome sequencing in the clinic has sparked tremendous optimism for the future of rare disease diagnosis, and there is exciting opportunity to further leverage these advances. To provide diagnostic clarity to all of these patients, however, there is a critical need for the field to develop and implement strategies to understand the mechanisms underlying all rare diseases and translate these to clinical care.
Subject(s)
Exome Sequencing/trends , Rare Diseases/diagnosis , Translational Biomedical Research/methods , Exome , Genetic Testing , Human Genome/genetics , High-Throughput Nucleotide Sequencing/trends , Humans , Rare Diseases/genetics , DNA Sequence Analysis/methods , Exome Sequencing/methods
ABSTRACT
Here, we present Perturb-ATAC, a method that combines multiplexed CRISPR interference or knockout with genome-wide chromatin accessibility profiling in single cells based on the simultaneous detection of CRISPR guide RNAs and open chromatin sites by assay of transposase-accessible chromatin with sequencing (ATAC-seq). We applied Perturb-ATAC to transcription factors (TFs), chromatin-modifying factors, and noncoding RNAs (ncRNAs) in ~4,300 single cells, encompassing more than 63 genotype-phenotype relationships. Perturb-ATAC in human B lymphocytes uncovered regulators of chromatin accessibility, TF occupancy, and nucleosome positioning and identified a hierarchy of TFs that govern B cell state, variation, and disease-associated cis-regulatory elements. Perturb-ATAC in primary human epidermal cells revealed three sequential modules of cis-elements that specify keratinocyte fate. Combinatorial deletion of all pairs of these TFs uncovered their epistatic relationships and highlighted genomic co-localization as a basis for synergistic interactions. Thus, Perturb-ATAC is a powerful strategy to dissect gene regulatory networks in development and disease.
Subject(s)
Epigenomics/methods , Gene Regulatory Networks/genetics , Single-Cell Analysis/methods , Chromatin/genetics , Chromatin/metabolism , Chromatin Assembly and Disassembly/physiology , Clustered Regularly Interspaced Short Palindromic Repeats/physiology , Gene Regulatory Networks/physiology , High-Throughput Nucleotide Sequencing/methods , Humans , DNA Sequence Analysis/methods , Transcription Factors/metabolism
ABSTRACT
Metagenomic sequencing is revolutionizing the detection and characterization of microbial species, and a wide variety of software tools are available to perform taxonomic classification of these data. The fast pace of development of these tools and the complexity of metagenomic data make it important that researchers are able to benchmark their performance. Here, we review current approaches for metagenomic analysis and evaluate the performance of 20 metagenomic classifiers using simulated and experimental datasets. We describe the key metrics used to assess performance, offer a framework for the comparison of additional classifiers, and discuss the future of metagenomic data analysis.