Search | PAHO/WHO Uruguay

1.

Structurally divergent and recurrently mutated regions of primate genomes.

Mao, Yafei; Harvey, William T; Porubsky, David; Munson, Katherine M; Hoekzema, Kendra; Lewis, Alexandra P; Audano, Peter A; Rozanski, Allison; Yang, Xiangyu; Zhang, Shilong; Yoo, DongAhn; Gordon, David S; Fair, Tyler; Wei, Xiaoxi; Logsdon, Glennis A; Haukness, Marina; Dishuck, Philip C; Jeong, Hyeonsoo; Del Rosario, Ricardo; Bauer, Vanessa L; Fattor, Will T; Wilkerson, Gregory K; Mao, Yuxiang; Shi, Yongyong; Sun, Qiang; Lu, Qing; Paten, Benedict; Bakken, Trygve E; Pollen, Alex A; Feng, Guoping; Sawyer, Sara L; Warren, Wesley C; Carbone, Lucia; Eichler, Evan E.

Cell ; 187(6): 1547-1562.e13, 2024 Mar 14.

Article in English | MEDLINE | ID: mdl-38428424

ABSTRACT

Using multiple long-read sequencing technologies, we sequenced and assembled the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ~27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.

Subject(s)

Genome, Primates, Animals, Humans, Base Sequence, Primates/classification, Primates/genetics, Biological Evolution, DNA Sequence Analysis, Genomic Structural Variation

2.

Recurrent inversion polymorphisms in humans associate with genetic instability and genomic disorders.

Porubsky, David; Höps, Wolfram; Ashraf, Hufsah; Hsieh, PingHsun; Rodriguez-Martin, Bernardo; Yilmaz, Feyza; Ebler, Jana; Hallast, Pille; Maria Maggiolini, Flavia Angela; Harvey, William T; Henning, Barbara; Audano, Peter A; Gordon, David S; Ebert, Peter; Hasenfeld, Patrick; Benito, Eva; Zhu, Qihui; Lee, Charles; Antonacci, Francesca; Steinrücken, Matthias; Beck, Christine R; Sanders, Ashley D; Marschall, Tobias; Eichler, Evan E; Korbel, Jan O.

Cell ; 185(11): 1986-2005.e26, 2022 05 26.

Article in English | MEDLINE | ID: mdl-35525246

ABSTRACT

Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1 retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or retrotransposons. Since flanking repeats promote non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 10⁻⁴ per locus per generation. Recurrent inversions exhibit a sex-chromosomal bias and co-localize with genomic disorder critical regions. We propose that inversion recurrence results in an elevated number of heterozygous carriers and structural SD diversity, which increases mutability in the population and predisposes specific haplotypes to disease-causing CNVs.

Subject(s)

Chromosome Inversion, Genomic Segmental Duplications, Chromosome Inversion/genetics, DNA Copy Number Variations/genetics, Human Genome, Genomics, Humans

3.

The complete sequence and comparative analysis of ape sex chromosomes.

Makova, Kateryna D; Pickett, Brandon D; Harris, Robert S; Hartley, Gabrielle A; Cechova, Monika; Pal, Karol; Nurk, Sergey; Yoo, DongAhn; Li, Qiuhui; Hebbar, Prajna; McGrath, Barbara C; Antonacci, Francesca; Aubel, Margaux; Biddanda, Arjun; Borchers, Matthew; Bornberg-Bauer, Erich; Bouffard, Gerard G; Brooks, Shelise Y; Carbone, Lucia; Carrel, Laura; Carroll, Andrew; Chang, Pi-Chuan; Chin, Chen-Shan; Cook, Daniel E; Craig, Sarah J C; de Gennaro, Luciana; Diekhans, Mark; Dutra, Amalia; Garcia, Gage H; Grady, Patrick G S; Green, Richard E; Haddad, Diana; Hallast, Pille; Harvey, William T; Hickey, Glenn; Hillis, David A; Hoyt, Savannah J; Jeong, Hyeonsoo; Kamali, Kaivan; Pond, Sergei L Kosakovsky; LaPolice, Troy M; Lee, Charles; Lewis, Alexandra P; Loh, Yong-Hwee E; Masterson, Patrick; McGarvey, Kelly M; McCoy, Rajiv C; Medvedev, Paul; Miga, Karen H; Munson, Katherine M.

Nature ; 630(8016): 401-411, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38811727

ABSTRACT

Apes possess two sex chromosomes: the male-specific Y chromosome and the X chromosome, which is present in both males and females. The Y chromosome is crucial for male reproduction, with deletions being linked to infertility [1]. The X chromosome is vital for reproduction and cognition [2]. Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.

Subject(s)

Hominidae, X Chromosome, Y Chromosome, Animals, Female, Male, Gorilla gorilla/genetics, Hominidae/genetics, Hominidae/classification, Hylobatidae/genetics, Pan paniscus/genetics, Pan troglodytes/genetics, Phylogeny, Pongo abelii/genetics, Pongo pygmaeus/genetics, Telomere/genetics, X Chromosome/genetics, Y Chromosome/genetics, Molecular Evolution, DNA Copy Number Variations/genetics, Humans, Endangered Species, Reference Standards

4.

Increased mutation and gene conversion within human segmental duplications.

Vollger, Mitchell R; Dishuck, Philip C; Harvey, William T; DeWitt, William S; Guitart, Xavi; Goldberg, Michael E; Rozanski, Allison N; Lucas, Julian; Asri, Mobin; Munson, Katherine M; Lewis, Alexandra P; Hoekzema, Kendra; Logsdon, Glennis A; Porubsky, David; Paten, Benedict; Harris, Kelley; Hsieh, PingHsun; Eichler, Evan E.

Nature ; 617(7960): 325-334, 2023 05.

Article in English | MEDLINE | ID: mdl-37165237

ABSTRACT

Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data [1,2]. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions [3,4]. We find that human SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.3 megabase pairs of SD sequence converted on average per human haplotype. We develop a genome-wide map of IGC donors and acceptors, including 498 acceptor and 454 donor hotspots affecting the exons of about 800 protein-coding genes. These include 171 genes that have 'relocated' on average 1.61 megabase pairs in a subset of human haplotypes. Using a coalescent framework, we show that SD regions are slightly evolutionarily older when compared to unique sequences, probably owing to IGC. SNVs in SDs, however, show a distinct mutational spectrum: a 27.1% increase in transversions that convert cytosine to guanine or the reverse across all triplet contexts and a 7.6% reduction in the frequency of CpG-associated mutations when compared to unique DNA. We reason that these distinct mutational properties help to maintain an overall higher GC content of SD DNA compared to that of unique DNA, probably driven by GC-biased conversion between paralogous sequences [5,6].

Subject(s)

Gene Conversion, Mutation, Genomic Segmental Duplications, Humans, Gene Conversion/genetics, Human Genome/genetics, Single Nucleotide Polymorphism/genetics, Haplotypes/genetics, Exons/genetics, Cytosine/chemistry, Guanine/chemistry, CpG Islands/genetics

5.

Assembly of 43 human Y chromosomes reveals extensive complexity and variation.

Hallast, Pille; Ebert, Peter; Loftus, Mark; Yilmaz, Feyza; Audano, Peter A; Logsdon, Glennis A; Bonder, Marc Jan; Zhou, Weichen; Höps, Wolfram; Kim, Kwondo; Li, Chong; Hoyt, Savannah J; Dishuck, Philip C; Porubsky, David; Tsetsos, Fotios; Kwon, Jee Young; Zhu, Qihui; Munson, Katherine M; Hasenfeld, Patrick; Harvey, William T; Lewis, Alexandra P; Kordosky, Jennifer; Hoekzema, Kendra; O'Neill, Rachel J; Korbel, Jan O; Tyler-Smith, Chris; Eichler, Evan E; Shi, Xinghua; Beck, Christine R; Marschall, Tobias; Konkel, Miriam K; Lee, Charles.

Nature ; 621(7978): 355-364, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37612510

ABSTRACT

The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date [1] and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes [2]. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established [1] boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.

Subject(s)

Human Y Chromosomes, Molecular Evolution, Humans, Male, Human Y Chromosomes/genetics, Human Genome/genetics, Genomics, Mutation Rate, Phenotype, Euchromatin/genetics, Pseudogenes, Genetic Variation/genetics, Human X Chromosomes/genetics, Pseudoautosomal Regions/genetics

6.

The complete sequence of a human Y chromosome.

Rhie, Arang; Nurk, Sergey; Cechova, Monika; Hoyt, Savannah J; Taylor, Dylan J; Altemose, Nicolas; Hook, Paul W; Koren, Sergey; Rautiainen, Mikko; Alexandrov, Ivan A; Allen, Jamie; Asri, Mobin; Bzikadze, Andrey V; Chen, Nae-Chyun; Chin, Chen-Shan; Diekhans, Mark; Flicek, Paul; Formenti, Giulio; Fungtammasan, Arkarachai; Garcia Giron, Carlos; Garrison, Erik; Gershman, Ariel; Gerton, Jennifer L; Grady, Patrick G S; Guarracino, Andrea; Haggerty, Leanne; Halabian, Reza; Hansen, Nancy F; Harris, Robert; Hartley, Gabrielle A; Harvey, William T; Haukness, Marina; Heinz, Jakob; Hourlier, Thibaut; Hubley, Robert M; Hunt, Sarah E; Hwang, Stephen; Jain, Miten; Kesharwani, Rupesh K; Lewis, Alexandra P; Li, Heng; Logsdon, Glennis A; Lucas, Julian K; Makalowski, Wojciech; Markovic, Christopher; Martin, Fergal J; Mc Cartney, Ann M; McCoy, Rajiv C; McDaniel, Jennifer; McNulty, Brandy M.

Nature ; 621(7978): 344-354, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37612512

ABSTRACT

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.

Subject(s)

Human Y Chromosomes, Genomics, DNA Sequence Analysis, Humans, Base Sequence, Human Y Chromosomes/genetics, Satellite DNA/genetics, Genetic Variation/genetics, Population Genetics, Genomics/methods, Genomics/standards, Heterochromatin/genetics, Multigene Family/genetics, Reference Standards, Genomic Segmental Duplications/genetics, DNA Sequence Analysis/standards, Tandem Repeat Sequences/genetics, Telomere/genetics

7.

A draft human pangenome reference.

Liao, Wen-Wei; Asri, Mobin; Ebler, Jana; Doerr, Daniel; Haukness, Marina; Hickey, Glenn; Lu, Shuangjia; Lucas, Julian K; Monlong, Jean; Abel, Haley J; Buonaiuto, Silvia; Chang, Xian H; Cheng, Haoyu; Chu, Justin; Colonna, Vincenza; Eizenga, Jordan M; Feng, Xiaowen; Fischer, Christian; Fulton, Robert S; Garg, Shilpa; Groza, Cristian; Guarracino, Andrea; Harvey, William T; Heumos, Simon; Howe, Kerstin; Jain, Miten; Lu, Tsung-Yu; Markello, Charles; Martin, Fergal J; Mitchell, Matthew W; Munson, Katherine M; Mwaniki, Moses Njagi; Novak, Adam M; Olsen, Hugh E; Pesout, Trevor; Porubsky, David; Prins, Pjotr; Sibbesen, Jonas A; Sirén, Jouni; Tomlinson, Chad; Villani, Flavia; Vollger, Mitchell R; Antonacci-Fulton, Lucinda L; Baid, Gunjan; Baker, Carl A; Belyaeva, Anastasiya; Billis, Konstantinos; Carroll, Andrew; Chang, Pi-Chuan; Cody, Sarah.

Nature ; 617(7960): 312-324, 2023 05.

Article in English | MEDLINE | ID: mdl-37165242

ABSTRACT

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.

Subject(s)

Human Genome, Genomics, Humans, Diploidy, Human Genome/genetics, Haplotypes/genetics, DNA Sequence Analysis, Genomics/standards, Reference Standards, Cohort Studies, Alleles, Genetic Variation

8.

Structural and genetic diversity in the secreted mucins MUC5AC and MUC5B.

Plender, Elizabeth G; Prodanov, Timofey; Hsieh, PingHsun; Nizamis, Evangelos; Harvey, William T; Sulovari, Arvis; Munson, Katherine M; Kaufman, Eli J; O'Neal, Wanda K; Valdmanis, Paul N; Marschall, Tobias; Bloom, Jesse D; Eichler, Evan E.

Am J Hum Genet ; 111(8): 1700-1716, 2024 Aug 08.

Article in English | MEDLINE | ID: mdl-38991590

ABSTRACT

The secreted mucins MUC5AC and MUC5B are large glycoproteins that play critical defensive roles in pathogen entrapment and mucociliary clearance. Their respective genes contain polymorphic and degenerate protein-coding variable number tandem repeats (VNTRs) that make the loci difficult to investigate with short reads. We characterize the structural diversity of MUC5AC and MUC5B by long-read sequencing and assembly of 206 human and 20 nonhuman primate (NHP) haplotypes. We find that human MUC5B is largely invariant (5,761-5,762 amino acids [aa]); however, seven haplotypes have expanded VNTRs (6,291-7,019 aa). In contrast, 30 allelic variants of MUC5AC encode 16 distinct proteins (5,249-6,325 aa) with cysteine-rich domain and VNTR copy-number variation. We group MUC5AC alleles into three phylogenetic clades: H1 (46%, ~5,654 aa), H2 (33%, ~5,742 aa), and H3 (7%, ~6,325 aa). The two most common human MUC5AC variants are smaller than NHP gene models, suggesting a reduction in protein length during recent human evolution. Linkage disequilibrium and Tajima's D analyses reveal that East Asians carry exceptionally large blocks with an excess of rare variation (p < 0.05) at MUC5AC. To validate this result, we use Locityper for genotyping MUC5AC haplogroups in 2,600 unrelated samples from the 1000 Genomes Project. We observe a signature of positive selection in H1 among East Asians and a depletion of the likely ancestral haplogroup (H3). In Europeans, H3 alleles show an excess of common variation and deviate from Hardy-Weinberg equilibrium (p < 0.05), consistent with heterozygote advantage and balancing selection. This study provides a generalizable strategy to characterize complex protein-coding VNTRs for improved disease associations.

Subject(s)

Alleles, Genetic Variation, Haplotypes, Minisatellite Repeats, Mucin 5AC, Mucin 5B, Phylogeny, Humans, Mucin 5B/genetics, Animals, Mucin 5AC/genetics, Mucin 5AC/metabolism, Minisatellite Repeats/genetics, DNA Copy Number Variations, Primates/genetics
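
Note on entry 8: the abstract reports a deviation from Hardy-Weinberg equilibrium (p < 0.05) but does not say which test was used. As a minimal sketch of one common textbook approach, the chi-square goodness-of-fit test below compares observed genotype counts at a biallelic locus with the counts expected under equilibrium. The function name and the genotype counts are hypothetical and are not taken from the study.

```python
def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic (1 degree of freedom) for deviation from
    Hardy-Weinberg equilibrium at a biallelic locus."""
    n = n_aa + n_ab + n_bb
    # Allele frequency of A estimated from the observed genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    # Genotype counts expected under equilibrium: p^2, 2pq, q^2.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts with more heterozygotes than equilibrium predicts.
chi2 = hwe_chi_square(n_aa=30, n_ab=60, n_bb=10)
# 3.841 is the chi-square critical value for 1 degree of freedom at 0.05.
print(chi2, chi2 > 3.841)
```

With these made-up counts the expected genotype counts are 36, 48 and 16, so the observed heterozygote excess gives a statistic of about 6.25, above the 0.05 critical value. An excess of heterozygotes is the pattern the abstract describes as consistent with heterozygote advantage.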

9.

AGAP duplicons associate with structural diversity at Chromosome 10q11.22.

Fornezza, Stefania; Delvecchio, Vincenza Simona; Harvey, William T; Dishuck, Philip C; Eichler, Evan E; Giannuzzi, Giuliana.

Genome Res ; 2024 Oct 15.

Article in English | MEDLINE | ID: mdl-39322278

ABSTRACT

The 10q11.22 chromosomal region is a duplication-rich interval of the human genome and one of the last to be fully assembled. It carries copy number-variable genes associated with intellectual disability, bipolar disorder, and obesity. In this study, we characterized the structural diversity at this locus by analyzing 64 haploid assemblies produced by the Human Pangenome Reference Consortium. We identified 11 alternative haplotypes that differ in the copy number and/or orientation of large genomic segments, ranging from hundreds of kilobase pairs (kbp) to over one megabase pair (Mbp). We uncovered a 2.4 Mbp size difference between the shortest and longest haplotypes. Breakpoint analysis revealed that genomic instability results from nonallelic homologous recombination between segmental duplication (SD) pairs with varying similarity (94.4%-99.6%). Nonetheless, these pairs generally recombine at positions where their identity is higher (>99.6%). Recurrent inversions occur with different breakpoints within the same inverted SD pair. Inversion polymorphisms shuffle the entire SD arrangement, creating new predispositions to copy-number variations. The SD architecture is associated with a catarrhine-specific subgroup of the AGAP gene family, which likely triggered the accumulation of SDs at this locus over the past 25 million years of human evolution. Our results reveal extensive structural diversity and genomic instability at the 10q11.22 locus, and expand the general understanding of the mutational mechanisms behind SD-mediated rearrangements.

10.

The structure, function and evolution of a complete human chromosome 8.

Logsdon, Glennis A; Vollger, Mitchell R; Hsieh, PingHsun; Mao, Yafei; Liskovykh, Mikhail A; Koren, Sergey; Nurk, Sergey; Mercuri, Ludovica; Dishuck, Philip C; Rhie, Arang; de Lima, Leonardo G; Dvorkina, Tatiana; Porubsky, David; Harvey, William T; Mikheenko, Alla; Bzikadze, Andrey V; Kremitzki, Milinn; Graves-Lindsay, Tina A; Jain, Chirag; Hoekzema, Kendra; Murali, Shwetha C; Munson, Katherine M; Baker, Carl; Sorensen, Melanie; Lewis, Alexandra M; Surti, Urvashi; Gerton, Jennifer L; Larionov, Vladimir; Ventura, Mario; Miga, Karen H; Phillippy, Adam M; Eichler, Evan E.

Nature ; 593(7857): 101-107, 2021 05.

Article in English | MEDLINE | ID: mdl-33828295

ABSTRACT

The complete assembly of each human chromosome is essential for understanding human biology and evolution [1,2]. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.

Subject(s)

Human Chromosome Pair 8/chemistry, Human Chromosome Pair 8/genetics, Molecular Evolution, Animals, Cell Line, Centromere/chemistry, Centromere/genetics, Centromere/metabolism, Human Chromosome Pair 8/physiology, DNA Methylation, Satellite DNA/genetics, Genetic Epigenesis, Female, Humans, Macaca mulatta/genetics, Male, Minisatellite Repeats/genetics, Pan troglodytes/genetics, Phylogeny, Pongo abelii/genetics, Telomere/chemistry, Telomere/genetics, Telomere/metabolism

11.

A high-quality bonobo genome refines the analysis of hominid evolution.

Mao, Yafei; Catacchio, Claudia R; Hillier, LaDeana W; Porubsky, David; Li, Ruiyang; Sulovari, Arvis; Fernandes, Jason D; Montinaro, Francesco; Gordon, David S; Storer, Jessica M; Haukness, Marina; Fiddes, Ian T; Murali, Shwetha Canchi; Dishuck, Philip C; Hsieh, PingHsun; Harvey, William T; Audano, Peter A; Mercuri, Ludovica; Piccolo, Ilaria; Antonacci, Francesca; Munson, Katherine M; Lewis, Alexandra P; Baker, Carl; Underwood, Jason G; Hoekzema, Kendra; Huang, Tzu-Hsueh; Sorensen, Melanie; Walker, Jerilyn A; Hoffman, Jinna; Thibaud-Nissen, Françoise; Salama, Sofie R; Pang, Andy W C; Lee, Joyce; Hastie, Alex R; Paten, Benedict; Batzer, Mark A; Diekhans, Mark; Ventura, Mario; Eichler, Evan E.

Nature ; 594(7861): 77-81, 2021 06.

Article in English | MEDLINE | ID: mdl-33953399

ABSTRACT

The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation [1,2]. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes [1,3-5] and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.

Subject(s)

Molecular Evolution, Genome/genetics, Genomics, Pan paniscus/genetics, Phylogeny, Animals, Eukaryotic Initiation Factor-4A/genetics, Female, Genes, Gorilla gorilla/genetics, Molecular Sequence Annotation/standards, Pan troglodytes/genetics, Pongo/genetics, Genomic Segmental Duplications, DNA Sequence Analysis

12.

Whole-genome long-read sequencing downsampling and its effect on variant-calling precision and recall.

Harvey, William T; Ebert, Peter; Ebler, Jana; Audano, Peter A; Munson, Katherine M; Hoekzema, Kendra; Porubsky, David; Beck, Christine R; Marschall, Tobias; Garimella, Kiran; Eichler, Evan E.

Genome Res ; 33(12): 2029-2040, 2023 12 27.

Article in English | MEDLINE | ID: mdl-38190646

ABSTRACT

Advances in long-read sequencing (LRS) technologies continue to make whole-genome sequencing more complete, affordable, and accurate. LRS provides significant advantages over short-read sequencing approaches, including phased de novo genome assembly, access to previously excluded genomic regions, and discovery of more complex structural variants (SVs) associated with disease. Limitations remain with respect to cost, scalability, and platform-dependent read accuracy and the tradeoffs between sequence coverage and sensitivity of variant discovery are important experimental considerations for the application of LRS. We compare the genetic variant-calling precision and recall of Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) HiFi platforms over a range of sequence coverages. For read-based applications, LRS sensitivity begins to plateau around 12-fold coverage with a majority of variants called with reasonable accuracy (F1 score above 0.5), and both platforms perform well for SV detection. Genome assembly increases variant-calling precision and recall of SVs and indels in HiFi data sets with HiFi outperforming ONT in quality as measured by the F1 score of assembly-based variant call sets. While both technologies continue to evolve, our work offers guidance to design cost-effective experimental strategies that do not compromise on discovering novel biology.

Subject(s)

Genomics, Nanopores, INDEL Mutation, Whole Genome Sequencing
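
Note on entry 12: the abstract summarizes variant-calling quality with precision, recall and the F1 score (for example, "F1 score above 0.5"). As a minimal sketch of how these quantities relate, the snippet below computes them from true-positive, false-positive and false-negative counts. The function names and the counts are hypothetical and do not come from the paper.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from counts of true positives, false positives
    and false negatives; each is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical call set compared against a truth set.
p, r = precision_recall(tp=80, fp=20, fn=120)
print(round(p, 3), round(r, 3), round(f1_score(p, r), 3))
```

With these made-up counts, precision is 0.8, recall is 0.4 and F1 is about 0.533. Because F1 is a harmonic mean, it stays close to the lower of the two values, so a call set can only score well when precision and recall are both reasonably high.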

13.

Gaps and complex structurally variant loci in phased genome assemblies.

Porubsky, David; Vollger, Mitchell R; Harvey, William T; Rozanski, Allison N; Ebert, Peter; Hickey, Glenn; Hasenfeld, Patrick; Sanders, Ashley D; Stober, Catherine; Korbel, Jan O; Paten, Benedict; Marschall, Tobias; Eichler, Evan E.

Genome Res ; 33(4): 496-510, 2023 04.

Article in English | MEDLINE | ID: mdl-37164484

ABSTRACT

There has been tremendous progress in phased genome assembly production by combining long-read data with parental information or linked-read data. Nevertheless, a typical phased genome assembly generated by trio-hifiasm still generates more than 140 gaps. We perform a detailed analysis of gaps, assembly breaks, and misorientations from 182 haploid assemblies obtained from a diversity panel of 77 unique human samples. Although trio-based approaches using HiFi are the current gold standard, chromosome-wide phasing accuracy is comparable when using Strand-seq instead of parental data. Importantly, the majority of assembly gaps cluster near the largest and most identical repeats (including segmental duplications [35.4%], satellite DNA [22.3%], or regions enriched in GA/AT-rich DNA [27.4%]). Consequently, 1513 protein-coding genes overlap assembly gaps in at least one haplotype, and 231 are recurrently disrupted or missing from five or more haplotypes. Furthermore, we estimate that 6-7 Mbp of DNA are misorientated per haplotype irrespective of whether trio-free or trio-based approaches are used. Of these misorientations, 81% correspond to bona fide large inversion polymorphisms in the human species, most of which are flanked by large segmental duplications. We also identify large-scale alignment discontinuities consistent with 11.9 Mbp of deletions and 161.4 Mbp of insertions per haploid genome. Although 99% of this variation corresponds to satellite DNA, we identify 230 regions of euchromatic DNA with frequent expansions and contractions, nearly half of which overlap with 197 protein-coding genes. Such variable and incompletely assembled regions are important targets for future algorithmic development and pangenome representation.

Subject(s)

Satellite DNA, Genetic Polymorphism, Humans, Satellite DNA/genetics, Haplotypes, Genomic Segmental Duplications, DNA Sequence Analysis

14.

Familial long-read sequencing increases yield of de novo mutations.

Noyes, Michelle D; Harvey, William T; Porubsky, David; Sulovari, Arvis; Li, Ruiyang; Rose, Nicholas R; Audano, Peter A; Munson, Katherine M; Lewis, Alexandra P; Hoekzema, Kendra; Mantere, Tuomo; Graves-Lindsay, Tina A; Sanders, Ashley D; Goodwin, Sara; Kramer, Melissa; Mokrab, Younes; Zody, Michael C; Hoischen, Alexander; Korbel, Jan O; McCombie, W Richard; Eichler, Evan E.

Am J Hum Genet ; 109(4): 631-646, 2022 04 07.

Article in English | MEDLINE | ID: mdl-35290762

ABSTRACT

Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.

Subject(s)

Genomics , High-Throughput Nucleotide Sequencing , Female , Humans , Mutation/genetics , Nucleotides , DNA Sequence Analysis , Software

15.

A Bayesian approach to incorporate structural data into the mapping of genotype to antigenic phenotype of influenza A(H3N2) viruses.

Harvey, William T; Davies, Vinny; Daniels, Rodney S; Whittaker, Lynne; Gregory, Victoria; Hay, Alan J; Husmeier, Dirk; McCauley, John W; Reeve, Richard.

PLoS Comput Biol ; 19(3): e1010885, 2023 03.

Article in English | MEDLINE | ID: mdl-36972311

ABSTRACT

Surface antigens of pathogens are commonly targeted by vaccine-elicited antibodies but antigenic variability, notably in RNA viruses such as influenza, HIV and SARS-CoV-2, poses challenges for control by vaccination. For example, influenza A(H3N2) entered the human population in 1968 causing a pandemic and has since been monitored, along with other seasonal influenza viruses, for the emergence of antigenic drift variants through intensive global surveillance and laboratory characterisation. Statistical models of the relationship between genetic differences among viruses and their antigenic similarity provide useful information to inform vaccine development, though accurate identification of causative mutations is complicated by highly correlated genetic signals that arise due to the evolutionary process. Here, using a sparse hierarchical Bayesian analogue of an experimentally validated model for integrating genetic and antigenic data, we identify the genetic changes in influenza A(H3N2) virus that underpin antigenic drift. We show that incorporating protein structural data into variable selection helps resolve ambiguities arising due to correlated signals, with the proportion of variables representing haemagglutinin positions decisively included, or excluded, increased from 59.8% to 72.4%. The accuracy of variable selection judged by proximity to experimentally determined antigenic sites was improved simultaneously. Structure-guided variable selection thus improves confidence in the identification of genetic explanations of antigenic variation and we also show that prioritising the identification of causative mutations is not detrimental to the predictive capability of the analysis. Indeed, incorporating structural information into variable selection resulted in a model that could more accurately predict antigenic assay titres for phenotypically-uncharacterised virus from genetic sequence. Combined, these analyses have the potential to inform choices of reference viruses, the targeting of laboratory assays, and predictions of the evolutionary success of different genotypes, and can therefore be used to inform vaccine selection processes.

Subject(s)

COVID-19 , Influenza A Virus , Human Influenza , Humans , Human Influenza/prevention & control , Influenza A Virus H3N2 Subtype/genetics , Bayes Theorem , Influenza Virus Hemagglutinin Glycoproteins/genetics , SARS-CoV-2 , Viral Antigens/genetics , Genotype , Phenotype , Antiviral Antibodies/genetics

16.

Reduced neutralisation of the Delta (B.1.617.2) SARS-CoV-2 variant of concern following vaccination.

Davis, Chris; Logan, Nicola; Tyson, Grace; Orton, Richard; Harvey, William T; Perkins, Jonathan S; Mollett, Guy; Blacow, Rachel M; Peacock, Thomas P; Barclay, Wendy S; Cherepanov, Peter; Palmarini, Massimo; Murcia, Pablo R; Patel, Arvind H; Robertson, David L; Haughney, John; Thomson, Emma C; Willett, Brian J.

PLoS Pathog ; 17(12): e1010022, 2021 12.

Article in English | MEDLINE | ID: mdl-34855916

ABSTRACT

Vaccines are proving to be highly effective in controlling hospitalisation and deaths associated with SARS-CoV-2 infection but the emergence of viral variants with novel antigenic profiles threatens to diminish their efficacy. Assessment of the ability of sera from vaccine recipients to neutralise SARS-CoV-2 variants will inform the success of strategies for minimising COVID-19 cases and the design of effective antigenic formulations. Here, we examine the sensitivity of variants of concern (VOCs) representative of the B.1.617.1 and B.1.617.2 (first associated with infections in India) and B.1.351 (first associated with infection in South Africa) lineages of SARS-CoV-2 to neutralisation by sera from individuals vaccinated with the BNT162b2 (Pfizer/BioNTech) and ChAdOx1 (Oxford/AstraZeneca) vaccines. Across all vaccinated individuals, the spike glycoproteins from B.1.617.1 and B.1.617.2 conferred reductions in neutralisation of 4.31 and 5.11-fold respectively. The reduction seen with the B.1.617.2 lineage approached that conferred by the glycoprotein from the B.1.351 (South African) variant (6.29-fold reduction) that is known to be associated with reduced vaccine efficacy. Neutralising antibody titres elicited by vaccination with two doses of BNT162b2 were significantly higher than those elicited by vaccination with two doses of ChAdOx1. Following two doses of BNT162b2, neutralisation titres were reduced 7.77, 11.30 and 9.56-fold respectively against B.1.617.1, B.1.617.2 and B.1.351 pseudoviruses, the reduction in neutralisation of the Delta variant B.1.617.2 surpassing that of B.1.351. Fold changes in those vaccinated with two doses of ChAdOx1 were 0.69, 4.01 and 1.48 respectively. The accumulation of mutations in these VOCs, and others, demonstrates the quantifiable risk of antigenic drift and subsequent reduction in vaccine efficacy. Accordingly, booster vaccines based on updated variants are likely to be required over time to prevent productive infection. This study also suggests that two-dose regimes of vaccine are required for maximal BNT162b2 and ChAdOx1-induced immunity.

Subject(s)

Neutralizing Antibodies/immunology , Antiviral Antibodies/immunology , BNT162 Vaccine , COVID-19 , Secondary Immunization , SARS-CoV-2/immunology , Vaccine Efficacy , Antigenic Drift and Shift/immunology , BNT162 Vaccine/administration & dosage , BNT162 Vaccine/immunology , COVID-19/immunology , COVID-19/mortality , COVID-19/prevention & control , HEK293 Cells , Humans

17.

SARS-CoV-2 Evolution and Patient Immunological History Shape the Breadth and Potency of Antibody-Mediated Immunity.

Manali, Maria; Bissett, Laura A; Amat, Julien A R; Logan, Nicola; Scott, Sam; Hughes, Ellen C; Harvey, William T; Orton, Richard; Thomson, Emma C; Gunson, Rory N; Viana, Mafalda; Willett, Brian; Murcia, Pablo R.

J Infect Dis ; 227(1): 40-49, 2022 12 28.

Article in English | MEDLINE | ID: mdl-35920058

ABSTRACT

Since the emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), humans have been exposed to distinct SARS-CoV-2 antigens, either by infection with different variants, and/or vaccination. Population immunity is thus highly heterogeneous, but the impact of such heterogeneity on the effectiveness and breadth of the antibody-mediated response is unclear. We measured antibody-mediated neutralization responses against SARS-CoV-2Wuhan, SARS-CoV-2α, SARS-CoV-2δ, and SARS-CoV-2ο pseudoviruses using sera from patients with distinct immunological histories, including naive, vaccinated, infected with SARS-CoV-2Wuhan, SARS-CoV-2α, or SARS-CoV-2δ, and vaccinated/infected individuals. We show that the breadth and potency of the antibody-mediated response is influenced by the number, the variant, and the nature (infection or vaccination) of exposures, and that individuals with mixed immunity acquired by vaccination and natural exposure exhibit the broadest and most potent responses. Our results suggest that the interplay between host immunity and SARS-CoV-2 evolution will shape the antigenicity and subsequent transmission dynamics of SARS-CoV-2, with important implications for future vaccine design.

Neutralizing antibodies provide protection against viruses and are generated because of vaccination or prior infections. The main target of anti-SARS-CoV-2 neutralizing antibodies is a protein called spike, which decorates the viral particle and mediates viral entry into cells. As SARS-CoV-2 evolves, mutations accumulate in the spike protein, allowing the virus to escape antibody-mediated immunity and decreasing vaccine effectiveness. Multiple SARS-CoV-2 variants have appeared since the start of the COVID-19 pandemic, causing various waves of infection through the population and infecting, in some cases, people that had been previously infected or vaccinated. Because the antibody response is highly specific, individuals infected with different variants are likely to have different repertoires of neutralizing antibodies. We studied the breadth and potency of the antibody-mediated response against different SARS-CoV-2 variants using sera from vaccinated people as well as from people infected with different variants. We show that potency of the antibody response against different SARS-CoV-2 variants depends on the particular variant that infected each person, the exposure type (infection or vaccination) and the number and order of exposures. Our study provides insight into the interplay between virus evolution and immunity, as well as important information for the development of better vaccination strategies.

Subject(s)

COVID-19 , SARS-CoV-2 , Humans , Antibodies , Vaccination , Antiviral Antibodies , Neutralizing Antibodies , Coronavirus Spike Glycoprotein

18.

The SARS-CoV-2 Spike Protein Mutation Explorer: using an interactive application to improve the public understanding of SARS-CoV-2 variants of concern.

Iannucci, Sarah; Harvey, William T; Hughes, Joseph; Robertson, David L; Poyade, Matthieu; Hutchinson, Edward.

J Vis Commun Med ; 46(3): 122-132, 2023 Jul.

Article in English | MEDLINE | ID: mdl-37526402

ABSTRACT

Due to the COVID-19 pandemic the virus responsible, SARS-CoV-2, became a source of intense interest for non-expert audiences. The viral spike protein gained particular public interest as the main target for protective immune responses, including those elicited by vaccines. The rapid evolution of SARS-CoV-2 resulted in variations in the spike that enhanced transmissibility or weakened vaccine protection. This created new variants of concern (VOCs). The emergence of VOCs was studied using viral sequence data which was shared through portals such as the online Mutation Explorer of the COVID-19 Genomics UK consortium (COG-UK/ME). This was designed for an expert audience, but the information it contained could be of general interest if suitably communicated. Visualisations, interactivity and animation can improve engagement and understanding of molecular biology topics, and so we developed a graphical educational resource, the SARS-CoV-2 Spike Protein Mutation Explorer (SSPME), which used interactive 3D molecular models and animations to explain the molecular biology underpinning VOCs. User testing showed that the SSPME had better usability and improved participant knowledge confidence and knowledge acquisition compared to COG-UK/ME. This demonstrates how interactive visualisations can be used for effective molecular biology communication, as well as improving the public understanding of SARS-CoV-2 VOCs.

Subject(s)

COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Coronavirus Spike Glycoprotein/genetics , Pandemics , Mutation

19.

Genetic determinants of receptor-binding preference and zoonotic potential of H9N2 avian influenza viruses.

Peacock, Thomas P; Sealy, Joshua E; Harvey, William T; Benton, Donald J; Reeve, Richard; Iqbal, Munir.

J Virol ; 95(5), 2021 03 01.

Article in English | MEDLINE | ID: mdl-33268517

ABSTRACT

Receptor recognition and binding is the first step of viral infection and a key determinant of host specificity. The inability of avian influenza viruses to effectively bind human-like sialylated receptors is a major impediment to their efficient transmission in humans and pandemic capacity. Influenza H9N2 viruses are endemic in poultry across Asia and parts of Africa where they occasionally infect humans and are therefore considered viruses with zoonotic potential. We previously described H9N2 viruses, including several isolated from human zoonotic cases, showing a preference for human-like receptors. Here we take a mutagenesis approach, making viruses with single or multiple substitutions in H9 haemagglutinin and testing binding to avian and human receptor analogues using biolayer interferometry. We determine the genetic basis of preferences for alternative avian receptors and for human-like receptors, describing amino acid motifs at positions 190, 226 and 227 that play a major role in determining receptor specificity, and several other residues such as 159, 188, 193, 196, 198 and 225 that play a smaller role. Furthermore, we show changes at residues 135, 137, 147, 157, 158, 184, 188, and 192 can also modulate virus receptor avidity and that substitutions that increased or decreased the net positive charge around the haemagglutinin receptor-binding site show increases and decreases in avidity, respectively. The motifs we identify as increasing preference for the human-receptor will help guide future H9N2 surveillance efforts and facilitate our understanding of the emergence of influenza viruses with increased zoonotic potential.

IMPORTANCE As of 2020, over 60 infections of humans by H9N2 influenza viruses have been recorded in countries where the virus is endemic. Avian-like cellular receptors are the primary target for these viruses. However, given that human infections have been detected on an almost monthly basis since 2015, there may be a capacity for H9N2 viruses to evolve and gain the ability to target human-like cellular receptors. Here we identify molecular signatures that can cause viruses to bind human-like receptors, and we identify the molecular basis for the distinctive preference for sulphated receptors displayed by the majority of recent H9N2 viruses. This work will help guide future surveillance by providing markers that signify the emergence of viruses with enhanced zoonotic potential as well as improving understanding of the basis of influenza virus receptor-binding.

20.

The Microbial Genomes Atlas (MiGA) webserver: taxonomic and gene diversity analysis of Archaea and Bacteria at the whole genome level.

Rodriguez-R, Luis M; Gunturu, Santosh; Harvey, William T; Rosselló-Mora, Ramon; Tiedje, James M; Cole, James R; Konstantinidis, Konstantinos T.

Nucleic Acids Res ; 46(W1): W282-W288, 2018 07 02.

Article in English | MEDLINE | ID: mdl-29905870

ABSTRACT

The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.

Subject(s)

Genomics , Internet , 16S Ribosomal RNA/genetics , Software , Classification , Genetic Variation/genetics , Archaeal Genome/genetics , Bacterial Genome/genetics , Phylogeny
