Results 1 - 20 of 457
1.
Genome Res ; 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39107043

ABSTRACT

TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 non-human primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on Chromosome 17. We find that all human copy number variation maps to two distinct clusters located at Chromosome 17q12 and that humans are highly structurally variable at this locus, differing by as many as 20 copies and ~1 Mbp in length depending on haplotypes. We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Lastly, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.

2.
bioRxiv ; 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39149261

ABSTRACT

Using five complementary short- and long-read sequencing technologies, we phased and assembled >95% of each diploid human genome in a four-generation, 28-member family (CEPH 1463) allowing us to systematically assess de novo mutations (DNMs) and recombination. From this family, we estimate an average of 192 DNMs per generation, including 75.5 de novo single-nucleotide variants (SNVs), 7.4 non-tandem repeat indels, 79.6 de novo indels or structural variants (SVs) originating from tandem repeats, 7.7 centromeric de novo SVs and SNVs, and 12.4 de novo Y chromosome events per generation. STRs and VNTRs are the most mutable with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations, documenting de novo SVs, and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length, and sequence identity. We show a strong paternal bias (75-81%) for all forms of germline DNM, yet we estimate that 17% of de novo SNVs are postzygotic in origin with no paternal bias. We place all this variation in the context of a high-resolution recombination map (~3.5 kbp breakpoint resolution). We observe a strong maternal recombination bias (1.36 maternal:paternal ratio) with a consistent reduction in the number of crossovers with increasing paternal (r=0.85) and maternal (r=0.65) age. However, we observe no correlation between meiotic crossover locations and de novo SVs, arguing against non-allelic homologous recombination as a predominant mechanism. The use of multiple orthogonal technologies, near-telomere-to-telomere phased genome assemblies, and a multi-generation family to assess transmission has created the most comprehensive, publicly available "truth set" of all classes of genomic variants. The resource can be used to test and benchmark new algorithms and technologies to understand the most fundamental processes underlying human genetic variation.

3.
Am J Hum Genet ; 111(8): 1700-1716, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-38991590

ABSTRACT

The secreted mucins MUC5AC and MUC5B are large glycoproteins that play critical defensive roles in pathogen entrapment and mucociliary clearance. Their respective genes contain polymorphic and degenerate protein-coding variable number tandem repeats (VNTRs) that make the loci difficult to investigate with short reads. We characterize the structural diversity of MUC5AC and MUC5B by long-read sequencing and assembly of 206 human and 20 nonhuman primate (NHP) haplotypes. We find that human MUC5B is largely invariant (5,761-5,762 amino acids [aa]); however, seven haplotypes have expanded VNTRs (6,291-7,019 aa). In contrast, 30 allelic variants of MUC5AC encode 16 distinct proteins (5,249-6,325 aa) with cysteine-rich domain and VNTR copy-number variation. We group MUC5AC alleles into three phylogenetic clades: H1 (46%, ∼5,654 aa), H2 (33%, ∼5,742 aa), and H3 (7%, ∼6,325 aa). The two most common human MUC5AC variants are smaller than NHP gene models, suggesting a reduction in protein length during recent human evolution. Linkage disequilibrium and Tajima's D analyses reveal that East Asians carry exceptionally large blocks with an excess of rare variation (p < 0.05) at MUC5AC. To validate this result, we use Locityper for genotyping MUC5AC haplogroups in 2,600 unrelated samples from the 1000 Genomes Project. We observe a signature of positive selection in H1 among East Asians and a depletion of the likely ancestral haplogroup (H3). In Europeans, H3 alleles show an excess of common variation and deviate from Hardy-Weinberg equilibrium (p < 0.05), consistent with heterozygote advantage and balancing selection. This study provides a generalizable strategy to characterize complex protein-coding VNTRs for improved disease associations.


Subjects
Alleles, Genetic Variation, Haplotypes, Minisatellite Repeats, Mucin 5AC, Mucin-5B, Phylogeny, Humans, Mucin-5B/genetics, Animals, Mucin 5AC/genetics, Mucin 5AC/metabolism, Minisatellite Repeats/genetics, DNA Copy Number Variations, Primates/genetics
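
The abstract of entry 3 reports that H3 alleles in Europeans deviate from Hardy-Weinberg equilibrium (p < 0.05), consistent with heterozygote advantage. The abstract does not say which test was used, so the following is only a textbook sketch of a one-degree-of-freedom chi-square test for a biallelic locus. The function name `hwe_chi_square` and the genotype counts are hypothetical and are not taken from the study.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed genotype counts at a biallelic locus.
    The statistic has one degree of freedom; values above 3.841
    correspond to p < 0.05.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    # Expected counts under Hardy-Weinberg proportions p^2, 2pq, q^2.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts with an excess of heterozygotes (illustrative only).
statistic = hwe_chi_square(20, 60, 20)
print(statistic, statistic > 3.841)
```

An excess of heterozygotes relative to the 2pq expectation, as in the made-up counts above, is the pattern that the abstract interprets as consistent with balancing selection.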
4.
Autism Res ; 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39080977

ABSTRACT

This preliminary study sought to assess biomarkers of attention using electroencephalography (EEG) and eye tracking in two ultra-rare monogenic populations associated with autism spectrum disorder (ASD). Relative to idiopathic ASD (n = 12) and neurotypical comparison (n = 49) groups, divergent attention profiles were observed for the monogenic groups, such that individuals with DYRK1A (n = 9) exhibited diminished auditory attention condition differences during an oddball EEG paradigm whereas individuals with SCN2A (n = 5) exhibited diminished visual attention condition differences noted by eye gaze tracking when viewing social interactions. Findings provide initial support for alignment of auditory and visual attention markers in idiopathic ASD and neurotypical development but not monogenic groups. These results support ongoing efforts to develop translational ASD biomarkers within the attention domain.

5.
bioRxiv ; 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38895457

ABSTRACT

Segmental duplications (SDs) contribute significantly to human disease, evolution, and diversity yet have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies where the majority of SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms, we identify 173.2 Mbp of duplicated sequence (47.4 Mbp not present in the telomere-to-telomere reference) distinguishing fixed from structurally polymorphic events. We find that intrachromosomal SDs are among the most variable with rare events mapping near their progenitor sequences. African genomes harbor significantly more intrachromosomal SDs and are more likely to have recently duplicated gene families with higher copy number when compared to non-African samples. A comparison to a resource of 563 million full-length Iso-Seq reads identifies 201 novel, potentially protein-coding genes corresponding to these copy number polymorphic SDs.

6.
J Autism Dev Disord ; 2024 May 29.
Article in English | MEDLINE | ID: mdl-38809474

ABSTRACT

Specialized multidisciplinary supports are important for long-term outcomes for autistic youth. Although family and child factors predict service utilization in autism, little is known with respect to youth with rare, autism-associated genetic variants, who frequently have increased psychiatric, developmental, and behavioral needs. We investigate the impact of family factors on service utilization to determine whether caregiver (autistic features, education, income) and child (autistic features, sex, age, IQ, co-occurring conditions) factors predicted service type (e.g., speech, occupational, behavioral) and intensity (hours/year) among children with autism-associated variants (N = 125), some of whom also had a confirmed ASD diagnosis. Analyses revealed variability in the types of services used across a range of child demographic, behavioral, and mental health characteristics. Speech therapy was the most received service (87.2%). Importantly, behavior therapy was the least received service and post-hoc analyses revealed that use of this therapy was uniquely predicted by ASD diagnosis. However, once children received a particular service, there was largely comparable intensity of services, independent of caregiver and child factors. Findings suggest that demographic and clinical factors impact families' ability to obtain services, with less impact on the intensity of services received. The low receipt of therapies that specifically address core support needs in autism (i.e., behavior therapy) indicates more research is needed on the availability of these services for youth with autism-associated variants, particularly for those who do not meet criteria for an ASD diagnosis but do demonstrate elevated and impactful child autistic features as compared to the general population.

7.
Mol Autism ; 15(1): 22, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38790065

ABSTRACT

BACKGROUND: Social affective and communication symptoms are central to autism spectrum disorder (ASD), yet their severity differs across toddlers: Some toddlers with ASD display improving abilities across early ages and develop good social and language skills, while others with "profound" autism have persistently low social, language and cognitive skills and require lifelong care. The biological origins of these opposite ASD social severity subtypes and developmental trajectories are not known. METHODS: Because ASD involves early brain overgrowth and excess neurons, we measured size and growth in 4910 embryonic-stage brain cortical organoids (BCOs) from a total of 10 toddlers with ASD and 6 controls (averaging 196 individual BCOs measured/subject). In a 2021 batch, we measured BCOs from 10 ASD and 5 controls. In a 2022 batch, we tested replicability of BCO size and growth effects by generating and measuring an independent batch of BCOs from 6 ASD and 4 control subjects. BCO size was analyzed within the context of our large, one-of-a-kind social symptom, social attention, social brain and social and language psychometric normative datasets ranging from N = 266 to N = 1902 toddlers. BCO growth rates were examined by measuring size changes between 1 and 2 months of organoid development. Neurogenesis markers at 2 months were examined at the cellular level. At the molecular level, we measured activity and expression of Ndel1, which is a prime target for cell cycle-activated kinases, is known to regulate cell cycle, proliferation, neurogenesis, and growth, and is known to be involved in neuropsychiatric conditions. RESULTS: At the BCO level, analyses showed BCO size was significantly enlarged by 39% and 41% in ASD in the 2021 and 2022 batches. The larger the embryonic BCO size, the more severe the ASD social symptoms. Correlations between BCO size and social symptoms were r = 0.719 in the 2021 batch and r = 0.873 in the replication 2022 batch. ASD BCOs grew at an accelerated rate nearly 3 times faster than controls. At the cell level, the two largest ASD BCOs had accelerated neurogenesis. At the molecular level, Ndel1 activity was highly correlated with the growth rate and size of BCOs. Two BCO subtypes were found in ASD toddlers: Those in one subtype had very enlarged BCO size with accelerated rate of growth and neurogenesis; a profound autism clinical phenotype displaying severe social symptoms, reduced social attention, reduced cognitive, very low language and social IQ; and substantially altered growth in specific cortical social, language and sensory regions. Those in a second subtype had milder BCO enlargement and milder social, attention, cognitive, language and cortical differences. LIMITATIONS: Larger samples of ASD toddler-derived BCO and clinical phenotypes may reveal additional ASD embryonic subtypes. CONCLUSIONS: By embryogenesis, the biological bases of two subtypes of ASD social and brain development (profound autism and mild autism) are already present and measurable and involve dysregulated cell proliferation and accelerated neurogenesis and growth. The larger the embryonic BCO size in ASD, the more severe the toddler's social symptoms and the more reduced the social attention, language ability, and IQ, and the more atypical the growth of social and language brain regions.


Subjects
Autism Spectrum Disorder, Organoids, Humans, Autism Spectrum Disorder/pathology, Autism Spectrum Disorder/physiopathology, Organoids/pathology, Male, Female, Preschool Child, Cerebral Cortex/pathology, Social Behavior, Organ Size, Infant, Severity of Illness Index, Brain/pathology
8.
J Neurodev Disord ; 16(1): 15, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622540

ABSTRACT

BACKGROUND: Neurodevelopmental conditions such as intellectual disability (ID) and autism spectrum disorder (ASD) can stem from a broad array of inherited and de novo genetic differences, with marked physiological and behavioral impacts. We currently know little about the psychiatric phenotypes of rare genetic variants associated with ASD, despite heightened risk of psychiatric concerns in ASD more broadly. Understanding behavioral features of these variants can identify shared versus specific phenotypes across gene groups, facilitate mechanistic models, and provide prognostic insights to inform clinical practice. In this paper, we evaluate behavioral features within three gene groups associated with ID and ASD (ADNP, CHD8, and DYRK1A) with two aims: (1) characterize phenotypes across behavioral domains of anxiety, depression, ADHD, and challenging behavior; and (2) understand whether age and early developmental milestones are associated with later mental health outcomes. METHODS: Phenotypic data were obtained for youth with disruptive variants in ADNP, CHD8, or DYRK1A (N = 65, mean age = 8.7 years, 40% female) within a long-running, genetics-first study. Standardized caregiver-report measures of mental health features (anxiety, depression, attention-deficit/hyperactivity, oppositional behavior) and developmental history were extracted and analyzed for effects of gene group, age, and early developmental milestones on mental health features. RESULTS: Patterns of mental health features varied by group, with anxiety most prominent for CHD8, oppositional features overrepresented among ADNP, and attentional and depressive features most prominent for DYRK1A. For the full sample, age was positively associated with anxiety features, such that elevations in anxiety relative to same-age and same-sex peers may worsen with increasing age. Predictive utility of early developmental milestones was limited, with evidence of early language delays predicting greater difficulties across behavioral domains only for the CHD8 group. CONCLUSIONS: Despite shared associations with autism and intellectual disability, disruptive variants in ADNP, CHD8, and DYRK1A may yield variable psychiatric phenotypes among children and adolescents. With replication in larger samples over time, efforts such as these may contribute to improved clinical care for affected children and adolescents, allow for earlier identification of emerging mental health difficulties, and promote early intervention to alleviate concerns and improve quality of life.


Subjects
Autism Spectrum Disorder, Intellectual Disability, Neurodevelopmental Disorders, Adolescent, Child, Female, Humans, Male, Autism Spectrum Disorder/complications, DNA-Binding Proteins/genetics, Homeodomain Proteins/genetics, Intellectual Disability/genetics, Intellectual Disability/complications, Mental Health, Nerve Tissue Proteins/genetics, Neurodevelopmental Disorders/genetics, Neurodevelopmental Disorders/complications, Quality of Life, Transcription Factors/genetics
9.
bioRxiv ; 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38645259

ABSTRACT

The crab-eating macaques (Macaca fascicularis) and rhesus macaques (M. mulatta) are widely studied nonhuman primates in biomedical and evolutionary research. Despite their significance, the current understanding of the complex genomic structure in macaques and the differences between species requires substantial improvement. Here, we present a complete genome assembly of a crab-eating macaque and 20 haplotype-resolved macaque assemblies to investigate the complex regions and major genomic differences between species. Segmental duplication in macaques is ∼42% lower, while centromeres are ∼3.7 times longer than those in humans. The characterization of ∼2 Mbp fixed genetic variants and ∼240 Mbp complex loci highlights potential associations with metabolic differences between the two macaque species (e.g., CYP2C76 and EHBP1L1). Additionally, hundreds of alternative splicing differences show post-transcriptional regulation divergence between these two species (e.g., PNPO). We also characterize 91 large-scale genomic differences between macaques and humans at a single-base-pair resolution and highlight their impact on gene regulation in primate evolution (e.g., FOLH1 and PIEZO2). Finally, population genetics recapitulates macaque speciation and selective sweeps, highlighting potential genetic basis of reproduction and tail phenotype differences (e.g., STAB1, SEMA3F, and HOXD13). In summary, the integrated analysis of genetic variation and population genetics in macaques greatly enhances our comprehension of lineage-specific phenotypes, adaptation, and primate evolution, thereby improving their biomedical applications in human diseases.

10.
Nature ; 629(8010): 136-145, 2024 May.
Article in English | MEDLINE | ID: mdl-38570684

ABSTRACT

Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size [1]. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions [2,3]. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome [4,5]. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.


Subjects
Centromere, Molecular Evolution, Genetic Variation, Animals, Humans, Centromere/genetics, Centromere/metabolism, Centromere Protein A/metabolism, DNA Methylation/genetics, Satellite DNA/genetics, Kinetochores/metabolism, Macaca/genetics, Pan troglodytes/genetics, Single Nucleotide Polymorphism/genetics, Pongo/genetics, Male, Female, Reference Standards, Chromatin Immunoprecipitation, Haplotypes, Mutation, Gene Amplification, Sequence Alignment, Chromatin/genetics, Chromatin/metabolism, Species Specificity
11.
bioRxiv ; 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38654825

ABSTRACT

TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on chromosome 17. We find that most humans vary along two TBC1D3 clusters where human haplotypes are highly variable in copy number, differing by as many as 20 copies, and structure (structural heterozygosity 90%). We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Lastly, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.

12.
bioRxiv ; 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38562829

ABSTRACT

The secreted mucins MUC5AC and MUC5B play critical defensive roles in airway pathogen entrapment and mucociliary clearance by encoding large glycoproteins with variable number tandem repeats (VNTRs). These polymorphic and degenerate protein coding VNTRs make the loci difficult to investigate with short reads. We characterize the structural diversity of MUC5AC and MUC5B by long-read sequencing and assembly of 206 human and 20 nonhuman primate (NHP) haplotypes. We find that human MUC5B is largely invariant (5761-5762aa); however, seven haplotypes have expanded VNTRs (6291-7019aa). In contrast, 30 allelic variants of MUC5AC encode 16 distinct proteins (5249-6325aa) with cysteine-rich domain and VNTR copy number variation. We grouped MUC5AC alleles into three phylogenetic clades: H1 (46%, ~5654aa), H2 (33%, ~5742aa), and H3 (7%, ~6325aa). The two most common human MUC5AC variants are smaller than NHP gene models, suggesting a reduction in protein length during recent human evolution. Linkage disequilibrium (LD) and Tajima's D analyses reveal that East Asians carry exceptionally large MUC5AC LD blocks with an excess of rare variation (p<0.05). To validate this result, we used Locityper for genotyping MUC5AC haplogroups in 2,600 unrelated samples from the 1000 Genomes Project. We observed signatures of positive selection in H1 and H2 among East Asians and a depletion of the likely ancestral haplogroup (H3). In Africans and Europeans, H3 alleles show an excess of common variation and deviate from Hardy-Weinberg equilibrium, consistent with heterozygote advantage and balancing selection. This study provides a generalizable strategy to characterize complex protein coding VNTRs for improved disease associations.

13.
Genome Res ; 34(3): 454-468, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38627094

ABSTRACT

Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.


Subjects
Nanopores, Humans, DNA Sequence Analysis/methods, Nanopore Sequencing/methods, High-Throughput Nucleotide Sequencing/methods, Software, Genomics/methods
14.
bioRxiv ; 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38464314

ABSTRACT

Down syndrome is the most common form of human intellectual disability caused by precocious segregation and nondisjunction of chromosome 21. Differences in centromere structure have been hypothesized to play a potential role in this process in addition to the well-established risk of advancing maternal age. Using long-read sequencing, we completely sequenced and assembled the centromeres from a parent-child trio where Trisomy 21 arose in the child as a result of a meiosis I error. The proband carries three distinct chromosome 21 centromere haplotypes that vary by 11-fold in length, with both the largest (H1) and smallest (H2) originating from the mother. The longest H1 allele harbors a less clearly defined centromere dip region (CDR) as defined by CpG methylation and a significantly reduced signal by CENP-A chromatin immunoprecipitation sequencing when compared to H2 or paternal H3 centromeres. These epigenetic signatures suggest less competent kinetochore attachment for the maternally transmitted H1. Analysis of H1 in the mother indicates that the reduced CENP-A ChIP-seq signal, but not the CDR profile, pre-existed the meiotic nondisjunction event. A comparison of the three proband centromeres to a population sampling of 35 completely sequenced chromosome 21 centromeres shows that H2 is the smallest centromere sequenced to date and all three haplotypes (H1-H3) share a common origin of ~15 thousand years ago. These results suggest that recent asymmetry in size and epigenetic differences of chromosome 21 centromeres may contribute to nondisjunction risk.

15.
medRxiv ; 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38496498

ABSTRACT

Less than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
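
The read N50 quoted in the abstract above (54 kbp) is a length-weighted median: the length L such that reads at least L long together contain half or more of all sequenced bases. A minimal sketch of the calculation follows. The function name `n50` and the read lengths are illustrative assumptions and are not taken from the consortium's pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    Sort lengths from longest to shortest and accumulate them until the
    running total reaches at least half of the grand total; the length
    at which that happens is the N50.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up read lengths in base pairs (illustrative only).
reads = [2000, 3000, 4000, 5000, 6000]
print(n50(reads))
```

Because the statistic weights each read by its length, a handful of very long reads can raise the N50 well above the ordinary median read length.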

16.
bioRxiv ; 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38529499

ABSTRACT

Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de-novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de-novo haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio-phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.

17.
Cell ; 187(6): 1547-1562.e13, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38428424

ABSTRACT

We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ∼27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.


Subjects
Genome, Primates, Animals, Humans, Base Sequence, Primates/classification, Primates/genetics, Biological Evolution, DNA Sequence Analysis, Genomic Structural Variation
18.
Am J Med Genet A ; 194(6): e63514, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38329159

ABSTRACT

Genetics has become a critical component of medicine over the past five to six decades. Alongside genetics, a relatively new discipline, dysmorphology, has also begun to play an important role in providing critically important diagnoses to individuals and families. Both have become indispensable to unraveling rare diseases. Almost every medical specialty relies on individuals experienced in these specialties to provide diagnoses for patients who present themselves to other doctors. Additionally, both specialties have become reliant on molecular geneticists to identify genes associated with human disorders. Many of the medical geneticists, dysmorphologists, and molecular geneticists traveled a circuitous route before arriving at the position they occupied. The purpose of collecting the memoirs contained in this article was to convey to the reader that many of the individuals who contributed to the advancement of genetics and dysmorphology since the late 1960s/early 1970s traveled along a journey based on many chances taken, replying to the necessities they faced along the way before finding full enjoyment in the practice of medical and human genetics or dysmorphology. Additionally, and of equal importance, all exhibited an ability to evolve with their field of expertise as human genetics became human genomics with the development of novel technologies.


Subjects
Medical Genetics, Humans, 20th Century History, 21st Century History, Human Genetics
19.
Genetics ; 226(4), 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38298127

ABSTRACT

Short tandem repeats (STRs) are hotspots of genomic variability in the human germline because of their high mutation rates, which have long been attributed largely to polymerase slippage during DNA replication. This model suggests that STR mutation rates should scale linearly with a father's age, as progenitor cells continually divide after puberty. In contrast, it suggests that STR mutation rates should not scale with a mother's age at her child's conception, since oocytes spend a mother's reproductive years arrested in meiosis II and undergo a fixed number of cell divisions that are independent of the age at ovulation. Yet, mirroring recent findings, we find that STR mutation rates covary with paternal and maternal age, implying that some STR mutations are caused by DNA damage in quiescent cells rather than polymerase slippage in replicating progenitor cells. These results echo the recent finding that DNA damage in oocytes is a significant source of de novo single nucleotide variants and corroborate evidence of STR expansion in postmitotic cells. However, we find that the maternal age effect is not confined to known hotspots of oocyte mutagenesis, nor are postzygotic mutations likely to contribute significantly. STR nucleotide composition demonstrates divergent effects on de novo mutation (DNM) rates between sexes. Unlike the paternal lineage, maternally derived DNMs at A/T STRs display a significantly greater association with maternal age than DNMs at G/C-containing STRs. These observations may suggest the mechanism and developmental timing of certain STR mutations and contradict prior attribution of replication slippage as the primary mechanism of STR mutagenesis.


Subjects
Microsatellite Repeats, Mutation Rate, Humans, Female, Child, Mutation, Parents, Meiosis, Nucleotides
20.
Cell ; 187(5): 1024-1037, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38290514

ABSTRACT

This perspective focuses on advances in genome technology over the last 25 years and their impact on germline variant discovery within the field of human genetics. The field has witnessed tremendous technological advances from microarrays to short-read sequencing and now long-read sequencing. Each technology has provided genome-wide access to different classes of human genetic variation. We are now on the verge of comprehensive variant detection of all forms of variation for the first time with a single assay. We predict that this transition will further transform our understanding of human health and biology and, more importantly, provide novel insights into the dynamic mutational processes shaping our genomes.


Subjects
Genomic Structural Variation, Genomics, Humans, Genomics/methods, Germ-Line Mutation, Mutation, Technology