Results 1 - 20 of 47
1.
Nat Biotechnol ; 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38671154

ABSTRACT

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits and are linked to over 60 disease phenotypes. However, they are often excluded from at-scale studies because of challenges with variant calling and representation, as well as a lack of a genome-wide standard. Here, to promote the development of TR methods, we created a catalog of TR regions and explored TR properties across 86 haplotype-resolved long-read human assemblies. We curated variants from the Genome in a Bottle (GIAB) HG002 individual to create a TR dataset to benchmark existing and future TR analysis methods. We also present an improved variant comparison method that handles variants greater than 4 bp in length and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ~24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 'truth-set' TR benchmark. We demonstrate the utility of this pipeline across short-read and long-read technologies.

2.
medRxiv ; 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38562723

ABSTRACT

Comprehending the mechanism behind human diseases with an established heritable component represents the forefront of personalized medicine. Nevertheless, numerous medically important genes are inaccurately represented in short-read sequencing data analysis due to their complexity and repetitiveness or the so-called 'dark regions' of the human genome. The advent of PacBio as a long-read platform has provided new insights, yet HiFi whole-genome sequencing (WGS) cost remains frequently prohibitive. We introduce a targeted sequencing and analysis framework, Twist Alliance Dark Genes Panel (TADGP), designed to offer phased variants across 389 medically important yet complex autosomal genes. We highlight TADGP accuracy across eleven control samples and compare it to WGS. This comparison demonstrates that TADGP achieves variant-calling accuracy comparable to HiFi-WGS data at a fraction of the cost, thus enabling scalability and broad applicability for studying rare diseases or complementing previously sequenced samples to gain insights into these complex genes. TADGP revealed several candidate variants across all cases and provided insight into LPA diversity when tested on samples from rare disease and cardiovascular disease cohorts. In both cohorts, we identified novel variants affecting individual disease-associated genes (e.g., IKZF1, KCNE1). Nevertheless, the annotation of the variants across these 389 medically important genes remains challenging due to their underrepresentation in ClinVar and gnomAD. Consequently, we also offer an annotation resource to enhance the evaluation and prioritization of these variants. Overall, we demonstrate that TADGP offers a cost-efficient and scalable approach to routinely assess the dark regions of the human genome with clinical relevance.

3.
Nat Rev Genet ; 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38467784

ABSTRACT

Short tandem repeats (STRs) are a class of repetitive elements, composed of tandem arrays of 1-6 base pair sequence motifs, that comprise a substantial fraction of the human genome. STR expansions can cause a wide range of neurological and neuromuscular conditions, known as repeat expansion disorders, whose age of onset, severity, penetrance and/or clinical phenotype are influenced by the length of the repeats and their sequence composition. The presence of non-canonical motifs, depending on the type, frequency and position within the repeat tract, can alter clinical outcomes by modifying somatic and intergenerational repeat stability, gene expression and mutant transcript-mediated and/or protein-mediated toxicities. Here, we review the diverse structural conformations of repeat expansions, technological advances for the characterization of changes in sequence composition, their clinical correlations and the impact on disease mechanisms.

4.
Nat Biotechnol ; 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38168995

ABSTRACT

Tandem repeat (TR) variation is associated with gene expression changes and numerous rare monogenic diseases. Although long-read sequencing provides accurate full-length sequences and methylation of TRs, there is still a need for computational methods to profile TRs across the genome. Here we introduce the Tandem Repeat Genotyping Tool (TRGT) and an accompanying TR database. TRGT determines the consensus sequences and methylation levels of specified TRs from PacBio HiFi sequencing data. It also reports reads that support each repeat allele. These reads can be subsequently visualized with a companion TR visualization tool. Assessing 937,122 TRs, TRGT showed a Mendelian concordance of 98.38%, allowing a single repeat unit difference. In six samples with known repeat expansions, TRGT detected all expansions while also identifying methylation signals and mosaicism and providing finer repeat length resolution than existing methods. Additionally, we released a database with allele sequences and methylation levels for 937,122 TRs across 100 genomes.

5.
bioRxiv ; 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37961319

ABSTRACT

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits, and are linked to over 60 disease phenotypes. However, their complexity often excludes them from at-scale studies due to challenges with variant calling, representation, and lack of a genome-wide standard. To promote TR methods development, we create a comprehensive catalog of TR regions and explore its properties across 86 samples. We then curate variants from the GIAB HG002 individual to create a tandem repeat benchmark. We also present a variant comparison method that handles small and large alleles and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ∼24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 TR benchmark. We work with the GIAB community to demonstrate the utility of this benchmark across short and long read technologies.

6.
bioRxiv ; 2023 Jun 30.
Article in English | MEDLINE | ID: mdl-37425777

ABSTRACT

The factors driving initiation of pathological expansion of tandem repeats remain largely unknown. Here, we assessed the FGF14-SCA27B (GAA)•(TTC) repeat locus in 2,530 individuals by long-read and Sanger sequencing and identified a 5'-flanking 17-bp deletion-insertion in 70.34% of alleles (3,463/4,923). This common sequence variation was present nearly exclusively on alleles with fewer than 30 GAA-pure repeats and was associated with enhanced meiotic stability of the repeat locus.

8.
Am J Hum Genet ; 110(2): 240-250, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36669496

ABSTRACT

Spinal muscular atrophy, a leading cause of early infant death, is caused by bi-allelic mutations of SMN1. Sequence analysis of SMN1 is challenging due to high sequence similarity with its paralog SMN2. Both genes have variable copy numbers across populations. Furthermore, without pedigree information, it is currently not possible to identify silent carriers (2+0) with two copies of SMN1 on one chromosome and zero copies on the other. We developed Paraphase, an informatics method that identifies full-length SMN1 and SMN2 haplotypes, determines the gene copy numbers, and calls phased variants using long-read PacBio HiFi data. The SMN1 and SMN2 copy-number calls by Paraphase are highly concordant with orthogonal methods (99.2% for SMN1 and 100% for SMN2). We applied Paraphase to 438 samples across 5 ethnic populations to conduct a population-wide haplotype analysis of these highly homologous genes. We identified major SMN1 and SMN2 haplogroups and characterized their co-segregation through pedigree-based analyses. We identified two SMN1 haplotypes that form a common two-copy SMN1 allele in African populations. Testing positive for these two haplotypes in an individual with two copies of SMN1 gives a silent carrier risk of 88.5%, which is significantly higher than the currently used marker (1.7%-3.0%). Extending beyond simple copy-number testing, Paraphase can detect pathogenic variants and enable potential haplotype-based screening of silent carriers through statistical phasing of haplotypes into alleles. Future analysis of larger population data will allow identification of more diverse haplotypes and genetic markers for silent carriers.


Subjects
Spinal Muscular Atrophy, Infant, Humans, Spinal Muscular Atrophy/genetics, Spinal Muscular Atrophy/diagnosis, Mutation, Gene Dosage, Pedigree, Sequence Analysis, Survival of Motor Neuron 1 Protein/genetics, Survival of Motor Neuron 2 Protein/genetics
9.
Nature ; 613(7942): 96-102, 2023 01.
Article in English | MEDLINE | ID: mdl-36517591

ABSTRACT

Expansion of a single repetitive DNA sequence, termed a tandem repeat (TR), is known to cause more than 50 diseases [1,2]. However, repeat expansions are often not explored beyond neurological and neurodegenerative disorders. In some cancers, mutations accumulate in short tracts of TRs, a phenomenon termed microsatellite instability; however, larger repeat expansions have not been systematically analysed in cancer [3-8]. Here we identified TR expansions in 2,622 cancer genomes spanning 29 cancer types. In seven cancer types, we found 160 recurrent repeat expansions (rREs), most of which (155/160) were subtype specific. We found that rREs were non-uniformly distributed in the genome with enrichment near candidate cis-regulatory elements, suggesting a potential role in gene regulation. One rRE, a GAAA-repeat expansion, located near a regulatory element in the first intron of UGT2B7 was detected in 34% of renal cell carcinoma samples and was validated by long-read DNA sequencing. Moreover, in preliminary experiments, treating cells that harbour this rRE with a GAAA-targeting molecule led to a dose-dependent decrease in cell proliferation. Overall, our results suggest that rREs may be an important but unexplored source of genetic variation in human cancer, and we provide a comprehensive catalogue for further study.


Subjects
DNA Repeat Expansion, Human Genome, Neoplasms, Humans, Base Sequence, DNA Repeat Expansion/genetics, Human Genome/genetics, Neoplasms/classification, Neoplasms/genetics, Neoplasms/pathology, DNA Sequence Analysis, Gene Expression Regulation, Transcriptional Regulatory Elements/genetics, Introns/genetics, Renal Cell Carcinoma/genetics, Renal Cell Carcinoma/pathology, Cell Proliferation/drug effects, Reproducibility of Results
10.
Am J Hum Genet ; 110(1): 105-119, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36493768

ABSTRACT

Adult-onset cerebellar ataxias are a group of neurodegenerative conditions that challenge both genetic discovery and molecular diagnosis. In this study, we identified an intronic (GAA) repeat expansion in fibroblast growth factor 14 (FGF14). Genetic analysis of 95 Australian individuals with adult-onset ataxia identified four (4.2%) with (GAA)>300 and a further nine individuals with (GAA)>250. PCR and long-read sequence analysis revealed these were pure (GAA) repeats. In comparison, no control subjects had (GAA)>300 and only 2/311 control individuals (0.6%) had a pure (GAA)>250. In a German validation cohort, 9/104 (8.7%) of affected individuals had (GAA)>335 and a further six had (GAA)>250, whereas 10/190 (5.3%) control subjects had (GAA)>250 but none were (GAA)>335. The combined data suggest (GAA)>335 are disease causing and fully penetrant (p = 6.0 × 10⁻⁸, OR = 72 [95% CI = 4.3-1,227]), while (GAA)>250 is likely pathogenic with reduced penetrance. Affected individuals had an adult-onset, slowly progressive cerebellar ataxia with variable features including vestibular impairment, hyper-reflexia, and autonomic dysfunction. A negative correlation between age at onset and repeat length was observed (R² = 0.44, p = 0.00045, slope = -0.12) and identification of a shared haplotype in a minority of individuals suggests that the expansion can be inherited or generated de novo during meiotic division. This study demonstrates the power of genome sequencing and advanced bioinformatic tools to identify novel repeat expansions via model-free, genome-wide analysis and identifies SCA50/ATX-FGF14 as a frequent cause of adult-onset ataxia.


Subjects
Cerebellar Ataxia, Fibroblast Growth Factors, Friedreich Ataxia, Trinucleotide Repeat Expansion, Adult, Humans, Ataxia/genetics, Australia, Cerebellar Ataxia/genetics, Friedreich Ataxia/genetics, Trinucleotide Repeat Expansion/genetics
12.
Genome Med ; 14(1): 84, 2022 08 11.
Article in English | MEDLINE | ID: mdl-35948990

ABSTRACT

BACKGROUND: Expansions of short tandem repeats are the cause of many neurogenetic disorders including familial amyotrophic lateral sclerosis, Huntington disease, and many others. Multiple methods have been recently developed that can identify repeat expansions in whole genome or exome sequencing data. Despite the widely recognized need for visual assessment of variant calls in clinical settings, current computational tools lack the ability to produce such visualizations for repeat expansions. Expanded repeats are difficult to visualize because they correspond to large insertions relative to the reference genome and involve many misaligning and ambiguously aligning reads.
RESULTS: We implemented REViewer, a computational method for visualization of sequencing data in genomic regions containing long repeat expansions, and FlipBook, a companion image viewer designed for manual curation of large collections of REViewer images. To generate a read pileup, REViewer reconstructs local haplotype sequences and distributes reads to these haplotypes in a way that is most consistent with the fragment lengths and evenness of read coverage. To create appropriate training materials for onboarding new users, we performed a concordance study involving 12 scientists involved in short tandem repeat research. We used the results of this study to create a user guide that describes the basic principles of using REViewer as well as a guide to the typical features of read pileups that correspond to low confidence repeat genotype calls. Additionally, we demonstrated that REViewer can be used to annotate clinically relevant repeat interruptions by comparing visual assessment results of 44 FMR1 repeat alleles with the results of triplet repeat primed PCR. For 38 of these alleles, the results of visual assessment were consistent with triplet repeat primed PCR.
CONCLUSIONS: Read pileup plots generated by REViewer offer an intuitive way to visualize sequencing data in regions containing long repeat expansions. Laboratories can use REViewer and FlipBook to assess the quality of repeat genotype calls as well as to visually detect interruptions or other imperfections in the repeat sequence and the surrounding flanking regions. REViewer and FlipBook are available under open-source licenses at https://github.com/illumina/REViewer and https://github.com/broadinstitute/flipbook respectively.


Subjects
Amyotrophic Lateral Sclerosis, Tandem Repeat Sequences, Alleles, Amyotrophic Lateral Sclerosis/genetics, Exome, Fragile X Mental Retardation Protein/genetics, Haplotypes, High-Throughput Nucleotide Sequencing/methods, Humans
13.
Commun Biol ; 5(1): 670, 2022 07 06.
Article in English | MEDLINE | ID: mdl-35794204

ABSTRACT

GBA variant carriers are at increased risk of Parkinson's disease (PD) and Lewy body dementia (LBD). The presence of the pseudogene GBAP1 predisposes to structural variants, complicating genetic analysis. We present two methods to resolve recombinant alleles and other variants in GBA: Gauchian, a tool for short-read, whole-genome sequencing data analysis, and Oxford Nanopore sequencing after PCR enrichment. Both methods were concordant for 42 samples carrying a range of recombinants and GBAP1-related mutations, and Gauchian outperformed the GATK Best Practices pipeline. Applying Gauchian to sequencing of over 10,000 individuals shows that copy number variants (CNVs) spanning GBAP1 are relatively common in Africans. CNV frequencies in PD and LBD are similar to controls. Gains may coexist with other mutations in patients, and a modifying effect cannot be excluded. Gauchian detects more GBA variants in LBD than PD, especially severe ones. These findings highlight the importance of accurate GBA analysis in these patients.


Subjects
Lewy Body Disease, Parkinson Disease, Alleles, Glucosylceramidase/genetics, Heterozygote, Humans, Lewy Body Disease/genetics, Parkinson Disease/genetics
14.
Lancet Neurol ; 21(3): 234-245, 2022 03.
Article in English | MEDLINE | ID: mdl-35182509

ABSTRACT

BACKGROUND: Repeat expansion disorders affect about 1 in 3000 individuals and are clinically heterogeneous diseases caused by expansions of short tandem DNA repeats. Genetic testing is often locus-specific, resulting in underdiagnosis of people who have atypical clinical presentations, especially in paediatric patients without a previous positive family history. Whole genome sequencing is increasingly used as a first-line test for other rare genetic disorders, and we aimed to assess its performance in the diagnosis of patients with neurological repeat expansion disorders.
METHODS: We retrospectively assessed the diagnostic accuracy of whole genome sequencing to detect the most common repeat expansion loci associated with neurological outcomes (AR, ATN1, ATXN1, ATXN2, ATXN3, ATXN7, C9orf72, CACNA1A, DMPK, FMR1, FXN, HTT, and TBP) using samples obtained within the National Health Service in England from patients who were suspected of having neurological disorders; previous PCR test results were used as the reference standard. The clinical accuracy of whole genome sequencing to detect repeat expansions was prospectively examined in previously genetically tested and undiagnosed patients recruited in 2013-17 to the 100 000 Genomes Project in the UK, who were suspected of having a genetic neurological disorder (familial or early-onset forms of ataxia, neuropathy, spastic paraplegia, dementia, motor neuron disease, parkinsonian movement disorders, intellectual disability, or neuromuscular disorders). If a repeat expansion call was made using whole genome sequencing, PCR was used to confirm the result.
FINDINGS: The diagnostic accuracy of whole genome sequencing to detect repeat expansions was evaluated against 793 PCR tests previously performed within the NHS from 404 patients. Whole genome sequencing correctly classified 215 of 221 expanded alleles and 1316 of 1321 non-expanded alleles, showing 97·3% sensitivity (95% CI 94·2-99·0) and 99·6% specificity (99·1-99·9) across the 13 disease-associated loci when compared with PCR test results. In samples from 11 631 patients in the 100 000 Genomes Project, whole genome sequencing identified 81 repeat expansions, which were also tested by PCR: 68 were confirmed as repeat expansions in the full pathogenic range, 11 were non-pathogenic intermediate expansions or premutations, and two were non-expanded repeats (16% false discovery rate).
INTERPRETATION: In our study, whole genome sequencing for the detection of repeat expansions showed high sensitivity and specificity, and it led to identification of neurological repeat expansion disorders in previously undiagnosed patients. These findings support implementation of whole genome sequencing in clinical laboratories for diagnosis of patients who have a neurological presentation consistent with a repeat expansion disorder.
FUNDING: Medical Research Council, Department of Health and Social Care, National Health Service England, National Institute for Health Research, and Illumina.
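
The percentages in the FINDINGS of this abstract can be reproduced from the allele and call counts it reports. The following is a minimal sketch, not the authors' analysis code: the helper name `percent` is hypothetical, and it assumes that the 16% false discovery rate counts both the 11 intermediate/premutation calls and the two non-expanded calls as false discoveries among the 81 calls, which is the reading that matches the stated figure.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator expressed as a percentage."""
    return 100.0 * numerator / denominator

# Counts taken from the abstract (PCR as the reference standard).
sensitivity = percent(215, 221)    # expanded alleles correctly classified
specificity = percent(1316, 1321)  # non-expanded alleles correctly classified

# Assumption: calls not confirmed in the full pathogenic range (11 + 2)
# are treated as false discoveries among the 81 calls tested by PCR.
false_discovery_rate = percent(11 + 2, 81)

print(f"sensitivity = {sensitivity:.1f}%")
print(f"specificity = {specificity:.1f}%")
print(f"false discovery rate = {false_discovery_rate:.0f}%")
```

The confidence intervals quoted in the abstract are not recomputed here, since the abstract does not state which interval method was used.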


Subjects
DNA Repeat Expansion, State Medicine, Child, Fragile X Mental Retardation Protein/genetics, Humans, Prospective Studies, Retrospective Studies, United Kingdom, Whole Genome Sequencing/methods
16.
Genome Med ; 13(1): 126, 2021 08 09.
Article in English | MEDLINE | ID: mdl-34372915

ABSTRACT

BACKGROUND: Screening for short tandem repeat (STR) expansions in next-generation sequencing data can enable diagnosis, optimal clinical management/treatment, and accurate genetic counseling of patients with repeat expansion disorders. We aimed to develop an efficient computational workflow for reliable detection of STR expansions in next-generation sequencing data and demonstrate its clinical utility.
METHODS: We characterized the performance of eight STR analysis methods (lobSTR, HipSTR, RepeatSeq, ExpansionHunter, TREDPARSE, GangSTR, STRetch, and exSTRa) on next-generation sequencing datasets of samples with known disease-causing full-mutation STR expansions and genomes simulated to harbor repeat expansions at selected loci and optimized their sensitivity. We then used a machine learning decision tree classifier to identify an optimal combination of methods for full-mutation detection. In Burrows-Wheeler Aligner (BWA)-aligned genomes, the ensemble approach of using ExpansionHunter, STRetch, and exSTRa performed the best (precision = 82%, recall = 100%, F1-score = 90%). We applied this pipeline to screen 301 families of children with suspected genetic disorders.
RESULTS: We identified 10 individuals with full-mutations in the AR, ATXN1, ATXN8, DMPK, FXN, or HTT disease STR locus in the analyzed families. Additional candidates identified in our analysis include two probands with borderline ATXN2 expansions between the established repeat size range for reduced-penetrance and full-penetrance full-mutation and seven individuals with FMR1 CGG repeats in the intermediate/premutation repeat size range. In 67 probands with a prior negative clinical PCR test for the FMR1, FXN, or DMPK disease STR locus, or the spinocerebellar ataxia disease STR panel, our pipeline did not falsely identify aberrant expansion. We performed clinical PCR tests on seven (out of 10) full-mutation samples identified by our pipeline and confirmed the expansion status in all, showing absolute concordance between our bioinformatics and molecular findings.
CONCLUSIONS: We have successfully demonstrated the application of a well-optimized bioinformatics pipeline that promotes the utility of genome-wide sequencing as a first-tier screening test to detect expansions of known disease STRs. Interrogating clinical next-generation sequencing data for pathogenic STR expansions using our ensemble pipeline can improve diagnostic yield and enhance clinical outcomes for patients with repeat expansion disorders.
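
The F1-score quoted in the METHODS of this abstract follows from the reported precision and recall, since F1 is their harmonic mean. A minimal sketch of that relationship, assuming the standard definition F1 = 2PR / (P + R); the function name `f1_score` is an illustrative helper, not part of the authors' pipeline:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported for the ExpansionHunter + STRetch + exSTRa ensemble.
ensemble_f1 = f1_score(precision=0.82, recall=1.00)

print(f"F1 = {ensemble_f1:.0%}")
```

With perfect recall, the F1-score is limited only by precision, which is why an 82% precision still yields a score of about 90%.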


Subjects
DNA Repeat Expansion, Genome-Wide Association Study, High-Throughput Nucleotide Sequencing, Microsatellite Repeats, Whole Genome Sequencing, Algorithms, Alleles, Clinical Decision-Making, Computational Biology/methods, Genetic Databases, Decision Trees, Inborn Genetic Diseases/diagnosis, Inborn Genetic Diseases/genetics, Genetic Loci, Genome-Wide Association Study/methods, Humans, Machine Learning, Molecular Diagnostic Techniques, Mutation, Reproducibility of Results
17.
Ann Neurol ; 89(4): 686-697, 2021 04.
Article in English | MEDLINE | ID: mdl-33389754

ABSTRACT

OBJECTIVE: The role of the survival of motor neuron (SMN) gene in amyotrophic lateral sclerosis (ALS) is unclear, with several conflicting reports. A decisive result on this topic is needed, given that treatment options are available now for SMN deficiency.
METHODS: In this largest multicenter case control study to evaluate the effect of SMN1 and SMN2 copy numbers in ALS, we used whole genome sequencing data from Project MinE data freeze 2. SMN copy numbers of 6,375 patients with ALS and 2,412 controls were called from whole genome sequencing data, and the reliability of the calls was tested with multiplex ligation-dependent probe amplification data.
RESULTS: The copy number distribution of SMN1 and SMN2 between cases and controls did not show any statistical differences (binomial multivariate logistic regression SMN1 p = 0.54 and SMN2 p = 0.49). In addition, the copy number of SMN did not associate with patient survival (Royston-Parmar; SMN1 p = 0.78 and SMN2 p = 0.23) or age at onset (Royston-Parmar; SMN1 p = 0.75 and SMN2 p = 0.63).
INTERPRETATION: In our well-powered study, there was no association of SMN1 or SMN2 copy numbers with the risk of ALS or ALS disease severity. This suggests that changing SMN protein levels in the physiological range may not modify ALS disease course. This is an important finding in the light of emerging therapies targeted at SMN deficiencies. ANN NEUROL 2021;89:686-697.


Subjects
Amyotrophic Lateral Sclerosis/genetics, Amyotrophic Lateral Sclerosis/pathology, Survival of Motor Neuron 1 Protein/genetics, Case-Control Studies, Cohort Studies, Female, Gene Dosage, Humans, Male, Reproducibility of Results, Risk Factors, Severity of Illness Index, Survival of Motor Neuron 2 Protein/genetics, Whole Genome Sequencing
18.
Pharmacogenomics J ; 21(2): 251-261, 2021 04.
Article in English | MEDLINE | ID: mdl-33462347

ABSTRACT

Responsible for the metabolism of ~21% of clinically used drugs, CYP2D6 is a critical component of personalized medicine initiatives. Genotyping CYP2D6 is challenging due to sequence similarity with its pseudogene paralog CYP2D7 and a high number and variety of common structural variants (SVs). Here we describe a novel bioinformatics method, Cyrius, that accurately genotypes CYP2D6 using whole-genome sequencing (WGS) data. We show that Cyrius has superior performance (96.5% concordance with truth genotypes) compared to existing methods (84-86.8%). After implementing the improvements identified from the comparison against the truth data, Cyrius's accuracy has since been improved to 99.3%. Using Cyrius, we built a haplotype frequency database from 2504 ethnically diverse samples and estimate that SV-containing star alleles are more frequent than previously reported. Cyrius will be an important tool to incorporate pharmacogenomics in WGS-based precision medicine initiatives.


Subjects
Cytochrome P-450 CYP2D6/genetics, Genotyping Techniques/methods, Alleles, Computational Biology/methods, Ethnicity/genetics, Genotype, Haplotypes/genetics, Humans, Genetic Polymorphism/genetics, Whole Genome Sequencing/methods
19.
Nature ; 586(7828): 292-298, 2020 10.
Article in English | MEDLINE | ID: mdl-32999459

ABSTRACT

The RecQ DNA helicase WRN is a synthetic lethal target for cancer cells with microsatellite instability (MSI), a form of genetic hypermutability that arises from impaired mismatch repair [1-4]. Depletion of WRN induces widespread DNA double-strand breaks in MSI cells, leading to cell cycle arrest and/or apoptosis. However, the mechanism by which WRN protects MSI-associated cancers from double-strand breaks remains unclear. Here we show that TA-dinucleotide repeats are highly unstable in MSI cells and undergo large-scale expansions, distinct from previously described insertion or deletion mutations of a few nucleotides [5]. Expanded TA repeats form non-B DNA secondary structures that stall replication forks, activate the ATR checkpoint kinase, and require unwinding by the WRN helicase. In the absence of WRN, the expanded TA-dinucleotide repeats are susceptible to cleavage by the MUS81 nuclease, leading to massive chromosome shattering. These findings identify a distinct biomarker that underlies the synthetic lethal dependence on WRN, and support the development of therapeutic agents that target WRN for MSI-associated cancers.


Subjects
Double-Stranded DNA Breaks, DNA Repeat Expansion/genetics, Dinucleotide Repeats/genetics, Neoplasms/genetics, Werner Syndrome Helicase/metabolism, Ataxia Telangiectasia Mutated Proteins/metabolism, Tumor Cell Line, Human Chromosomes/genetics, Human Chromosomes/metabolism, Chromothripsis, DNA Cleavage, DNA Replication, DNA-Binding Proteins/metabolism, Endodeoxyribonucleases/metabolism, Endonucleases/metabolism, Genomic Instability, Humans, Recombinases/metabolism
20.
Sci Data ; 7(1): 294, 2020 09 08.
Article in English | MEDLINE | ID: mdl-32901039

ABSTRACT

Significant progress has been made in elucidating single nucleotide polymorphism diversity in the human population. However, the majority of the variation space in the genome is structural and remains partially elusive. One form of structural variation is tandem repeats (TRs). Expansions of TRs are responsible for over 40 diseases, but we hypothesize these represent only a fraction of the pathogenic repeat expansions that exist. Here we characterize long or expanded TR variation in 1,115 human genomes as well as a replication cohort of 2,504 genomes, identified using ExpansionHunter Denovo. We found that individual genomes typically harbor several rare, large TRs, generally in non-coding regions of the genome. We noticed that these large TRs are enriched in their proximity to Alu elements. The vast majority of these large TRs seem to be expansions of smaller TRs that are already present in the reference genome. We are providing this TR profile as a resource for comparison to undiagnosed rare disease genomes in order to detect novel disease-causing repeat expansions.


Subjects
Human Genome, Tandem Repeat Sequences, Alu Elements, Datasets as Topic, Humans, Single Nucleotide Polymorphism