ABSTRACT
Despite advances in next generation sequencing (NGS), genetic diagnoses remain elusive for many patients with neurologic syndromes. Long-read sequencing (LRS) and optical genome mapping (OGM) technologies improve upon existing capabilities in the detection and interpretation of structural variation in repetitive DNA, on a single haplotype, while also providing enhanced breakpoint resolution. We performed LRS and OGM on two patients with known chromosomal rearrangements and inconclusive Sanger or NGS. The first patient, who had epilepsy and developmental delay, had a complex translocation between two chromosomes that included insertion and inversion events. The second patient, who had a movement disorder, had an inversion on a single chromosome disrupted by multiple smaller inversions and insertions. Sequence level resolution of the rearrangements identified pathogenic breaks in noncoding sequence in or near known disease-causing genes with relevant neurologic phenotypes (MBD5, NKX2-1). These specific variants have not been reported previously, but expected molecular consequences are consistent with previously reported cases. As the use of LRS and OGM technologies for clinical testing increases and data analyses become more standardized, these methods along with multiomic data to validate noncoding variation effects will improve diagnostic yield and increase the proportion of probands with detectable pathogenic variants for known genes implicated in neurogenetic disease.
ABSTRACT
Low copy repeats (LCRs) are recognized as a significant source of genomic instability, driving genome variability and evolution. The Chromosome 22 LCRs (LCR22s) mediate nonallelic homologous recombination (NAHR) leading to the 22q11 deletion syndrome (22q11DS). However, LCR22s are among the most complex regions in the genome, and their structure remains unresolved. The difficulty in generating accurate maps of LCR22s has also hindered localization of the deletion end points in 22q11DS patients. Using fiber FISH and Bionano optical mapping, we assembled LCR22 alleles in 187 cell lines. Our analysis uncovered an unprecedented level of variation in LCR22s, including LCR22A alleles ranging in size from 250 to 2000 kb. Further, the incidence of various LCR22 alleles varied within different populations. Additionally, the analysis of LCR22s in 22q11DS patients and their parents enabled further refinement of the rearrangement site within LCR22A and -D, which flank the 22q11 deletion. The NAHR site was localized to a 160-kb paralog shared between the LCR22A and -D in seven 22q11DS patients. Thus, we present the most comprehensive map of LCR22 variation to date. This will greatly facilitate the investigation of the role of LCR variation as a driver of 22q11 rearrangements and the phenotypic variability among 22q11DS patients.
Subjects
22q11 Deletion Syndrome/genetics , Chromosome Mapping/methods , Chromosomes, Human, Pair 22/genetics , Repetitive Sequences, Nucleic Acid , Animals , Cell Line , Chromosomal Instability , Evolution, Molecular , Humans , In Situ Hybridization, Fluorescence , Primates/genetics
ABSTRACT
Schizophrenia occurs in about one in four individuals with 22q11.2 deletion syndrome (22q11.2DS). The aim of this International Brain and Behavior 22q11.2DS Consortium (IBBC) study was to identify genetic factors that contribute to schizophrenia, in addition to the ~20-fold increased risk conveyed by the 22q11.2 deletion. Using whole-genome sequencing data from 519 unrelated individuals with 22q11.2DS, we conducted genome-wide comparisons of common and rare variants between those with schizophrenia and those with no psychotic disorder at age ≥25 years. Available microarray data enabled direct comparison of polygenic risk for schizophrenia between 22q11.2DS and independent population samples with no 22q11.2 deletion, with and without schizophrenia (total n = 35,182). Polygenic risk for schizophrenia within 22q11.2DS was significantly greater for those with schizophrenia (adjusted p = 6.73 × 10^-6). Novel reciprocal case-control comparisons between the 22q11.2DS and population-based cohorts showed that polygenic risk score was significantly greater in individuals with psychotic illness, regardless of the presence of the 22q11.2 deletion. Within the 22q11.2DS cohort, results of gene-set analyses showed some support for rare variants affecting synaptic genes. No common or rare variants within the 22q11.2 deletion region were significantly associated with schizophrenia. These findings suggest that in addition to the deletion conferring a greatly increased risk of schizophrenia, the risk is higher when the 22q11.2 deletion and common polygenic risk factors that contribute to schizophrenia in the general population are both present.
Subjects
DiGeorge Syndrome , Psychotic Disorders , Schizophrenia , Adult , Case-Control Studies , Cohort Studies , DiGeorge Syndrome/genetics , Humans , Schizophrenia/genetics
ABSTRACT
The majority (99%) of individuals with 22q11.2 deletion syndrome (22q11.2DS) have a deletion that is caused by non-allelic homologous recombination between two of four low copy repeat clusters on chromosome 22q11.2 (LCR22s). However, in a small subset of patients, atypical deletions are observed with at least one deletion breakpoint within unique sequence between the LCR22s. The position of the chromosome breakpoints and the mechanisms driving those atypical deletions remain poorly studied. Our large-scale, whole genome sequencing study of >1500 subjects with 22q11.2DS identified six unrelated individuals with atypical deletions of different types. Using a combination of whole genome sequencing data and fiber-fluorescence in situ hybridization, we mapped the rearranged alleles in these subjects. In four of them, the distal breakpoints mapped within one of the LCR22s and we found that the deletions likely occurred by replication-based mechanisms. Interestingly, in two of them, an inversion probably preceded inter-chromosomal 'allelic' homologous recombination between differently oriented LCR22-D alleles. Inversion associated allelic homologous recombination (AHR) may well be a common mechanism driving (atypical) deletions on 22q11.2.
Subjects
DiGeorge Syndrome/genetics , DiGeorge Syndrome/metabolism , Homologous Recombination/genetics , Adult , Alleles , Chromosome Breakpoints , Chromosome Deletion , Chromosome Inversion/genetics , Chromosome Mapping/methods , Chromosomes/genetics , Chromosomes, Human, Pair 22/genetics , Female , Humans , In Situ Hybridization, Fluorescence/methods , Male , Segmental Duplications, Genomic/genetics , Whole Genome Sequencing/methods
ABSTRACT
Recurrent, de novo, meiotic non-allelic homologous recombination events between low copy repeats, termed LCR22s, lead to the 22q11.2 deletion syndrome (22q11.2DS; velo-cardio-facial syndrome/DiGeorge syndrome). Although most 22q11.2DS patients have a similar sized 3 million base pair (Mb), LCR22A-D deletion, some have nested LCR22A-B or LCR22A-C deletions. Our goal is to identify additional recurrent 22q11.2 deletions associated with 22q11.2DS, serving as recombination hotspots for meiotic chromosomal rearrangements. Here, using data from Affymetrix 6.0 microarrays on 1680 22q11.2DS subjects, we identified what appeared to be a nested proximal 22q11.2 deletion in 38 (2.3%) of them. Using molecular and haplotype analyses from 14 subjects and their parent(s) with available DNA, we found essentially three types of scenarios to explain this observation. In eight subjects, the proximal breakpoints occurred in a small 12 kb LCR distal to LCR22A, referred to as LCR22A+, resulting in LCR22A+-B or LCR22A+-D deletions. Six of these eight subjects had a nested 22q11.2 deletion that occurred during meiosis in a parent carrying a benign 0.2 Mb duplication of the LCR22A-LCR22A+ region with a breakpoint in LCR22A+. Another six had a typical de novo LCR22A-D deletion on one allele and inherited the LCR22A-A+ duplication from the other parent, thus appearing on microarrays to have a nested deletion. LCR22A+ maps to an evolutionary breakpoint between mice and humans and appears to serve as a local hotspot for chromosome rearrangements on 22q11.2.
Subjects
Alleles , Chromosome Mapping , DiGeorge Syndrome/genetics , Meiosis , Chromosome Deletion , Chromosomes, Human, Pair 22/genetics , Female , Humans , Male
ABSTRACT
Inversion polymorphisms between low-copy repeats (LCRs) might predispose chromosomes to meiotic non-allelic homologous recombination (NAHR) events and thus lead to genomic disorders. However, for the 22q11.2 deletion syndrome (22q11.2DS), the most common genomic disorder, no such inversions have been uncovered as of yet. Using fiber-FISH, we demonstrate that parents transmitting the de novo 3 Mb LCR22A-D 22q11.2 deletion, the reciprocal duplication, and the smaller 1.5 Mb LCR22A-B 22q11.2 deletion carry inversions of LCR22B-D or LCR22C-D. Hence, the inversions predispose chromosome 22q11.2 to meiotic rearrangements and increase the individual risk for transmitting rearrangements. Interestingly, the inversions are nested or flanking rather than coinciding with the deletion or duplication sizes. This finding raises the possibility that inversions are a prerequisite not only for 22q11.2 rearrangements but also for all NAHR-mediated genomic disorders.
Subjects
Chromosome Inversion , DiGeorge Syndrome/genetics , Genetic Predisposition to Disease , Meiosis , Polymorphism, Single Nucleotide , Chromosome Deletion , DNA Copy Number Variations , DiGeorge Syndrome/pathology , Homologous Recombination , Humans , In Situ Hybridization, Fluorescence/methods
ABSTRACT
PURPOSE: The 22q11.2 deletion syndrome (22q11.2DS) is the most common microdeletion in humans, with highly variable phenotypic expression. Whereas congenital heart defects, palatal anomalies, immunodeficiency, hypoparathyroidism, and neuropsychiatric conditions are observed in over 50% of patients with 22q11DS, a subset of patients present with additional "atypical" findings such as craniosynostosis and anorectal malformations. Recently, pathogenic variants in the CDC45 (Cell Division Cycle protein 45) gene, located within the LCR22A-LCR22B region of chromosome 22q11.2, were noted to be involved in the pathogenesis of craniosynostosis. METHODS: We performed next-generation sequencing on DNA from 15 patients with 22q11.2DS and atypical phenotypic features such as craniosynostosis, short stature, skeletal differences, and anorectal malformations. RESULTS: We identified four novel rare nonsynonymous variants in CDC45 in 5/15 patients with 22q11.2DS and craniosynostosis and/or other atypical findings. CONCLUSION: This study supports CDC45 as a causative gene in craniosynostosis, as well as a number of other anomalies. We suggest that this association results in a condition independent of Meier-Gorlin syndrome, perhaps representing a novel condition and/or a cause of features associated with Baller-Gerold syndrome. In addition, this work confirms that the phenotypic variability observed in a subset of patients with 22q11.2DS is due to pathogenic variants on the nondeleted chromosome.
Subjects
Cell Cycle Proteins/genetics , DiGeorge Syndrome/genetics , Alleles , Cell Cycle Proteins/metabolism , Child , Child, Preschool , Chromosome Deletion , Chromosomes/genetics , Chromosomes, Human, Pair 22/genetics , Craniosynostoses/genetics , DiGeorge Syndrome/metabolism , Female , Heart Defects, Congenital/genetics , Humans , Male , Phenotype , Retrospective Studies
ABSTRACT
Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio's single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing.
Subjects
Bacterial Infections/genetics , DNA/chemistry , High-Throughput Nucleotide Sequencing/methods , Molecular Diagnostic Techniques/methods , Neoplasms/genetics , Virus Diseases/genetics , Bacterial Infections/diagnosis , DNA/genetics , Humans , Models, Molecular , Neoplasms/diagnosis , Nucleic Acid Conformation , Sensitivity and Specificity , Virus Diseases/diagnosis
ABSTRACT
Velo-cardio-facial syndrome/DiGeorge syndrome/22q11.2 deletion syndrome (22q11.2DS) is caused by meiotic non-allelic homologous recombination events between flanking low copy repeats termed LCR22A and LCR22D, resulting in a 3 million base pair (Mb) deletion. Due to their complex structure, large size and high sequence identity, genetic variation within LCR22s among different individuals has not been well characterized. In this study, we sequenced 13 BAC clones derived from LCR22A/D and aligned them with 15 previously available BAC sequences to create a new genetic variation map. The thousands of variants identified by this analysis were not uniformly distributed in the two LCR22s. Moreover, shared single nucleotide variants between LCR22A and LCR22D were enriched in the Breakpoint Cluster Region pseudogene (BCRP) block, suggesting the existence of a possible recombination hotspot there. Interestingly, breakpoints for atypical 22q11.2 rearrangements have previously been localized to BCRPs. To further explore this finding, we carried out in-depth analyses of whole genome sequence (WGS) data from two unrelated probands harbouring a de novo 3 Mb 22q11.2 deletion and their normal parents. By focusing primarily on WGS reads uniquely mapped to LCR22A, using the variation map from our BAC analysis to help resolve allele ambiguity, and by performing PCR analysis, we infer that the deletion breakpoints were most likely located near or within the BCRP module. In summary, we found a high degree of sequence variation in LCR22A and LCR22D and a potential recombination breakpoint near or within the BCRP block, providing a starting point for future breakpoint mapping using additional trios.
Subjects
Chromosome Breakpoints , DiGeorge Syndrome/genetics , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , ATP Binding Cassette Transporter, Subfamily G, Member 2/genetics , Chromosomes, Artificial, Bacterial/genetics , Chromosomes, Human, Pair 22/genetics , Genome-Wide Association Study , Humans , Neoplasm Proteins/genetics , Polymorphism, Single Nucleotide , Sequence Deletion
ABSTRACT
The FMR1 gene contains an unstable CGG repeat in its 5' untranslated region. Premutation alleles range between 55 and 200 repeat units and confer a risk for developing fragile X-associated tremor/ataxia syndrome or fragile X-associated primary ovarian insufficiency. Furthermore, the premutation allele often expands to a full mutation during female germline transmission giving rise to the fragile X syndrome. The risk for a premutation to expand depends mainly on the number of CGG units and the presence of AGG interruptions in the CGG repeat. Unfortunately, the detection of AGG interruptions is hampered by technical difficulties. Here, we demonstrate that single-molecule sequencing enables the determination of not only the repeat size, but also the complete repeat sequence including AGG interruptions in male and female alleles with repeats ranging from 45 to 100 CGG units. We envision this method will facilitate research and diagnostic analysis of the FMR1 repeat expansion.
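As an illustration of what "complete repeat sequence including AGG interruptions" means in practice, the following is a minimal sketch, not the authors' pipeline. It assumes the repeat tract has already been extracted from a read as a string of whole triplets; the function name, the returned field names, and the choice to count AGG units toward the total are assumptions made for this example. Only the 55 to 200 unit premutation range is taken from the abstract.

```python
def summarize_fmr1_repeat(tract: str) -> dict:
    """Summarise a repeat tract given as concatenated triplets, e.g. 'CGGCGGAGGCGG'.

    Hypothetical helper for illustration only; assumes the tract is already
    trimmed to the repeat and contains only CGG and AGG units.
    """
    tract = tract.upper()
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")

    # Split the tract into consecutive 3-base units.
    units = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    unexpected = sorted({u for u in units if u not in ("CGG", "AGG")})
    if unexpected:
        raise ValueError(f"unexpected repeat units: {unexpected}")

    # 1-based positions of AGG interruptions within the tract.
    agg_positions = [i + 1 for i, u in enumerate(units) if u == "AGG"]

    # Length of the longest run of CGG units not broken by an AGG.
    longest = current = 0
    for u in units:
        if u == "CGG":
            current += 1
            longest = max(longest, current)
        else:
            current = 0

    total = len(units)  # assumption: AGG units count toward the repeat size
    return {
        "total_units": total,
        "agg_positions": agg_positions,
        "longest_pure_cgg_run": longest,
        # Premutation range of 55 to 200 units as stated in the abstract.
        "in_premutation_range": 55 <= total <= 200,
    }


if __name__ == "__main__":
    example = "CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 40
    print(summarize_fmr1_repeat(example))
```

Reporting the longest uninterrupted CGG run alongside the total size reflects the abstract's point that both repeat number and AGG interruptions bear on expansion risk; how those two values are weighed is outside the scope of this sketch.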
Subjects
Ataxia/genetics , Fragile X Mental Retardation Protein/genetics , Fragile X Syndrome/genetics , Heterozygote , Mutation , Tremor/genetics , Trinucleotide Repeat Expansion , Ataxia/diagnosis , DNA Mutational Analysis , Female , Fragile X Syndrome/diagnosis , Humans , Male , Tremor/diagnosis , Trinucleotide Repeats
ABSTRACT
The human sex chromosomes differ in sequence, except for the pseudoautosomal regions (PAR) at the termini of the short and the long arms, denoted as PAR1 and PAR2. The boundary between PAR1 and the unique X and Y sequences was established during the divergence of the great apes. During a copy number variation screen, we noted a paternally inherited chromosome X duplication in 15 independent families. Subsequent genomic analysis demonstrated that an insertional translocation of X chromosomal sequence into the Y chromosome generates an extended PAR [corrected]. The insertion is generated by non-allelic homologous recombination between a 548 bp LTR6B repeat within the Y chromosome PAR1 and a second LTR6B repeat located 105 kb from the PAR boundary on the X chromosome. The identification of the reciprocal deletion on the X chromosome in one family and the occurrence of the variant in different chromosome Y haplogroups demonstrate this is a recurrent genomic rearrangement in the human population. This finding represents a novel mechanism shaping sex chromosomal evolution.
Subjects
Chromosomes, Human, X/genetics , Chromosomes, Human, Y/genetics , Evolution, Molecular , Animals , Chromosomes/genetics , Haplotypes , Hominidae/genetics , Homologous Recombination/genetics , Humans , Polymorphism, Genetic , Repetitive Sequences, Nucleic Acid/genetics , Translocation, Genetic
ABSTRACT
Chromoanagenesis is the process by which a single catastrophic event creates complex rearrangements confined to a single or a few chromosomes. It is usually characterized by the presence of multiple deletions and/or duplications, as well as by copy neutral rearrangements. In contrast, an array CGH screen of patients with developmental anomalies revealed three patients in whom a single chromosome carries from 8 to 11 large copy number gains confined to a single chromosome or chromosomal arm, but no deletions. Subsequent fluorescence in situ hybridization and massive parallel sequencing revealed the duplicons to be clustered together in distinct locations across the altered chromosomes. Breakpoint junction sequences showed both microhomology and non-templated insertions of up to 40 bp. Hence, these patients each demonstrate a single altered chromosome of clustered insertional duplications, no deletions, and breakpoint junction sequences showing microhomology and/or non-templated insertions. These observations are difficult to reconcile with current mechanistic descriptions of chromothripsis and chromoanasynthesis. Therefore, we hypothesize those rearrangements to be of a mechanistically different origin. In addition, we suggest that large untemplated insertional sequences observed at breakpoints are driven by a non-canonical non-homologous end joining mechanism.
Subjects
Chromosome Aberrations , DNA Copy Number Variations , Developmental Disabilities/genetics , Comparative Genomic Hybridization , Female , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , In Situ Hybridization, Fluorescence , Male , Microarray Analysis , Sequence Analysis, DNA
ABSTRACT
BACKGROUND: Massive parallel sequencing is a powerful tool for variant discovery and genotyping. To reduce costs, sequencing of restriction enzyme based reduced representation libraries can be utilized. This technology is generally referred to as Genotyping By Sequencing (GBS). To deal with GBS experimental design and initial processing, specific bioinformatic tools are needed. RESULTS: GBSX is a package that assists in selecting the appropriate enzyme and the design of compatible in-line barcodes. Post sequencing, it performs optimized demultiplexing using these barcodes to create fastq files per barcode which can easily be plugged into existing variant analysis pipelines. Here we demonstrate the usability of the GBSX toolkit and show improved in-line barcode demultiplexing and trimming performance compared to existing tools. CONCLUSIONS: GBSX provides an easy-to-use suite of tools for designing and demultiplexing GBS experiments.
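To make the idea of in-line barcode demultiplexing concrete, here is a minimal sketch of the general technique. It is not GBSX's implementation and does not use its interface: the function name, the tuple layout of a read, and the "undetermined" bin are assumptions for this example, and it uses exact prefix matching only, which is a deliberate simplification.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign reads to samples by their in-line barcode and trim the barcode.

    reads    : iterable of (name, sequence, quality) tuples (hypothetical layout)
    barcodes : dict mapping barcode sequence -> sample name
    Returns a dict mapping sample name -> list of trimmed (name, seq, qual) tuples.
    """
    # In-line barcodes may differ in length, so try the longest first;
    # otherwise a short barcode could shadow a longer one sharing its prefix.
    by_length = sorted(barcodes, key=len, reverse=True)

    bins = defaultdict(list)
    for name, seq, qual in reads:
        for bc in by_length:
            if seq.startswith(bc):
                # Trim the barcode from both the bases and the quality string.
                bins[barcodes[bc]].append((name, seq[len(bc):], qual[len(bc):]))
                break
        else:
            # No barcode matched exactly: keep the read untrimmed.
            bins["undetermined"].append((name, seq, qual))
    return dict(bins)


if __name__ == "__main__":
    barcodes = {"ACGT": "sample1", "ACGTA": "sample2"}
    reads = [
        ("r1", "ACGTACCC", "IIIIIIII"),
        ("r2", "ACGTCCCC", "IIIIIIII"),
        ("r3", "TTTTCCCC", "IIIIIIII"),
    ]
    for sample, assigned in demultiplex(reads, barcodes).items():
        print(sample, assigned)
```

Each per-sample list could then be written to its own fastq file, which corresponds to the "fastq files per barcode" output described in the abstract.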
Subjects
Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Software , DNA Restriction Enzymes
Subjects
Bacterial Infections/genetics , Bacterial Infections/immunology , Interleukin-6 , Receptors, Interleukin-1 Type I , Toll-Like Receptors , Bacterial Infections/pathology , Female , Humans , Interleukin-6/genetics , Interleukin-6/immunology , Male , Receptors, Interleukin-1 Type I/genetics , Receptors, Interleukin-1 Type I/immunology , Toll-Like Receptors/genetics , Toll-Like Receptors/immunology
ABSTRACT
Loss-of-function (LoF) variants in the filaggrin (FLG) gene are the strongest known genetic risk factor for atopic dermatitis (AD), but the impact of these variants on AD outcomes is poorly understood. We comprehensively identified genetic variants through targeted region sequencing of FLG in children participating in the Mechanisms of Progression of Atopic Dermatitis to Asthma in Children cohort. Twenty FLG LoF variants were identified, including 1 novel variant and 9 variants not previously associated with AD. FLG LoF variants were found in the cohort. Among these children, the presence of 1 or more FLG LoF variants was associated with moderate/severe AD compared with those with mild AD. Children with FLG LoF variants had a higher SCORing for Atopic Dermatitis (SCORAD) and higher likelihood of food allergy within the first 2.5 years of life. LoF variants were associated with higher transepidermal water loss (TEWL) in both lesional and nonlesional skin. Collectively, our study identifies established and potentially novel AD-associated FLG LoF variants and associates FLG LoF variants with higher TEWL in lesional and nonlesional skin.
Subjects
Dermatitis, Atopic , Filaggrin Proteins , Intermediate Filament Proteins , Loss of Function Mutation , Phenotype , Dermatitis, Atopic/genetics , Dermatitis, Atopic/pathology , Humans , Male , Female , Child, Preschool , Prospective Studies , Infant , Intermediate Filament Proteins/genetics , Genetic Predisposition to Disease , Child , Food Hypersensitivity/genetics
ABSTRACT
Next-generation sequencing is excellently suited to evaluate the abundance of mRNAs to study gene expression. Here we compare two alternative technologies, cap analysis of gene expression (CAGE) and serial analysis of gene expression (SAGE), for the same RNA samples. Along with quantifying gene expression levels, CAGE can be used to identify tissue-specific transcription start sites, while SAGE monitors 3'-end usage. We used both methods to get more insight into the transcriptional control of myogenesis, studying differential gene expression in differentiated and proliferating C2C12 myoblast cells with statistical evaluation of reproducibility and differential gene expression. Both CAGE and SAGE provided highly reproducible data (Pearson's correlations >0.92 among biological triplicates). With both methods we found around 10,000 genes expressed at levels >2 transcripts per million (approximately 0.3 copies per cell), with an overlap of 86%. We identified 4304 and 3846 genes differentially expressed between proliferating and differentiated C2C12 cells by CAGE and SAGE, respectively, with an overlap of 2144. We identified 196 novel regulatory regions with preferential use in proliferating or differentiated cells. Next-generation sequencing of CAGE and SAGE libraries provides consistent expression levels and can enrich current genome annotations with tissue-specific promoters and alternative 3'-UTR usage.
Subjects
Gene Expression Profiling/methods , Myoblasts/metabolism , Sequence Analysis, RNA , 3' Untranslated Regions , Animals , Cell Line , Mice , Models, Biological , Muscle Development/genetics , Oligonucleotide Array Sequence Analysis , Reproducibility of Results , Sequence Alignment , Transcription Initiation Site
ABSTRACT
Despite high levels of homology, transcription coactivators p300 and CREB binding protein (CBP) are both indispensable during embryogenesis. They are largely known to regulate the same genes. To identify genes preferentially regulated by p300 or CBP, we performed an extensive genome-wide survey using ChIP-seq on cell-cycle synchronized cells. We found that 57% of the tags were within genes or proximal promoters, with an overall preference for binding to transcription start and end sites. The heterogeneous binding patterns possibly reflect the divergent roles of CBP and p300 in transcriptional regulation. Most of the 16 103 genes were bound by both CBP and p300. However, after stimulation 89 and 1944 genes were preferentially bound by CBP or p300, respectively. Target genes were found to be primarily involved in the regulation of metabolic and developmental processes, and transcription, with CBP showing a stronger preference than p300 for genes active in negative regulation of transcription. Analysis of transcription factor binding sites suggests that CBP and p300 have many partners in common, but AP-1 and Serum Response Factor (SRF) appear to be more prominent in CBP-specific sequences, whereas AP-2 and SP1 are enriched in p300-specific targets. Taken together, our findings further elucidate the distinct roles of coactivators p300 and CBP in transcriptional regulation.
Subjects
CREB-Binding Protein/metabolism , E1A-Associated p300 Protein/metabolism , Gene Expression Regulation , Transcription, Genetic , Base Sequence , Binding Sites , Cell Line, Tumor , Chromatin Immunoprecipitation , Consensus Sequence , Genome, Human , Humans , Promoter Regions, Genetic , Reproducibility of Results , Sequence Analysis, DNA
ABSTRACT
The human genome harbors numerous structural variants (SVs) which, due to their repetitive nature, are currently underexplored in short-read whole-genome sequencing approaches. Using single-molecule, real-time (SMRT) long-read sequencing technology in combination with FALCON-Unzip, we generated a de novo assembly of the diploid genome of a 115-year-old Dutch cognitively healthy woman. We combined this assembly with two previously published haploid assemblies (CHM1 and CHM13) and the GRCh38 reference genome to create a compendium of SVs that occur across five independent human haplotypes using the graph-based multi-genome aligner REVEAL. Across these five haplotypes, we detected 31,680 euchromatic SVs (>50 bp). Of these, ~62% were comprised of repetitive sequences with 'variable number tandem repeats' (VNTRs), ~10% were mobile elements (Alu, L1, and SVA), while the remaining variants were inversions and indels. We observed that VNTRs with GC-content >60% and repeat patterns longer than 15 bp were 21-fold enriched in the subtelomeric regions (within 5 Mb of the ends of chromosome arms). VNTR lengths can expand to exceed a critical length which is associated with impaired gene transcription. The genes that contained most VNTRs, of which PTPRN2 and DLGAP2 are the most prominent examples, were found to be predominantly expressed in the brain and associated with a wide variety of neurological disorders. Repeat-induced variation represents a sizeable fraction of the genetic variation in human genomes and should be included in investigations of genetic factors associated with phenotypic traits, specifically those associated with neurological disorders. We make available the long and short-read sequence data of the supercentenarian genome, and a compendium of SVs as identified across 5 human haplotypes.
Subjects
Genome, Human , Minisatellite Repeats , Aged, 80 and over , Brain , Female , Haplotypes , Humans , Minisatellite Repeats/genetics , Sequence Analysis, DNA
ABSTRACT
The adoption of single molecule real-time (SMRT) sequencing [...].