ABSTRACT
Complex structural variations (cxSVs) are often overlooked in genome analyses due to detection challenges. We developed ARC-SV, a probabilistic and machine-learning-based method that enables accurate detection and reconstruction of cxSVs from standard datasets. By applying ARC-SV across 4,262 genomes representing all continental populations, we identified cxSVs as a significant source of natural human genetic variation. Rare cxSVs have a propensity to occur in neural genes and loci that underwent rapid human-specific evolution, including those regulating corticogenesis. By performing single-nucleus multiomics in postmortem brains, we discovered cxSVs associated with differential gene expression and chromatin accessibility across various brain regions and cell types. Additionally, cxSVs detected in brains of psychiatric cases are enriched for linkage with psychiatric GWAS risk alleles detected in the same brains. Furthermore, our analysis revealed significantly decreased brain-region- and cell-type-specific expression of cxSV genes, specifically for psychiatric cases, implicating cxSVs in the molecular etiology of major neuropsychiatric disorders.
ABSTRACT
We developed a generally applicable method, CRISPR/Cas9-targeted long-read sequencing (CTLR-Seq), to resolve, haplotype-specifically, the large and complex regions in the human genome that had been previously impenetrable to sequencing analysis, such as large segmental duplications (SegDups) and their associated genome rearrangements. CTLR-Seq combines in vitro Cas9-mediated cutting of the genome and pulsed-field gel electrophoresis to isolate intact large (i.e., up to 2,000 kb) genomic regions that encompass previously unresolvable genomic sequences. These targets are then sequenced (amplification-free) at high on-target coverage using long-read sequencing, allowing for their complete sequence assembly. We applied CTLR-Seq to the SegDup-mediated rearrangements that constitute the boundaries of, and give rise to, the 22q11.2 Deletion Syndrome (22q11DS), the most common human microdeletion disorder. We then performed de novo assembly to resolve, at base-pair resolution, the full sequence rearrangements and exact chromosomal breakpoints of 22q11DS (including all common subtypes). Across multiple patients, we found a high degree of variability for both the rearranged SegDup sequences and the exact chromosomal breakpoint locations, which coincide with various transposons within the 22q11.2 SegDups, suggesting that 22q11DS can be driven by transposon-mediated genome recombination. Guided by CTLR-Seq results from two 22q11DS patients, we performed three-dimensional chromosomal folding analysis for the 22q11.2 SegDups from patient-derived neurons and astrocytes and found chromosome interactions anchored within the SegDups to be both cell type-specific and patient-specific. Lastly, we demonstrated that CTLR-Seq enables cell-type specific analysis of DNA methylation patterns within the deletion haplotype of 22q11DS.
Subject(s)
DiGeorge Syndrome , Humans , DiGeorge Syndrome/genetics , CRISPR-Cas Systems , Chromosome Breakpoints , Chromosomes, Human, Pair 22/genetics , Genome, Human , Gene Rearrangement , Sequence Analysis, DNA/methods , Chromosome Deletion
ABSTRACT
Pediatric acute-onset neuropsychiatric syndrome (PANS) is an abrupt-onset neuropsychiatric disorder. PANS patients have an increased prevalence of comorbid autoimmune illness, most commonly arthritis. In addition, an estimated one-third of PANS patients present with low serum C4 protein, suggesting decreased production or increased consumption of C4 protein. To test the possibility that copy number (CN) variation contributes to risk of PANS illness, we compared mean total C4A and total C4B CN in ethnically matched subjects from PANS DNA samples and controls (192 cases and 182 controls). Longitudinal data from the Stanford PANS cohort (n = 121) were used to assess whether the time to juvenile idiopathic arthritis (JIA) or autoimmune disease (AI) onset was a function of total C4A or C4B CN. Lastly, we performed several hypothesis-generating analyses to explore the correlation between individual C4 gene variants, sex, specific genotypes, and age of PANS onset. Although the mean total C4A or C4B CN did not differ in PANS compared to controls, PANS patients with low C4B CN were at increased risk for subsequent JIA diagnosis (hazard ratio = 2.7, p value = 0.004). We also observed a possible increase in risk for AI in PANS patients and a possible correlation between lower C4B and PANS age of onset. An association between rheumatoid arthritis and low C4B CN has been reported previously. However, patients with PANS develop different types of JIA: enthesitis-related arthritis, spondyloarthritis, and psoriatic arthritis. This suggests that C4B plays a role that spans these arthritis types.
Subject(s)
Arthritis , Complement C4b , Humans , Child , Complement C4b/genetics , Complement C4a/genetics , Gene Dosage , Genotype , Arthritis/genetics
ABSTRACT
In both Turner syndrome (TS) and Klinefelter syndrome (KS), copy number aberrations of the X chromosome lead to various developmental symptoms. We report a comparative analysis of TS vs. KS regarding differences at the genomic network level measured in primary samples by analyzing gene expression, DNA methylation, and chromatin conformation. X-chromosome inactivation (XCI) silences transcription from one X chromosome in female mammals, on which most genes are inactive, and some genes escape from XCI. In TS, almost all differentially expressed escape genes are down-regulated but most differentially expressed inactive genes are up-regulated. In KS, differentially expressed escape genes are up-regulated while the majority of inactive genes appear unchanged. Interestingly, 94 differentially expressed genes (DEGs) overlapped between the TS vs. female and KS vs. male comparisons, and these almost uniformly display expression changes in opposite directions. DEGs on the X chromosome and the autosomes are coexpressed in both syndromes, indicating that there are molecular ripple effects of the changes in X chromosome dosage. Six potential candidate genes (RPS4X, SEPT6, NKRF, CXorf57, NAA10, and FLNA) for KS are identified on Xq, as well as candidate central genes on Xp for TS. Only promoters of inactive genes are differentially methylated in both syndromes while escape gene promoters remain unchanged. The intrachromosomal contact map of the X chromosome in TS exhibits the structure of an active X chromosome. The discovery of shared DEGs indicates the existence of common molecular mechanisms for gene regulation in TS and KS that transmit the gene dosage changes to the transcriptome.
Subject(s)
Gene Dosage , Gene Expression Regulation , Genomics , Klinefelter Syndrome/genetics , Turner Syndrome/genetics , X Chromosome , Animals , Chromatin/chemistry , Chromosomes, Human, X , DNA Methylation , Female , Filamins , Humans , Karyotype , Male , Mammals/genetics , N-Terminal Acetyltransferase A , N-Terminal Acetyltransferase E , Protein Serine-Threonine Kinases/genetics , Receptor, PAR-2 , Repressor Proteins/genetics , Septins , Transcriptome/genetics , X Chromosome Inactivation
ABSTRACT
K562 is widely used in biomedical research. It is one of the three tier-one cell lines of ENCODE and is also the cell line most commonly used for large-scale CRISPR/Cas9 screens. Although its functional genomic and epigenomic characteristics have been extensively studied, its genome sequence and genomic structural features have never been comprehensively analyzed. Such information is essential for the correct interpretation and understanding of the vast troves of existing functional genomics and epigenomics data for K562. We performed and integrated deep-coverage whole-genome (short-insert), mate-pair, and linked-read sequencing as well as karyotyping and array CGH analysis to identify a wide spectrum of genome characteristics in K562: copy numbers (CN) of aneuploid chromosome segments at high-resolution, SNVs and indels (both corrected for CN in aneuploid regions), loss of heterozygosity, megabase-scale phased haplotypes often spanning entire chromosome arms, structural variants (SVs), including small and large-scale complex SVs and nonreference retrotransposon insertions. Many SVs were phased, assembled, and experimentally validated. We identified multiple allele-specific deletions and duplications within the tumor suppressor gene FHIT. Taking aneuploidy into account, we reanalyzed K562 RNA-seq and whole-genome bisulfite sequencing data for allele-specific expression and allele-specific DNA methylation. We also show examples of how deeper insights into regulatory complexity are gained by integrating genomic variant information and structural context with functional genomics and epigenomics data. Furthermore, using K562 haplotype information, we produced an allele-specific CRISPR targeting map. This comprehensive whole-genome analysis serves as a resource for future studies that utilize K562 as well as a framework for the analysis of other cancer genomes.
Subject(s)
Genome, Human , Humans , K562 Cells , Karyotype , Polymorphism, Genetic , Whole Genome Sequencing
ABSTRACT
HepG2 is one of the most widely used human cancer cell lines in biomedical research and one of the main cell lines of ENCODE. Although the functional genomic and epigenomic characteristics of HepG2 are extensively studied, its genome sequence has never been comprehensively analyzed and higher order genomic structural features are largely unknown. The high degree of aneuploidy in HepG2 renders traditional genome variant analysis methods challenging and partially ineffective. Correct and complete interpretation of the extensive functional genomics data from HepG2 requires an understanding of the cell line's genome sequence and genome structure. Using a variety of sequencing and analysis methods, we identified a wide spectrum of genome characteristics in HepG2: copy numbers of chromosomal segments at high resolution, SNVs and Indels (corrected for aneuploidy), regions with loss of heterozygosity, phased haplotypes extending to entire chromosome arms, retrotransposon insertions and structural variants (SVs) including complex and somatic genomic rearrangements. A large number of SVs were phased, sequence assembled and experimentally validated. We re-analyzed published HepG2 datasets for allele-specific expression and DNA methylation and assembled an allele-specific CRISPR/Cas9 targeting map. We demonstrate how deeper insights into genomic regulatory complexity are gained by adopting a genome-integrated framework.
Subject(s)
Chromosome Mapping/methods , Genome, Human , Genomics/methods , Haplotypes , Sequence Analysis, DNA/statistics & numerical data , Alleles , Aneuploidy , DNA Methylation , Genomic Structural Variation , Hep G2 Cells , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Karyotyping , Loss of Heterozygosity , Polymorphism, Single Nucleotide , Retroelements
ABSTRACT
Few studies have been conducted to understand post-zygotic accumulation of mutations in cells of the healthy human body. We reprogrammed 32 skin fibroblast cells from families of donors into human induced pluripotent stem cell (hiPSC) lines. The clonal nature of hiPSC lines allows a high-resolution analysis of the genomes of the founder fibroblast cells without being confounded by the artifacts of single-cell whole-genome amplification. We estimate that on average a fibroblast cell in children has 1035 mostly benign mosaic SNVs. On average, 235 SNVs could be directly confirmed in the original fibroblast population by ultradeep sequencing, down to an allele frequency (AF) of 0.1%. More sensitive droplet digital PCR experiments confirmed more SNVs as mosaic with AF as low as 0.01%, suggesting that 1035 mosaic SNVs per fibroblast cell is the true average. Similar analyses in adults revealed no significant increase in the number of SNVs per cell, suggesting that a major fraction of mosaic SNVs in fibroblasts arises during development. Mosaic SNVs were distributed uniformly across the genome and were enriched in a mutational signature previously observed in cancers and in de novo variants and which, we hypothesize, is a hallmark of normal cell proliferation. Finally, AF distribution of mosaic SNVs had distinct narrow peaks, which could be a characteristic of clonal cell selection, clonal expansion, or both. These findings reveal a large degree of somatic mosaicism in healthy human tissues, link de novo and cancer mutations to somatic mosaicism, and couple somatic mosaicism with cell proliferation.
Subject(s)
Clonal Evolution , DNA Copy Number Variations , Fibroblasts/cytology , Mosaicism , Mutation Accumulation , Cell Proliferation , Cells, Cultured , Fibroblasts/metabolism , Humans , Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/metabolism , Skin/cytology
ABSTRACT
BACKGROUND: Copy number variation (CNV) analysis is an integral component of the study of human genomes in both research and clinical settings. Array-based CNV analysis is the current first-tier approach in clinical cytogenetics. Decreasing costs in high-throughput sequencing and cloud computing have opened doors for the development of sequencing-based CNV analysis pipelines with fast turnaround times. We carry out a systematic and quantitative comparative analysis for several low-coverage whole-genome sequencing (WGS) strategies to detect CNV in the human genome. METHODS: We compared the CNV detection capabilities of WGS strategies (short insert, 3 kb insert mate pair and 5 kb insert mate pair) each at 1×, 3× and 5× coverages relative to each other and to 17 currently used high-density oligonucleotide arrays. For benchmarking, we used a set of gold standard (GS) CNVs generated for the 1000 Genomes Project CEU subject NA12878. RESULTS: Overall, low-coverage WGS strategies detect drastically more GS CNVs compared with arrays and are accompanied with smaller percentages of CNV calls without validation. Furthermore, we show that WGS (at ≥1× coverage) is able to detect all seven GS deletion CNVs >100 kb in NA12878, whereas only one is detected by most arrays. Lastly, we show that the much larger 15 Mbp Cri du chat deletion can be readily detected with short-insert paired-end WGS at even just 1× coverage. CONCLUSIONS: CNV analysis using low-coverage WGS is efficient and outperforms the array-based analysis that is currently used for clinical cytogenetics.
Subject(s)
Comparative Genomic Hybridization , DNA Copy Number Variations , Genome, Human , Genomics , Whole Genome Sequencing , Comparative Genomic Hybridization/methods , Comparative Genomic Hybridization/standards , Genetic Association Studies/methods , Genetic Association Studies/standards , Genetic Predisposition to Disease , Genetic Testing , Genomics/methods , Genomics/standards , Humans , Reference Standards , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
Sleep quality declines with age; however, the underlying mechanisms remain elusive. We found that hyperexcitable hypocretin/orexin (Hcrt/OX) neurons drive sleep fragmentation during aging. In aged mice, Hcrt neurons exhibited more frequent neuronal activity epochs driving wake bouts, and optogenetic activation of Hcrt neurons elicited more prolonged wakefulness. Aged Hcrt neurons showed hyperexcitability with lower KCNQ2 expression and impaired M-current, mediated by KCNQ2/3 channels. Single-nucleus RNA-sequencing revealed adaptive changes to Hcrt neuron loss in the aging brain. Disruption of Kcnq2/3 genes in Hcrt neurons of young mice destabilized sleep, mimicking aging-associated sleep fragmentation, whereas the KCNQ-selective activator flupirtine hyperpolarized Hcrt neurons and rejuvenated sleep architecture in aged mice. Our findings demonstrate a mechanism underlying sleep instability during aging and a strategy to improve sleep continuity.
Subject(s)
Aging , Neurons/physiology , Orexins/physiology , Sleep Deprivation/physiopathology , Sleep , Wakefulness , Aminopyridines/pharmacology , Animals , CRISPR-Cas Systems , Electroencephalography , Electromyography , Female , Hypothalamic Area, Lateral/physiopathology , KCNQ2 Potassium Channel/genetics , KCNQ2 Potassium Channel/metabolism , KCNQ3 Potassium Channel/genetics , KCNQ3 Potassium Channel/metabolism , Male , Mice , Narcolepsy/genetics , Narcolepsy/physiopathology , Nerve Tissue Proteins/genetics , Nerve Tissue Proteins/metabolism , Neural Pathways , Optogenetics , Patch-Clamp Techniques , RNA-Seq , Sleep Quality
ABSTRACT
We analyzed 131 human brains (44 neurotypical, 19 with Tourette syndrome, 9 with schizophrenia, and 59 with autism) for somatic mutations after whole genome sequencing to a depth of more than 200×. Typically, brains had 20 to 60 detectable single-nucleotide mutations, but ~6% of brains harbored hundreds of somatic mutations. Hypermutability was associated with age and damaging mutations in genes implicated in cancers and, in some brains, reflected in vivo clonal expansions. Somatic duplications, likely arising during development, were found in ~5% of normal and diseased brains, reflecting background mutagenesis. Brains with autism were associated with mutations creating putative transcription factor binding motifs in enhancer-like regions in the developing brain. The top-ranked affected motifs corresponded to MEIS (myeloid ectopic viral integration site) transcription factors, suggesting a potential link between their involvement in gene regulation and autism.
Subject(s)
Aging , Autistic Disorder , Brain , Mutagenesis , Transcription Factors , Aging/genetics , Autistic Disorder/genetics , Enhancer Elements, Genetic/genetics , Gene Expression Regulation , Humans , Mutation , Protein Binding/genetics , Transcription Factors/genetics , Whole Genome Sequencing
ABSTRACT
BACKGROUND: The 15q13.3 microdeletion is associated with several neuropsychiatric disorders, including autism and schizophrenia. Previous association and functional studies have investigated the potential role of several genes within the deletion in neuronal dysfunction, but the molecular effects of the deletion as a whole remain largely unknown. METHODS: Induced pluripotent stem cells, from 3 patients with the 15q13.3 microdeletion and 3 control subjects, were generated and converted into induced neurons. We analyzed the effects of the 15q13.3 microdeletion on genome-wide gene expression, DNA methylation, chromatin accessibility, and sensitivity to cisplatin-induced DNA damage. Furthermore, we measured gene expression changes in induced neurons with CRISPR (clustered regularly interspaced short palindromic repeats) knockouts of individual 15q13.3 microdeletion genes. RESULTS: In both induced pluripotent stem cells and induced neurons, gene copy number change within the 15q13.3 microdeletion was accompanied by significantly decreased gene expression and no compensatory changes in DNA methylation or chromatin accessibility, supporting the model that haploinsufficiency of genes within the deleted region drives the disorder. Furthermore, we observed global effects of the microdeletion on the transcriptome and epigenome, with disruptions in several neuropsychiatric disorder-associated pathways and gene families, including Wnt signaling, ribosome function, DNA binding, and clustered protocadherins. Individual gene knockouts mirrored many of the observed changes in an overlapping fashion between knockouts. CONCLUSIONS: Our multiomics analysis of the 15q13.3 microdeletion revealed downstream effects in pathways previously associated with neuropsychiatric disorders and indications of interactions between genes within the deletion. This molecular systems analysis can be applied to other chromosomal aberrations to further our etiological understanding of neuropsychiatric disorders.
Subject(s)
Chromosome Disorders , Epigenome , Chromosome Deletion , Chromosome Disorders/genetics , Chromosomes, Human, Pair 15/genetics , Humans , Intellectual Disability , Neurons , Seizures , Transcriptome
ABSTRACT
Structural variation in the complement 4 gene (C4) confers genetic risk for schizophrenia. The variation includes increased C4A copy number, which predicts increased C4A mRNA expression. C4-anaphylatoxin (C4-ana) is a C4 protein fragment released upon C4 protein activation that has the potential to change the blood-brain barrier (BBB). We hypothesized that elevated plasma levels of C4-ana occur in individuals with schizophrenia (iSCZ). Blood was collected from 15 iSCZ with illness duration < 5 years and from 14 healthy controls (HC). Plasma C4-ana was measured by radioimmunoassay. Other complement activation products C3-ana, C5-ana, and terminal complement complex (TCC) were also measured. Droplet digital PCR was used to determine C4 gene structural variation state. Recombinant C4-ana was added to primary brain endothelial cells (BEC) and permeability was measured in vitro. C4-ana concentration was elevated in plasma from iSCZ compared to HC (mean = 654 ± 16 ng/mL and 557 ± 94 ng/mL, respectively; p = 0.01). The patients also carried more copies of the C4AL gene and demonstrated a positive correlation between plasma C4-ana concentrations and C4A gene copy number. Furthermore, C4-ana increased the permeability of a monolayer of BEC in vitro. Our findings are consistent with a specific role for C4A protein in schizophrenia and raise the possibility that its activation product, C4-ana, increases BBB permeability. Exploratory analyses suggest the novel hypothesis that the relationship between C4-ana levels and C4A gene copy number could also be altered in iSCZ, suggesting an interaction with unknown genetic and/or environmental risk factors.
Subject(s)
Complement C4 , Schizophrenia , Complement C4/genetics , Complement C4a/genetics , Endothelial Cells , Genetic Predisposition to Disease , Humans , Schizophrenia/blood , Schizophrenia/genetics
ABSTRACT
Retrotransposons can cause somatic genome variation in the human nervous system, which is hypothesized to have relevance to brain development and neuropsychiatric disease. However, the detection of individual somatic mobile element insertions presents a difficult signal-to-noise problem. Using a machine-learning method (RetroSom) and deep whole-genome sequencing, we analyzed L1 and Alu retrotransposition in sorted neurons and glia from human brains. We characterized two brain-specific L1 insertions in neurons and glia from a donor with schizophrenia. There was anatomical distribution of the L1 insertions in neurons and glia across both hemispheres, indicating retrotransposition occurred during early embryogenesis. Both insertions were within the introns of genes (CNNM2 and FRMD4A) inside genomic loci associated with neuropsychiatric disorders. Proof-of-principle experiments revealed these L1 insertions significantly reduced gene expression. These results demonstrate that RetroSom has broad applications for studies of brain development and may provide insight into the possible pathological effects of somatic retrotransposition.
Subject(s)
Machine Learning , Mutagenesis, Insertional/genetics , Neuroglia , Neurons , Adaptor Proteins, Signal Transducing/genetics , Adult , Cation Transport Proteins/genetics , Embryonic Development/genetics , Female , Genome/genetics , HeLa Cells , High-Throughput Nucleotide Sequencing , Humans , Long Interspersed Nucleotide Elements , Mental Disorders/genetics , Pregnancy , Retroelements , Schizophrenia/genetics
ABSTRACT
BACKGROUND: Post-zygotic mutations incurred during DNA replication, DNA repair, and other cellular processes lead to somatic mosaicism. Somatic mosaicism is an established cause of various diseases, including cancers. However, detecting mosaic variants in DNA from non-cancerous somatic tissues poses significant challenges, particularly if the variants only are present in a small fraction of cells. RESULTS: Here, the Brain Somatic Mosaicism Network conducts a coordinated, multi-institutional study to examine the ability of existing methods to detect simulated somatic single-nucleotide variants (SNVs) in DNA mixing experiments, generate multiple replicates of whole-genome sequencing data from the dorsolateral prefrontal cortex, other brain regions, dura mater, and dural fibroblasts of a single neurotypical individual, devise strategies to discover somatic SNVs, and apply various approaches to validate somatic SNVs. These efforts lead to the identification of 43 bona fide somatic SNVs that range in variant allele fractions from ~ 0.005 to ~ 0.28. Guided by these results, we devise best practices for calling mosaic SNVs from 250× whole-genome sequencing data in the accessible portion of the human genome that achieve 90% specificity and sensitivity. Finally, we demonstrate that analysis of multiple bulk DNA samples from a single individual allows the reconstruction of early developmental cell lineage trees. CONCLUSIONS: This study provides a unified set of best practices to detect somatic SNVs in non-cancerous tissues. The data and methods are freely available to the scientific community and should serve as a guide to assess the contributions of somatic SNVs to neuropsychiatric diseases.
Subject(s)
Brain/metabolism , Genetic Association Studies , Genetic Variation , Alleles , Chromosome Mapping , Computational Biology/methods , Genetic Association Studies/methods , Genomics/methods , Germ Cells/metabolism , High-Throughput Nucleotide Sequencing , Humans , Organ Specificity/genetics , Polymorphism, Single Nucleotide
ABSTRACT
We produced an extensive collection of deep re-sequencing datasets for the Venter/HuRef genome using the Illumina massively-parallel DNA sequencing platform. The original Venter genome sequence is a very-high quality phased assembly based on Sanger sequencing. Therefore, researchers developing novel computational tools for the analysis of human genome sequence variation for the dominant Illumina sequencing technology can test and hone their algorithms by making variant calls from these Venter/HuRef datasets and then immediately confirm the detected variants in the Sanger assembly, freeing them of the need for further experimental validation. This process also applies to implementing and benchmarking existing genome analysis pipelines. We prepared and sequenced 200 bp and 350 bp short-insert whole-genome sequencing libraries (sequenced to 100x and 40x genomic coverages respectively) as well as 2 kb, 5 kb, and 12 kb mate-pair libraries (49x, 122x, and 145x physical coverages respectively). Lastly, we produced a linked-read library (128x physical coverage) from which we also performed haplotype phasing.
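The abstract above quotes sequence coverage for the short-insert libraries and physical coverage for the mate-pair and linked-read libraries. The distinction can be illustrated with a minimal Python sketch. It uses the common textbook definitions (sequenced bases per genome base versus spanned fragment bases per genome base), which are assumed here rather than taken from the paper. The function names, the genome size of 3.1 Gb, and the read counts are hypothetical values chosen only for illustration.

```python
def sequence_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each genome base is covered by a sequenced base."""
    return n_reads * read_length / genome_size


def physical_coverage(n_pairs: int, insert_size: int, genome_size: int) -> float:
    """Average number of read pairs whose fragment spans each genome base.

    The unsequenced interior of each fragment counts toward the span, so
    long-insert mate-pair libraries reach high physical coverage with
    comparatively few reads.
    """
    return n_pairs * insert_size / genome_size


if __name__ == "__main__":
    GENOME_SIZE = 3_100_000_000  # assumed approximate human genome size in bp

    # Hypothetical library sizes, not figures from the abstract.
    print(sequence_coverage(31_000_000, 100, GENOME_SIZE))
    print(physical_coverage(3_100_000, 5_000, GENOME_SIZE))
```

Under these definitions, a mate-pair library with long inserts can have a physical coverage far higher than its sequence coverage, which is why the two kinds of libraries are reported with different coverage measures.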
Subject(s)
Benchmarking/methods , Genome, Human , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA/standards , Algorithms , Gene Library , Genetic Variation , Humans
ABSTRACT
Here, we describe approaches using droplet digital polymerase chain reaction (ddPCR) to validate and quantify somatic mosaic events contributed by transposable-element insertions, copy-number variants, and single-nucleotide variants. In the ddPCR assay, sample or template DNA is partitioned into tens of thousands of individual droplets such that when DNA input is low, the vast majority of droplets contains no more than one copy of template DNA. PCR takes place in each individual droplet and produces a fluorescent readout to indicate the presence or absence of the target of interest allowing for the accurate "counting" of the number of copies present in the sample. The number of partitions is large enough to assay somatic mosaic events with frequencies down to less than 1%.
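The "counting" step described above is usually done with a Poisson correction, because a positive droplet may hold more than one template copy. The Python sketch below illustrates that standard correction. It is not the authors' protocol: the function names are hypothetical, and the fractional-abundance calculation assumes that one assay detects the variant allele and a second assay detects the reference allele in the same set of droplets.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per droplet.

    If copies are distributed randomly, the fraction of negative droplets
    is exp(-lambda), so lambda = -ln(1 - positive / total).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def fractional_abundance(variant_positive: int, reference_positive: int, total: int) -> float:
    """Variant copies as a fraction of all copies (variant + reference)."""
    variant = copies_per_droplet(variant_positive, total)
    reference = copies_per_droplet(reference_positive, total)
    if variant + reference == 0:
        raise ValueError("no positive droplets for either assay")
    return variant / (variant + reference)


if __name__ == "__main__":
    # Hypothetical droplet counts, chosen only for illustration.
    print(copies_per_droplet(10_000, 20_000))       # about ln(2) copies per droplet
    print(fractional_abundance(50, 9_000, 20_000))  # a low-frequency mosaic variant
```

At low DNA input the correction is small, since nearly every positive droplet holds a single copy. It matters more for the abundant reference target, which occupies a large share of the droplets.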
Subject(s)
DNA Copy Number Variations/genetics , DNA Transposable Elements/genetics , DNA/isolation & purification , Mosaicism , Polymerase Chain Reaction/methods , DNA/genetics , Humans , Polymorphism, Single Nucleotide
ABSTRACT
Microduplication of chromosome 1q21.1 is observed in ~0.03% of adults. It has a highly variable, incompletely penetrant phenotype that can include intellectual disability, global developmental delay, specific learning disabilities, autism, schizophrenia, heart anomalies and dysmorphic features. We evaluated a 10-year-old male with a 1q21.1 duplication by CGH microarray. He presented with major attention deficits, phonological dysphasia, poor fine motor skills, dysmorphia and mild autistic features, but not the typical macrocephaly. Neuropsychiatric evaluation demonstrated a novel phenotype: an unusually large discrepancy between non-verbal capacities (borderline-impaired WISC-IV index scores of 70 for Working Memory and 68 for Processing Speed) vs. strong verbal skills - scores of 126 for Verbal Comprehension (superior) and 111 for Perceptual Reasoning (normal). HYDIN2 has been hypothesized to underlie macrocephaly and perhaps cognitive deficits in this syndrome, but assessment of HYDIN2 copy number by microarray is difficult because of extensive segmental duplications. We performed whole-genome sequencing which supported HYDIN2 duplication (chr1:146,370,001-148,590,000, 2.22 Mb, hg38). To evaluate copy number more rigorously we developed droplet digital PCR assays of HYDIN2 (targeting unique 1 kb and 6 kb insertions) and its paralog HYDIN (targeting a unique 154 bp segment outside the HYDIN2 overlap). In an independent cohort, ddPCR was concordant with previous microarray data. Duplication of HYDIN2 was confirmed in the patient by ddPCR. This case demonstrates that a large discrepancy of verbal and non-verbal abilities can occur in 1q21.1 duplication syndrome, but it remains unclear whether this has a specific genomic basis. These ddPCR assays may be useful for future research on HYDIN2 copy number.
ABSTRACT
Somatic mosaicism in the human brain may alter function of individual neurons. We analyzed genomes of single cells from the forebrains of three human fetuses (15 to 21 weeks postconception) using clonal cell populations. We detected 200 to 400 single-nucleotide variations (SNVs) per cell. SNV patterns resembled those found in cancer cell genomes, indicating a role of background mutagenesis in cancer. SNVs with a frequency of >2% in brain were also present in the spleen, revealing a pregastrulation origin. We reconstructed cell lineages for the first five postzygotic cleavages and calculated a mutation rate of ~1.3 mutations per division per cell. Later in development, during neurogenesis, the mutation spectrum shifted toward oxidative damage, and the mutation rate increased. Both neurogenesis and early embryogenesis exhibit substantially more mutagenesis than adulthood.
Subject(s)
Brain/embryology , Gastrulation/genetics , Mosaicism , Mutagenesis , Mutation Rate , Neurogenesis/genetics , Cell Lineage/genetics , Genome, Human , Humans , Mutation , Neoplasms/genetics , Neurons , Polymorphism, Single Nucleotide , Single-Cell Analysis
ABSTRACT
Three common mutations in the CARD15 (NOD2) gene are known to be associated with susceptibility to Crohn disease (CD), and genetic data suggest a gene dosage model with an increased risk of 2-4-fold in heterozygotes and 20-40-fold in homozygotes. However, the discovery of numerous rare variants of CARD15 indicates that some heterozygotes for the common mutations have a rare mutation on the other CARD15 allele, which would support a recessive model for CD. We addressed this issue by screening CARD15 for mutations in 100 CD patients who were heterozygous for one of the three common mutations. We also developed a strategy for evaluating potential disease susceptibility alleles (DSAs) that involves assessing the degree of evolutionary conservation of involved residues, predicted effects on protein structure and function, and genotyping in a large sample of cases and controls. The evolutionary analysis was aided by sequencing the entire coding region of CARD15 in three primates (chimp, gibbon, and tamarin) and aligning the human sequence with these and orthologs from other species. We found that 11 of the 100 CD patients screened had a second potential pathogenic mutation within the exonic and periexonic sequences examined. Assuming that there are no additional pathogenic mutations in noncoding regions, our study suggests that most carriers of the common DSAs are true heterozygotes, and supports previous evidence for a gene dosage model. Four novel nonsynonymous mutations were detected, one of which, c.2686C>T (p.Arg896X), would produce premature termination of translation. Two potential DSAs--c.2107C>T (p.Arg703Cys) and g.2238T>A (c.74-7T>A)--were significantly associated with CD in the case control sample. Analysis of the evolution of CARD15 revealed strong conservation of the encoded protein, with identity to the human sequence ranging from 99.1% in the chimp to 44.5% in fugu. Higher primates possess an open reading frame (ORF) upstream of the putative initiation site in other species that encodes a further 27 N-terminal amino acids, while four regions of high conservation are observed outside of the known domains of CARD15, indicative of additional residues of functional importance. The strategy developed here may have general application to the assessment of mutation pathogenicity and genetic models in other complex disorders.