Search | BVS Integralidade em Saúde

1.

Cell-free DNA Comprises an In Vivo Nucleosome Footprint that Informs Its Tissues-Of-Origin.

Snyder, Matthew W; Kircher, Martin; Hill, Andrew J; Daza, Riza M; Shendure, Jay.

Cell ; 164(1-2): 57-68, 2016 Jan 14.

Article in English | MEDLINE | ID: mdl-26771485

ABSTRACT

Nucleosome positioning varies between cell types. By deep sequencing cell-free DNA (cfDNA), isolated from circulating blood plasma, we generated maps of genome-wide in vivo nucleosome occupancy and found that short cfDNA fragments harbor footprints of transcription factors. The cfDNA nucleosome occupancies correlate well with the nuclear architecture, gene structure, and expression observed in cells, suggesting that they could inform the cell type of origin. Nucleosome spacing inferred from cfDNA in healthy individuals correlates most strongly with epigenetic features of lymphoid and myeloid cells, consistent with hematopoietic cell death as the normal source of cfDNA. We build on this observation to show how nucleosome footprints can be used to infer cell types contributing to cfDNA in pathological states such as cancer. Since this strategy does not rely on genetic differences to distinguish between contributing tissues, it may enable the noninvasive monitoring of a much broader set of clinical conditions than currently possible.

Subjects

DNA/chemistry; Nucleosomes/chemistry; Organ Specificity; CCCTC-Binding Factor; Cell Line; Chromatin Assembly and Disassembly; DNA/metabolism; DNA Footprinting; Genome, Human; Genome-Wide Association Study; Humans; Neoplasms/genetics; Repressor Proteins/metabolism; Sequence Analysis, DNA

2.

Aberrant phase separation and nucleolar dysfunction in rare genetic diseases.

Mensah, Martin A; Niskanen, Henri; Magalhaes, Alexandre P; Basu, Shaon; Kircher, Martin; Sczakiel, Henrike L; Reiter, Alisa M V; Elsner, Jonas; Meinecke, Peter; Biskup, Saskia; Chung, Brian H Y; Dombrowsky, Gregor; Eckmann-Scholz, Christel; Hitz, Marc Phillip; Hoischen, Alexander; Holterhus, Paul-Martin; Hülsemann, Wiebke; Kahrizi, Kimia; Kalscheuer, Vera M; Kan, Anita; Krumbiegel, Mandy; Kurth, Ingo; Leubner, Jonas; Longardt, Ann Carolin; Moritz, Jörg D; Najmabadi, Hossein; Skipalova, Karolina; Snijders Blok, Lot; Tzschach, Andreas; Wiedersberg, Eberhard; Zenker, Martin; Garcia-Cabau, Carla; Buschow, René; Salvatella, Xavier; Kraushar, Matthew L; Mundlos, Stefan; Caliebe, Almuth; Spielmann, Malte; Horn, Denise; Hnisz, Denes.

Nature ; 614(7948): 564-571, 2023 Feb.

Article in English | MEDLINE | ID: mdl-36755093

ABSTRACT

Thousands of genetic variants in protein-coding genes have been linked to disease. However, the functional impact of most variants is unknown as they occur within intrinsically disordered protein regions that have poorly defined functions [1-3]. Intrinsically disordered regions can mediate phase separation and the formation of biomolecular condensates, such as the nucleolus [4,5]. This suggests that mutations in disordered proteins may alter condensate properties and function [6-8]. Here we show that a subset of disease-associated variants in disordered regions alter phase separation, cause mispartitioning into the nucleolus and disrupt nucleolar function. We discover de novo frameshift variants in HMGB1 that cause brachyphalangy, polydactyly and tibial aplasia syndrome, a rare complex malformation syndrome. The frameshifts replace the intrinsically disordered acidic tail of HMGB1 with an arginine-rich basic tail. The mutant tail alters HMGB1 phase separation, enhances its partitioning into the nucleolus and causes nucleolar dysfunction. We built a catalogue of more than 200,000 variants in disordered carboxy-terminal tails and identified more than 600 frameshifts that create arginine-rich basic tails in transcription factors and other proteins. For 12 out of the 13 disease-associated variants tested, the mutation enhanced partitioning into the nucleolus, and several variants altered rRNA biogenesis. These data identify the cause of a rare complex syndrome and suggest that a large number of genetic variants may dysregulate nucleoli and other biomolecular condensates in humans.

Subjects

Cell Nucleolus; HMGB1 Protein; Humans; Arginine/genetics; Arginine/metabolism; Cell Nucleolus/genetics; Cell Nucleolus/metabolism; Cell Nucleolus/pathology; HMGB1 Protein/chemistry; HMGB1 Protein/genetics; HMGB1 Protein/metabolism; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/genetics; Intrinsically Disordered Proteins/metabolism; Syndrome; Frameshift Mutation; Phase Transition

3.

STIGMA: Single-cell tissue-specific gene prioritization using machine learning.

Balachandran, Saranya; Prada-Medina, Cesar A; Mensah, Martin A; Kakar, Naseebullah; Nagel, Inga; Pozojevic, Jelena; Audain, Enrique; Hitz, Marc-Phillip; Kircher, Martin; Sreenivasan, Varun K A; Spielmann, Malte.

Am J Hum Genet ; 111(2): 338-349, 2024 Feb 01.

Article in English | MEDLINE | ID: mdl-38228144

ABSTRACT

Clinical exome and genome sequencing have revolutionized the understanding of human disease genetics. Yet many genes remain functionally uncharacterized, complicating the establishment of causal disease links for genetic variants. While several scoring methods have been devised to prioritize these candidate genes, these methods fall short of capturing the expression heterogeneity across cell subpopulations within tissues. Here, we introduce single-cell tissue-specific gene prioritization using machine learning (STIGMA), an approach that leverages single-cell RNA-seq (scRNA-seq) data to prioritize candidate genes associated with rare congenital diseases. STIGMA prioritizes genes by learning the temporal dynamics of gene expression across cell types during healthy organogenesis. To assess the efficacy of our framework, we applied STIGMA to mouse limb and human fetal heart scRNA-seq datasets. In a cohort of individuals with congenital limb malformation, STIGMA prioritized 469 variants in 345 genes, with UBA2 as a notable example. For congenital heart defects, we detected 34 genes harboring nonsynonymous de novo variants (nsDNVs) in two or more individuals from a set of 7,958 individuals, including the ortholog of Prdm1, which is associated with hypoplastic left ventricle and hypoplastic aortic arch. Overall, our findings demonstrate that STIGMA effectively prioritizes tissue-specific candidate genes by utilizing single-cell transcriptome data. The ability to capture the heterogeneity of gene expression across cell populations makes STIGMA a powerful tool for the discovery of disease-associated genes and facilitates the identification of causal variants underlying human genetic disorders.

Subjects

Heart Defects, Congenital; Transcriptome; Humans; Animals; Mice; Exome/genetics; Heart Defects, Congenital/genetics; Exome Sequencing; Machine Learning; Single-Cell Analysis/methods; Ubiquitin-Activating Enzymes/genetics

4.

CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions.

Schubach, Max; Maass, Thorben; Nazaretyan, Lusiné; Röner, Sebastian; Kircher, Martin.

Nucleic Acids Res ; 52(D1): D1143-D1154, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-38183205

ABSTRACT

Machine Learning-based scoring and classification of genetic variants aids the assessment of clinical findings and is employed to prioritize variants in diverse genetic studies and analyses. Combined Annotation-Dependent Depletion (CADD) is one of the first methods for the genome-wide prioritization of variants across different molecular functions and has been continuously developed and improved since its original publication. Here, we present our most recent release, CADD v1.7. We explored and integrated new annotation features, among them state-of-the-art protein language model scores (Meta ESM-1v), regulatory variant effect predictions (from sequence-based convolutional neural networks) and sequence conservation scores (Zoonomia). We evaluated the new version on data sets derived from ClinVar, ExAC/gnomAD and 1000 Genomes variants. For coding effects, we tested CADD on 31 Deep Mutational Scanning (DMS) data sets from ProteinGym and, for regulatory effect prediction, we used saturation mutagenesis reporter assay data of promoter and enhancer sequences. The inclusion of new features further improved the overall performance of CADD. As with previous releases, all data sets, genome-wide CADD v1.7 scores, scripts for on-site scoring and an easy-to-use webserver are readily provided via https://cadd.bihealth.org/ or https://cadd.gs.washington.edu/ to the community.

Subjects

Genetic Variation; Genome, Human; Machine Learning; Software; Nucleotides; Humans

5.

A framework to score the effects of structural variants in health and disease.

Kleinert, Philip; Kircher, Martin.

Genome Res ; 32(4): 766-777, 2022 Apr.

Article in English | MEDLINE | ID: mdl-35197310

ABSTRACT

Although technological advances improved the identification of structural variants (SVs) in the human genome, their interpretation remains challenging. Several methods utilize individual mechanistic principles like the deletion of coding sequence or 3D genome architecture disruptions. However, a comprehensive tool using the broad spectrum of available annotations is missing. Here, we describe CADD-SV, a method to retrieve and integrate a wide set of annotations to predict the effects of SVs. Previously, supervised learning approaches were limited due to a small number and biased set of annotated pathogenic or benign SVs. We overcome this problem by using a surrogate training objective, the Combined Annotation Dependent Depletion (CADD) of functional variants. We use human- and chimpanzee-derived SVs as proxy-neutral and contrast them with matched simulated variants as proxy-deleterious, an approach that has proven powerful for short sequence variants. Our tool computes summary statistics over diverse variant annotations and uses random forest models to prioritize deleterious structural variants. The resulting CADD-SV scores correlate with known pathogenic and rare population variants. We further show that we can prioritize somatic cancer variants as well as noncoding variants known to affect gene expression. We provide a website and offline-scoring tool for easy application of CADD-SV.

Subjects

Genome, Human; Humans

6.

Predicting the pathogenicity of missense variants using features derived from AlphaFold2.

Schmidt, Axel; Röner, Sebastian; Mai, Karola; Klinkhammer, Hannah; Kircher, Martin; Ludwig, Kerstin U.

Bioinformatics ; 39(5), 2023 May 04.

Article in English | MEDLINE | ID: mdl-37084271

ABSTRACT

MOTIVATION: Missense variants are a frequent class of variation within the coding genome, and some of them cause Mendelian diseases. Despite advances in computational prediction, classifying missense variants into pathogenic or benign remains a major challenge in the context of personalized medicine. Recently, the structure of the human proteome was derived with unprecedented accuracy using the artificial intelligence system AlphaFold2. This raises the question of whether AlphaFold2 wild-type structures can improve the accuracy of computational pathogenicity prediction for missense variants. RESULTS: To address this, we first engineered a set of features for each amino acid from these structures. We then trained a random forest to distinguish between relatively common (proxy-benign) and singleton (proxy-pathogenic) missense variants from gnomAD v3.1. This yielded a novel AlphaFold2-based pathogenicity prediction score, termed AlphScore. Important feature classes used by AlphScore are solvent accessibility, amino acid network related features, features describing the physicochemical environment, and AlphaFold2's quality parameter (predicted local distance difference test). AlphScore alone showed lower performance than existing in silico scores used for missense prediction, such as CADD or REVEL. However, when AlphScore was added to those scores, the performance increased, as measured by the approximation of deep mutational scan data, as well as the prediction of expert-curated missense variants from the ClinVar database. Overall, our data indicate that the integration of AlphaFold2-predicted structures can improve pathogenicity prediction of missense variants. AVAILABILITY AND IMPLEMENTATION: AlphScore, combinations of AlphScore with existing scores, as well as variants used for training and testing are publicly available.
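The proxy-labelling idea described in the abstract (relatively common variants as proxy-benign, singletons as proxy-pathogenic) might be sketched roughly as follows. This is an illustration only, not the authors' code: the function name `proxy_label`, the `common_af` cutoff, and the choice to exclude variants that are neither singleton nor common are all assumptions, since the abstract does not state them.

```python
def proxy_label(allele_count, allele_frequency, common_af=0.001):
    """Return a proxy training label, or None if the variant is left out.

    common_af is a hypothetical cutoff for "relatively common";
    the abstract does not give a value.
    """
    if allele_count == 1:
        return "proxy-pathogenic"  # observed once in the cohort (singleton)
    if allele_frequency >= common_af:
        return "proxy-benign"      # relatively common in the population
    return None                    # assumption: in-between variants are excluded


# Hypothetical variants as (allele count, allele frequency) pairs.
for ac, af in [(1, 0.00001), (500, 0.01), (3, 0.00002)]:
    print(ac, af, proxy_label(ac, af))
```

A classifier such as the random forest mentioned in the abstract would then be trained on structure-derived features using these labels as the target.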

Subjects

Artificial Intelligence; Computational Biology; Humans; Virulence; Mutation, Missense; Mutation

7.

A systematic evaluation of the design and context dependencies of massively parallel reporter assays.

Klein, Jason C; Agarwal, Vikram; Inoue, Fumitaka; Keith, Aidan; Martin, Beth; Kircher, Martin; Ahituv, Nadav; Shendure, Jay.

Nat Methods ; 17(11): 1083-1091, 2020 Nov.

Article in English | MEDLINE | ID: mdl-33046894

ABSTRACT

Massively parallel reporter assays (MPRAs) functionally screen thousands of sequences for regulatory activity in parallel. To date, there are limited studies that systematically compare differences in MPRA design. Here, we screen a library of 2,440 candidate liver enhancers and controls for regulatory activity in HepG2 cells using nine different MPRA designs. We identify subtle but significant differences that correlate with epigenetic and sequence-level features, as well as differences in dynamic range and reproducibility. We also validate that enhancer activity is largely independent of orientation, at least for our library and designs. Finally, we assemble and test the same enhancers as 192-mers, 354-mers and 678-mers and observe sizable differences. This work provides a framework for the experimental design of high-throughput reporter assays, suggesting that the extended sequence context of tested elements and to a lesser degree the precise assay, influence MPRA results.

Subjects

Gene Library; Genes, Reporter; High-Throughput Nucleotide Sequencing/methods; Regulatory Sequences, Nucleic Acid; Sequence Analysis, DNA/methods; Enhancer Elements, Genetic; Hep G2 Cells; Humans; Reproducibility of Results; Transcription Factors/genetics

8.

Boosting tissue-specific prediction of active cis-regulatory regions through deep learning and Bayesian optimization techniques.

Cappelletti, Luca; Petrini, Alessandro; Gliozzo, Jessica; Casiraghi, Elena; Schubach, Max; Kircher, Martin; Valentini, Giorgio.

BMC Bioinformatics ; 23(Suppl 2): 154, 2022 Dec 12.

Article in English | MEDLINE | ID: mdl-36510125

ABSTRACT

BACKGROUND: Cis-regulatory regions (CRRs) are non-coding regions of the DNA that finely control the spatio-temporal pattern of transcription; they are involved in a wide range of pivotal processes such as the development of specific cell lines/tissues and the dynamic cell response to physiological stimuli. Recent studies showed that genetic variants occurring in CRRs are strongly correlated with pathogenicity or deleteriousness. Considering the central role of CRRs in the regulation of physiological and pathological conditions, the correct identification of CRRs and of their tissue-specific activity status through Machine Learning methods plays a major role in dissecting the impact of genetic variants on human diseases. Unfortunately, the problem is still open, though some promising results have already been reported by (deep) machine-learning-based methods that predict active promoters and enhancers in specific tissues or cell lines by encoding epigenetic or spectral features directly extracted from DNA sequences. RESULTS: We present the experiments we performed to compare two Deep Neural Networks, a Feed-Forward Neural Network model working on epigenomic features, and a Convolutional Neural Network model working only on genomic sequence, aimed at identifying enhancer and promoter activity in specific cell lines. While performing experiments to understand how the experimental setup influences the prediction performance of the methods, we particularly focused on (1) automatic model selection performed by Bayesian optimization and (2) exploring different data rebalancing setups for reducing the negative effects of class imbalance. CONCLUSIONS: Results show that (1) automatic model selection by Bayesian optimization improves the quality of the learner; (2) data rebalancing considerably impacts the prediction performance of the models; test set rebalancing may provide over-optimistic results, and should therefore be cautiously applied; (3) despite working only on sequence data, convolutional models obtain performance close to that of feed-forward models working on epigenomic information, which suggests that sequence data also carries informative content for CRR-activity prediction. We therefore suggest combining both models/data types in future work.

Subjects

Deep Learning; Humans; Bayes Theorem; Regulatory Sequences, Nucleic Acid; Neural Networks, Computer; Machine Learning

9.

STIGMA: Single-cell tissue-specific gene prioritization using machine learning.

Balachandran, Saranya; Prada-Medina, Cesar A; Mensah, Martin A; Glaser, Juliane; Kakar, Naseebullah; Nagel, Inga; Pozojevic, Jelena; Audain, Enrique; Hitz, Marc-Phillip; Kircher, Martin; Sreenivasan, Varun K A; Spielmann, Malte.

Am J Hum Genet ; 111(3): 618, 2024 Mar 07.

Article in English | MEDLINE | ID: mdl-38458167

10.

GGC Repeat Expansion and Exon 1 Methylation of XYLT1 Is a Common Pathogenic Variant in Baratela-Scott Syndrome.

LaCroix, Amy J; Stabley, Deborah; Sahraoui, Rebecca; Adam, Margaret P; Mehaffey, Michele; Kernan, Kelly; Myers, Candace T; Fagerstrom, Carrie; Anadiotis, George; Akkari, Yassmine M; Robbins, Katherine M; Gripp, Karen W; Baratela, Wagner A R; Bober, Michael B; Duker, Angela L; Doherty, Dan; Dempsey, Jennifer C; Miller, Daniel G; Kircher, Martin; Bamshad, Michael J; Nickerson, Deborah A; Mefford, Heather C; Sol-Church, Katia.

Am J Hum Genet ; 104(1): 35-44, 2019 Jan 03.

Article in English | MEDLINE | ID: mdl-30554721

ABSTRACT

Baratela-Scott syndrome (BSS) is a rare, autosomal-recessive disorder characterized by short stature, facial dysmorphisms, developmental delay, and skeletal dysplasia caused by pathogenic variants in XYLT1. We report clinical and molecular investigation of 10 families (12 individuals) with BSS. Standard sequencing methods identified biallelic pathogenic variants in XYLT1 in only two families. Of the remaining cohort, two probands had no variants and six probands had only a single variant, including four with a heterozygous 3.1 Mb 16p13 deletion encompassing XYLT1 and two with a heterozygous truncating variant. Bisulfite sequencing revealed aberrant hypermethylation in exon 1 of XYLT1, always in trans with the sequence variant or deletion when present; both alleles were methylated in those with no identified variant. Expression of the methylated XYLT1 allele was severely reduced in fibroblasts from two probands. Southern blot studies combined with repeat expansion analysis of genome sequence data showed that the hypermethylation is associated with expansion of a GGC repeat in the XYLT1 promoter region that is not present in the reference genome, confirming that BSS is a trinucleotide repeat expansion disorder. The hypermethylated allele accounts for 50% of disease alleles in our cohort and is not present in 130 control subjects. Our study highlights the importance of investigating non-sequence-based alterations, including epigenetic changes, to identify the missing heritability in genetic disorders.
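The 50% figure can be checked against the counts given in the abstract. The tally below is a sketch under two assumptions that the abstract does not spell out: one proband (two disease alleles) is counted per family, and "in trans" means exactly one hypermethylated allele per single-variant proband. The variable names are illustrative.

```python
# Allele tally from the counts stated in the abstract (assumption: one
# proband, hence two disease alleles, is counted per family).
families = 10
biallelic_sequence_variant = 2  # families solved by standard sequencing
single_variant = 6              # 4 with the 16p13 deletion + 2 truncating
no_variant = 2                  # no sequence variant identified

assert biallelic_sequence_variant + single_variant + no_variant == families

total_alleles = 2 * families
# One methylated allele in trans per single-variant proband, and both
# alleles methylated where no sequence variant was identified.
methylated_alleles = single_variant * 1 + no_variant * 2
methylated_fraction = methylated_alleles / total_alleles

print(f"{methylated_alleles}/{total_alleles} = {methylated_fraction:.0%}")
# prints "10/20 = 50%"
```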

Subjects

Abnormalities, Multiple/genetics; DNA Methylation/genetics; Epigenesis, Genetic/genetics; Exons/genetics; Mutation; Pentosyltransferases/genetics; Trinucleotide Repeat Expansion/genetics; Alleles; Blotting, Southern; Cohort Studies; Female; Humans; Infant; Infant, Newborn; Male; Pedigree; Sulfites/metabolism; Syndrome; UDP Xylose-Protein Xylosyltransferase

11.

Ancient gene flow from early modern humans into Eastern Neanderthals.

Kuhlwilm, Martin; Gronau, Ilan; Hubisz, Melissa J; de Filippo, Cesare; Prado-Martinez, Javier; Kircher, Martin; Fu, Qiaomei; Burbano, Hernán A; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Rudan, Pavao; Brajkovic, Dejana; Kucan, Zeljko; Gusic, Ivan; Marques-Bonet, Tomas; Andrés, Aida M; Viola, Bence; Pääbo, Svante; Meyer, Matthias; Siepel, Adam; Castellano, Sergi.

Nature ; 530(7591): 429-33, 2016 Feb 25.

Article in English | MEDLINE | ID: mdl-26886800

ABSTRACT

It has been shown that Neanderthals contributed genetically to modern humans outside Africa 47,000-65,000 years ago. Here we analyse the genomes of a Neanderthal and a Denisovan from the Altai Mountains in Siberia together with the sequences of chromosome 21 of two Neanderthals from Spain and Croatia. We find that a population that diverged early from other modern humans in Africa contributed genetically to the ancestors of Neanderthals from the Altai Mountains roughly 100,000 years ago. By contrast, we do not detect such a genetic contribution in the Denisovan or the two European Neanderthals. We conclude that in addition to later interbreeding events, the ancestors of Neanderthals from the Altai Mountains and early modern humans met and interbred, possibly in the Near East, many thousands of years earlier than previously thought.

Subjects

Gene Flow/genetics; Neanderthals/genetics; Altitude; Animals; Bayes Theorem; Chromosomes, Human, Pair 21/genetics; Croatia/ethnology; Genome, Human/genetics; Genomics; Haplotypes/genetics; Heterozygote; Humans; Hybridization, Genetic/genetics; Phylogeny; Population Density; Siberia; Spain/ethnology; Time Factors

12.

Systematic assays and resources for the functional annotation of non-coding variants.

Kircher, Martin; Ludwig, Kerstin U.

Med Genet ; 34(4): 275-286, 2022 Dec 31.

Article in English | MEDLINE | ID: mdl-37034418

ABSTRACT

Identification of genetic variation in individual genomes is now a routine procedure in human genetic research and diagnostics. For many variants, however, insufficient evidence is available to establish a pathogenic effect, particularly for variants in non-coding regions. Furthermore, the sheer number of candidate variants renders testing in individual assays virtually impossible. While scalable approaches are being developed, the selection of methods and resources, and the application of a given framework to a particular disease or trait remain major challenges. This limits the translation of results from both genome-wide association studies and genome sequencing. Here, we discuss computational and experimental approaches available for functional annotation of non-coding variation.

13.

Bi-allelic POLR3A Loss-of-Function Variants Cause Autosomal-Recessive Wiedemann-Rautenstrauch Syndrome.

Wambach, Jennifer A; Wegner, Daniel J; Patni, Nivedita; Kircher, Martin; Willing, Marcia C; Baldridge, Dustin; Xing, Chao; Agarwal, Anil K; Vergano, Samantha A Schrier; Patel, Chirag; Grange, Dorothy K; Kenney, Amy; Najaf, Tasnim; Nickerson, Deborah A; Bamshad, Michael J; Cole, F Sessions; Garg, Abhimanyu.

Am J Hum Genet ; 103(6): 968-975, 2018 Dec 06.

Article in English | MEDLINE | ID: mdl-30414627

ABSTRACT

Wiedemann-Rautenstrauch syndrome (WRS), also known as neonatal progeroid syndrome, is a rare disorder of unknown etiology. It has been proposed to be autosomal-recessive and is characterized by variable clinical features, such as intrauterine growth restriction and poor postnatal weight gain, characteristic facial features (triangular appearance to the face, convex nasal profile or pinched nose, and small mouth), widened fontanelles, pseudohydrocephalus, prominent scalp veins, lipodystrophy, and teeth abnormalities. A previous report described a single WRS patient with bi-allelic truncating and splicing variants in POLR3A. Here we present seven additional infants, children, and adults with WRS and bi-allelic truncating and/or splicing variants in POLR3A. POLR3A, the largest subunit of RNA polymerase III, is a DNA-directed RNA polymerase that transcribes many small noncoding RNAs that regulate transcription, RNA processing, and translation. Bi-allelic missense variants in POLR3A have been associated with phenotypes distinct from WRS: hypogonadotropic hypogonadism and hypomyelinating leukodystrophy with or without oligodontia. Our findings confirm the association of bi-allelic POLR3A variants with WRS, expand the clinical phenotype of WRS, and suggest specific POLR3A genotypes associated with WRS and hypomyelinating leukodystrophy.

Subjects

Fetal Growth Retardation/genetics; Genetic Variation/genetics; Loss of Heterozygosity/genetics; Progeria/genetics; RNA Polymerase III/genetics; Adolescent; Adult; Alleles; Child, Preschool; Female; Genotype; Humans; Phenotype; Young Adult

14.

HemoMIPs-Automated analysis and result reporting pipeline for targeted sequencing data.

Kleinert, Philip; Martin, Beth; Kircher, Martin.

PLoS Comput Biol ; 16(6): e1007956, 2020 Jun.

Article in English | MEDLINE | ID: mdl-32497118

ABSTRACT

Targeted sequencing of genomic regions is a cost- and time-efficient approach for screening patient cohorts. We present a fast and efficient workflow to analyze highly imbalanced, targeted next-generation sequencing data generated using molecular inversion probe (MIP) capture. Our Snakemake pipeline performs sample demultiplexing, overlap paired-end merging, alignment, MIP-arm trimming, variant calling, coverage analysis and report generation. Further, we support the analysis of probes specifically designed to capture certain structural variants and can assign sex using Y-chromosome-unique probes. In a user-friendly HTML report, we summarize all these results including covered, incomplete or missing regions, called variants and their predicted effects. We developed and tested our pipeline using the hemophilia A & B MIP design from the "My Life, Our Future" initiative. HemoMIPs is available as an open-source tool on GitHub at: https://github.com/kircherlab/hemoMIPs.

Subjects

Automation; Chromosomes, Human, Y; Genetic Testing/methods; High-Throughput Nucleotide Sequencing/methods; Cohort Studies; Humans; Male; Programming Languages

15.

CADD: predicting the deleteriousness of variants throughout the human genome.

Rentzsch, Philipp; Witten, Daniela; Cooper, Gregory M; Shendure, Jay; Kircher, Martin.

Nucleic Acids Res ; 47(D1): D886-D894, 2019 Jan 08.

Article in English | MEDLINE | ID: mdl-30371827

ABSTRACT

Combined Annotation-Dependent Depletion (CADD) is a widely used measure of variant deleteriousness that can effectively prioritize causal variants in genetic analyses, particularly highly penetrant contributors to severe Mendelian disorders. CADD is an integrative annotation built from more than 60 genomic features, and can score human single nucleotide variants and short insertion and deletions anywhere in the reference assembly. CADD uses a machine learning model trained on a binary distinction between simulated de novo variants and variants that have arisen and become fixed in human populations since the split between humans and chimpanzees; the former are free of selective pressure and may thus include both neutral and deleterious alleles, while the latter are overwhelmingly neutral (or, at most, weakly deleterious) by virtue of having survived millions of years of purifying selection. Here we review the latest updates to CADD, including the most recent version, 1.4, which supports the human genome build GRCh38. We also present updates to our website that include simplified variant lookup, extended documentation, an Application Program Interface and improved mechanisms for integrating CADD scores into other tools or applications. CADD scores, software and documentation are available at https://cadd.gs.washington.edu.

Subjects

Databases, Nucleic Acid; Genetic Variation; Genome, Human; Humans; Machine Learning; Molecular Sequence Annotation

16.

A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity.

Inoue, Fumitaka; Kircher, Martin; Martin, Beth; Cooper, Gregory M; Witten, Daniela M; McManus, Michael T; Ahituv, Nadav; Shendure, Jay.

Genome Res ; 27(1): 38-52, 2017 Jan.

Article in English | MEDLINE | ID: mdl-27831498

ABSTRACT

Candidate enhancers can be identified on the basis of chromatin modifications, the binding of chromatin modifiers and transcription factors and cofactors, or chromatin accessibility. However, validating such candidates as bona fide enhancers requires functional characterization, typically achieved through reporter assays that test whether a sequence can increase expression of a transcriptional reporter via a minimal promoter. A longstanding concern is that reporter assays are mainly implemented on episomes, which are thought to lack physiological chromatin. However, the magnitude and determinants of differences in cis-regulation for regulatory sequences residing in episomes versus chromosomes remain almost completely unknown. To address this systematically, we developed and applied a novel lentivirus-based massively parallel reporter assay (lentiMPRA) to directly compare the functional activities of 2236 candidate liver enhancers in an episomal versus a chromosomally integrated context. We find that the activities of chromosomally integrated sequences are substantially different from the activities of the identical sequences assayed on episomes, and furthermore are correlated with different subsets of ENCODE annotations. The results of chromosomally based reporter assays are also more reproducible and more strongly predictable by both ENCODE annotations and sequence-based models. With a linear model that combines chromatin annotations and sequence information, we achieve a Pearson's R² of 0.362 for predicting the results of chromosomally integrated reporter assays. This level of prediction is better than with either chromatin annotations or sequence information alone and also outperforms predictive models of episomal assays. Our results have broad implications for how cis-regulatory elements are identified, prioritized and functionally validated.
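For readers unfamiliar with the reported metric, the squared Pearson correlation between predicted and measured activities can be computed as sketched below. This is a generic illustration, not the authors' analysis code; the function name `pearson_r2` and the toy numbers are made up for the example.

```python
def pearson_r2(predicted, measured):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_p * var_m)


# Toy values (hypothetical, not data from the study).
predicted = [0.1, 0.4, 0.35, 0.8]
measured = [0.2, 0.3, 0.5, 0.7]
print(round(pearson_r2(predicted, measured), 3))
```

Because the value is squared, it ranges from 0 to 1 and does not distinguish positive from negative correlation.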

Subjects

Chromatin/genetics; Enhancer Elements, Genetic/genetics; Gene Expression Regulation/genetics; Plasmids/genetics; Chromatin Assembly and Disassembly/genetics; Chromosomes/genetics; Genes, Reporter; High-Throughput Nucleotide Sequencing; Humans; Promoter Regions, Genetic; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors

17.

The complete genome sequence of a Neanderthal from the Altai Mountains.

Prüfer, Kay; Racimo, Fernando; Patterson, Nick; Jay, Flora; Sankararaman, Sriram; Sawyer, Susanna; Heinze, Anja; Renaud, Gabriel; Sudmant, Peter H; de Filippo, Cesare; Li, Heng; Mallick, Swapan; Dannemann, Michael; Fu, Qiaomei; Kircher, Martin; Kuhlwilm, Martin; Lachmann, Michael; Meyer, Matthias; Ongyerth, Matthias; Siebauer, Michael; Theunert, Christoph; Tandon, Arti; Moorjani, Priya; Pickrell, Joseph; Mullikin, James C; Vohr, Samuel H; Green, Richard E; Hellmann, Ines; Johnson, Philip L F; Blanche, Hélène; Cann, Howard; Kitzman, Jacob O; Shendure, Jay; Eichler, Evan E; Lein, Ed S; Bakken, Trygve E; Golovanova, Liubov V; Doronichev, Vladimir B; Shunkov, Michael V; Derevianko, Anatoli P; Viola, Bence; Slatkin, Montgomery; Reich, David; Kelso, Janet; Pääbo, Svante.

Nature ; 505(7481): 43-9, 2014 Jan 02.

Article in English | MEDLINE | ID: mdl-24352235

ABSTRACT

We present a high-quality genome sequence of a Neanderthal woman from Siberia. We show that her parents were related at the level of half-siblings and that mating among close relatives was common among her recent ancestors. We also sequenced the genome of a Neanderthal from the Caucasus to low coverage. An analysis of the relationships and population history of available archaic genomes and 25 present-day human genomes shows that several gene flow events occurred among Neanderthals, Denisovans and early modern humans, possibly including gene flow into Denisovans from an unknown archaic group. Thus, interbreeding, albeit of low magnitude, occurred among many hominin groups in the Late Pleistocene. In addition, the high-quality Neanderthal genome allows us to establish a definitive list of substitutions that became fixed in modern humans after their separation from the ancestors of Neanderthals and Denisovans.

Subjects

Fossils , Genome/genetics , Neanderthals/genetics , Africa , Animals , Caves , DNA Copy Number Variations/genetics , Female , Gene Flow/genetics , Gene Frequency , Heterozygote , Humans , Inbreeding , Models, Genetic , Neanderthals/classification , Phylogeny , Population Density , Siberia/ethnology , Toe Phalanges/anatomy & histology

18.

Concurrent genome and epigenome editing by CRISPR-mediated sequence replacement.

Alexander, Jes; Findlay, Gregory M; Kircher, Martin; Shendure, Jay.

BMC Biol ; 17(1): 90, 2019 Nov 18.

Article in English | MEDLINE | ID: mdl-31739790

ABSTRACT

BACKGROUND: Recent advances in genome editing have facilitated the direct manipulation of not only the genome, but also the epigenome. Genome editing is typically performed by introducing a single CRISPR/Cas9-mediated double-strand break (DSB), followed by non-homologous end joining (NHEJ)- or homology-directed repair-mediated repair. Epigenome editing, and in particular methylation of CpG dinucleotides, can be performed using catalytically inactive Cas9 (dCas9) fused to a methyltransferase domain. However, for investigations of the role of methylation in gene silencing, studies based on dCas9-methyltransferase have limited resolution and are potentially confounded by the effects of binding of the fusion protein. As an alternative strategy for epigenome editing, we tested CRISPR/Cas9 dual cutting of the genome in the presence of in vitro methylated exogenous DNA, with the aim of driving replacement of the DNA sequence intervening the dual cuts via NHEJ. RESULTS: In a proof of concept at the HPRT1 promoter, successful replacement events with heavily methylated alleles of a CpG island resulted in functional silencing of the HPRT1 gene. Although still limited in efficiency, our study demonstrates concurrent epigenome and genome editing in a single event. CONCLUSIONS: This study opens the door to investigations of the functional consequences of methylation patterns at single CpG dinucleotide resolution. Our results furthermore support the conclusion that promoter methylation is sufficient to functionally silence gene expression.

Subjects

CRISPR-Cas Systems/genetics , Ctenophora/genetics , Gene Editing/methods , Genome/genetics , Animals , Base Sequence , Epigenome/genetics

19.

Integration of multiple epigenomic marks improves prediction of variant impact in saturation mutagenesis reporter assay.

Shigaki, Dustin; Adato, Orit; Adhikari, Aashish N; Dong, Shengcheng; Hawkins-Hooker, Alex; Inoue, Fumitaka; Juven-Gershon, Tamar; Kenlay, Henry; Martin, Beth; Patra, Ayoti; Penzar, Dmitry D; Schubach, Max; Xiong, Chenling; Yan, Zhongxia; Boyle, Alan P; Kreimer, Anat; Kulakovskiy, Ivan V; Reid, John; Unger, Ron; Yosef, Nir; Shendure, Jay; Ahituv, Nadav; Kircher, Martin; Beer, Michael A.

Hum Mutat ; 40(9): 1280-1291, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31106481

ABSTRACT

The integrative analysis of high-throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease-associated human enhancers and nine disease-associated promoters. A library of mutations covering all bases was generated by saturation mutagenesis and altered activity was assessed in a massively parallel reporter assay (MPRA) in relevant cell lines. Reporter expression was measured relative to plasmid DNA to determine the impact of variants. The challenge was to predict the functional effects of variants on reporter expression. Comparative analysis of the full range of submitted prediction results identifies the most successful models of transcription factor binding sites, machine learning algorithms, and ways to choose among or incorporate diverse datatypes and cell-types for training computational models. These results have the potential to improve the design of future studies on more diverse sets of regulatory elements and aid the interpretation of disease-associated genetic variation.

Subjects

DNA/chemistry , Epigenomics/methods , Point Mutation , Binding Sites , Cell Line , Chromatin/genetics , DNA/metabolism , Enhancer Elements, Genetic , Genetic Predisposition to Disease , Humans , Machine Learning , Promoter Regions, Genetic , Transcription Factors/metabolism

20.

Mutations in the translocon-associated protein complex subunit SSR3 cause a novel congenital disorder of glycosylation.

Ng, Bobby G; Lourenço, Charles M; Losfeld, Marie-Estelle; Buckingham, Kati J; Kircher, Martin; Nickerson, Deborah A; Shendure, Jay; Bamshad, Michael J; Freeze, Hudson H.

J Inherit Metab Dis ; 42(5): 993-997, 2019 Sep.

Article in English | MEDLINE | ID: mdl-30945312

ABSTRACT

The translocon-associated protein (TRAP) complex facilitates the translocation of proteins across the endoplasmic reticulum membrane and associates with the oligosaccharyl transferase (OST) complex to maintain proper glycosylation of nascent polypeptides. Pathogenic variants in either complex cause a group of rare genetic disorders termed congenital disorders of glycosylation (CDG). We report an individual who presented with severe intellectual and developmental disabilities and sensorineural deafness with an unsolved type I CDG, and sought to identify the underlying genetic basis. Exome sequencing identified a novel homozygous variant c.278_281delAGGA [p.Glu93Valfs*7] in the signal sequence receptor 3 (SSR3) subunit of the TRAP complex. Biochemical studies in patient fibroblasts showed that the variant destabilized the TRAP complex, with a complete loss of SSR3 protein and partial loss of SSR1 and SSR4. Importantly, all subunit levels were corrected by expression of wild-type SSR3. Abnormal glycosylation status in fibroblasts was confirmed using two marker proteins, GP130 and ICAM1. Our findings confirm that mutations in SSR3 cause a novel CDG. A novel frameshift variant in the translocon-associated protein, SSR3, disrupts the stability of the TRAP complex and causes a novel congenital disorder of glycosylation.

Subjects

Calcium-Binding Proteins/genetics , Congenital Disorders of Glycosylation/genetics , Developmental Disabilities/etiology , Membrane Glycoproteins/genetics , Mutation , Receptors, Cytoplasmic and Nuclear/genetics , Receptors, Peptide/genetics , Child, Preschool , Congenital Disorders of Glycosylation/pathology , Exome , Glycosylation , Homozygote , Humans , Male
