Search | BVS Regional Portal

1.

Genomic data resources of the Brain Somatic Mosaicism Network for neuropsychiatric diseases.

Garrison, McKinzie A; Jang, Yeongjun; Bae, Taejeong; Cherskov, Adriana; Emery, Sarah B; Fasching, Liana; Jones, Attila; Moldovan, John B; Molitor, Cindy; Pochareddy, Sirisha; Peters, Mette A; Shin, Joo Heon; Wang, Yifan; Yang, Xiaoxu; Akbarian, Schahram; Chess, Andrew; Gage, Fred H; Gleeson, Joseph G; Kidd, Jeffrey M; McConnell, Michael; Mills, Ryan E; Moran, John V; Park, Peter J; Sestan, Nenad; Urban, Alexander E; Vaccarino, Flora M; Walsh, Christopher A; Weinberger, Daniel R; Wheelan, Sarah J; Abyzov, Alexej.

Sci Data ; 10(1): 813, 2023 11 20.

Article in English | MEDLINE | ID: mdl-37985666

ABSTRACT

Somatic mosaicism is defined as an occurrence of two or more populations of cells having genomic sequences differing at given loci in an individual who is derived from a single zygote. It is a characteristic of multicellular organisms that plays a crucial role in normal development and disease. To study the nature and extent of somatic mosaicism in autism spectrum disorder, bipolar disorder, focal cortical dysplasia, schizophrenia, and Tourette syndrome, a multi-institutional consortium called the Brain Somatic Mosaicism Network (BSMN) was formed through the National Institute of Mental Health (NIMH). In addition to genomic data of affected and neurotypical brains, the BSMN also developed and validated a best practices somatic single nucleotide variant calling workflow through the analysis of reference brain tissue. These resources, which include >400 terabytes of data from 1087 subjects, are now available to the research community via the NIMH Data Archive (NDA) and are described here.

Subjects

Mental Disorders , Humans , Autism Spectrum Disorder/genetics , Brain , Genomics , Mosaicism , Human Genome , Mental Disorders/genetics

2.

Candida albicans selection for human commensalism results in substantial within-host diversity without decreasing fitness for invasive disease.

Anderson, Faith M; Visser, Noelle D; Amses, Kevin R; Hodgins-Davis, Andrea; Weber, Alexandra M; Metzner, Katura M; McFadden, Michael J; Mills, Ryan E; O'Meara, Matthew J; James, Timothy Y; O'Meara, Teresa R.

PLoS Biol ; 21(5): e3001822, 2023 05.

Article in English | MEDLINE | ID: mdl-37205709

ABSTRACT

Candida albicans is a frequent colonizer of human mucosal surfaces as well as an opportunistic pathogen. C. albicans is remarkably versatile in its ability to colonize diverse host sites with differences in oxygen and nutrient availability, pH, immune responses, and resident microbes, among other cues. It is unclear how the genetic background of a commensal colonizing population can influence the shift to pathogenicity. Therefore, we examined 910 commensal isolates from 35 healthy donors to identify host niche-specific adaptations. We demonstrate that healthy people are reservoirs for genotypically and phenotypically diverse C. albicans strains. Using limited diversity exploitation, we identified a single nucleotide change in the uncharacterized ZMS1 transcription factor that was sufficient to drive hyper invasion into agar. We found that SC5314 was significantly different from the majority of both commensal and bloodstream isolates in its ability to induce host cell death. However, our commensal strains retained the capacity to cause disease in the Galleria model of systemic infection, including outcompeting the SC5314 reference strain during systemic competition assays. This study provides a global view of commensal strain variation and within-host strain diversity of C. albicans and suggests that selection for commensalism in humans does not result in a fitness cost for invasive disease.

Subjects

Candida albicans , Symbiosis , Humans , Candida albicans/genetics , Transcription Factors/genetics , Gene Expression Regulation

3.

Mapping the Complex Genetic Landscape of Human Neurons.

Sun, Chen; Kathuria, Kunal; Emery, Sarah B; Kim, ByungJun; Burbulis, Ian E; Shin, Joo Heon; Weinberger, Daniel R; Moran, John V; Kidd, Jeffrey M; Mills, Ryan E; McConnell, Michael J.

bioRxiv ; 2023 Mar 07.

Article in English | MEDLINE | ID: mdl-36945473

ABSTRACT

When somatic cells acquire complex karyotypes, they are removed by the immune system. Mutant somatic cells that evade immune surveillance can lead to cancer. Neurons with complex karyotypes arise during neurotypical brain development, but neurons are almost never the origin of brain cancers. Instead, somatic mutations in neurons can bring about neurodevelopmental disorders, and contribute to the polygenic landscape of neuropsychiatric and neurodegenerative disease. A subset of human neurons harbors idiosyncratic copy number variants (CNVs, "CNV neurons"), but previous analyses of CNV neurons have been limited by relatively small sample sizes. Here, we developed an allele-based validation approach, SCOVAL, to corroborate or reject read-depth based CNV calls in single human neurons. We applied this approach to 2,125 frontal cortical neurons from a neurotypical human brain. This approach identified 226 CNV neurons, as well as a class of CNV neurons with complex karyotypes containing whole or substantial losses on multiple chromosomes. Moreover, we found that CNV location appears to be nonrandom. Recurrent regions of neuronal genome rearrangement contained fewer, but longer, genes.

4.

Somatic nuclear mitochondrial DNA insertions are prevalent in the human brain and accumulate over time in fibroblasts.

Zhou, Weichen; Karan, Kalpita R; Gu, Wenjin; Klein, Hans-Ulrich; Sturm, Gabriel; De Jager, Philip L; Bennett, David A; Hirano, Michio; Picard, Martin; Mills, Ryan E.

bioRxiv ; 2023 Apr 21.

Article in English | MEDLINE | ID: mdl-36778249

ABSTRACT

The transfer of mitochondrial DNA into the nuclear genomes of eukaryotes (Numts) has been linked to lifespan in non-human species [1-3] and recently demonstrated to occur in rare instances from one human generation to the next [4]. Here we investigated numtogenesis dynamics in humans in two ways. First, we quantified Numts in 1,187 post-mortem brain and blood samples from different individuals. Compared to circulating immune cells (n=389), post-mitotic brain tissue (n=798) contained more Numts, consistent with their potential somatic accumulation. Within brain samples we observed a 5.5-fold enrichment of somatic Numt insertions in the dorsolateral prefrontal cortex compared to cerebellum samples, suggesting that brain Numts arose spontaneously during development or across the lifespan. Moreover, a higher number of brain Numts was linked to earlier mortality. The brains of individuals with no cognitive impairment who died at younger ages carried approximately 2 more Numts per decade of life lost than those who lived longer. Second, we tested the dynamic transfer of Numts using a repeated-measures WGS design in a human fibroblast model that recapitulates several molecular hallmarks of aging [5]. These longitudinal experiments revealed a gradual accumulation of one Numt every ~13 days. Numtogenesis was independent of large-scale genomic instability and unlikely to be driven by cell clonality. Targeted pharmacological perturbations including chronic glucocorticoid signaling or impairing mitochondrial oxidative phosphorylation (OxPhos) only modestly increased the rate of numtogenesis, whereas patient-derived SURF1-mutant cells exhibiting mtDNA instability accumulated Numts 4.7-fold faster than healthy donors. Combined, our data document spontaneous numtogenesis in human cells and demonstrate an association between brain cortical somatic Numts and human lifespan. These findings open the possibility that mito-nuclear horizontal gene transfer among human post-mitotic tissues produces functionally relevant human Numts over timescales shorter than previously assumed.

5.

Analysis of Human Papilloma Virus Content and Integration in Mucoepidermoid Carcinoma.

Gu, Wenjin; Bhangale, Apurva; Heft Neal, Molly E; Smith, Josh D; Brummel, Collin; McHugh, Jonathan B; Spector, Matthew E; Mills, Ryan E; Brenner, J Chad.

Viruses ; 14(11), 2022 10 26.

Article in English | MEDLINE | ID: mdl-36366450

ABSTRACT

Mucoepidermoid Carcinomas (MEC) represent the most common malignancies of salivary glands. Approximately 50% of all MEC cases are known to harbor CRTC1/3-MAML2 gene fusions, but the additional molecular drivers remain largely uncharacterized. Here, we sought to resolve controversy around the role of human papillomavirus (HPV) as a potential driver of mucoepidermoid carcinoma. Bioinformatics analysis was performed on 48 MEC transcriptomes. Subsequent targeted capture DNA sequencing was used to annotate HPV content and integration status in the host genome. HPV of any type was only identified in 1/48 (2%) of the MEC transcriptomes analyzed. Importantly, the one HPV16+ tumor expressed high levels of p16, had high expression of HPV16 oncogenes E6 and E7, and displayed a complex integration pattern that included breakpoints into 13 host genes including PIK3AP1, HIPI, OLFM4, SIRT1, ARAP2, TMEM161B-AS1, and EPS15L1 as well as 9 non-genic regions. In this cohort, HPV is a rare driver of MEC but may have a substantial etiologic role in cases that harbor the virus. Genetic mechanisms of host genome integration are similar to those observed in other head and neck cancers.

Subjects

Alphapapillomavirus , Mucoepidermoid Carcinoma , Papillomavirus Infections , Humans , Mucoepidermoid Carcinoma/genetics , Mucoepidermoid Carcinoma/metabolism , Mucoepidermoid Carcinoma/pathology , DNA-Binding Proteins/genetics , Papillomaviridae/genetics , Trans-Activators/genetics , Nuclear Proteins/genetics , Transcription Factors/genetics

6.

Early HPV ctDNA Kinetics and Imaging Biomarkers Predict Therapeutic Response in p16+ Oropharyngeal Squamous Cell Carcinoma.

Cao, Yue; Haring, Catherine T; Brummel, Collin; Bhambhani, Chandan; Aryal, Madhava; Lee, Choonik; Heft Neal, Molly; Bhangale, Apurva; Gu, Wenjin; Casper, Keith; Malloy, Kelly; Sun, Yilun; Shuman, Andrew; Prince, Mark E; Spector, Matthew E; Chinn, Steven; Shah, Jennifer; Schonewolf, Caitlin; McHugh, Jonathan B; Mills, Ryan E; Tewari, Muneesh; Worden, Francis P; Swiecicki, Paul L; Mierzwa, Michelle; Brenner, J Chad.

Clin Cancer Res ; 28(2): 350-359, 2022 01 15.

Article in English | MEDLINE | ID: mdl-34702772

ABSTRACT

PURPOSE: In locally advanced p16+ oropharyngeal squamous cell carcinoma (OPSCC), (i) to investigate kinetics of human papillomavirus (HPV) circulating tumor DNA (ctDNA) and association with tumor progression after chemoradiation, and (ii) to compare the predictive value of ctDNA to imaging biomarkers of MRI and FDG-PET. EXPERIMENTAL DESIGN: Serial blood samples were collected from patients with AJCC8 stage III OPSCC (n = 34) enrolled on a randomized trial: pretreatment; during chemoradiation at weeks 2, 4, and 7; and posttreatment. All patients also had dynamic-contrast-enhanced and diffusion-weighted MRI, as well as FDG-PET scans pre-chemoradiation and week 2 during chemoradiation. ctDNA values were analyzed for prediction of freedom from progression (FFP), and correlations with aggressive tumor subvolumes with low blood volume (TVLBV) and low apparent diffusion coefficient (TVLADC), and metabolic tumor volume (MTV) using Cox proportional hazards model and Spearman rank correlation. RESULTS: Low pretreatment ctDNA and an early increase in ctDNA at week 2 compared with baseline were significantly associated with superior FFP (P < 0.02 and P < 0.05, respectively). At week 4 or 7, neither ctDNA counts nor clearance were significantly predictive of progression (P = 0.8). Pretreatment ctDNA values were significantly correlated with nodal TVLBV, TVLADC, and MTV pre-chemoradiation (P < 0.03), while the ctDNA values at week 2 were correlated with these imaging metrics in primary tumor. Multivariate analysis showed that ctDNA and the imaging metrics performed comparably to predict FFP. CONCLUSIONS: Early ctDNA kinetics during definitive chemoradiation may predict therapy response in stage III OPSCC.

Subjects

Alphapapillomavirus , Squamous Cell Carcinoma , Circulating Tumor DNA , Head and Neck Neoplasms , Oropharyngeal Neoplasms , Papillomavirus Infections , Biomarkers , Squamous Cell Carcinoma/diagnostic imaging , Squamous Cell Carcinoma/genetics , Squamous Cell Carcinoma/therapy , Circulating Tumor DNA/genetics , Fluorodeoxyglucose F18 , Humans , Kinetics , Oropharyngeal Neoplasms/diagnostic imaging , Oropharyngeal Neoplasms/genetics , Oropharyngeal Neoplasms/therapy , Papillomaviridae/genetics , Papillomavirus Infections/complications , Papillomavirus Infections/genetics , Prognosis , Retrospective Studies , Squamous Cell Carcinoma of Head and Neck

7.

SquiggleNet: real-time, direct classification of nanopore signals.

Bao, Yuwei; Wadden, Jack; Erb-Downward, John R; Ranjan, Piyush; Zhou, Weichen; McDonald, Torrin L; Mills, Ryan E; Boyle, Alan P; Dickson, Robert P; Blaauw, David; Welch, Joshua D.

Genome Biol ; 22(1): 298, 2021 10 27.

Article in English | MEDLINE | ID: mdl-34706748

ABSTRACT

We present SquiggleNet, the first deep-learning model that can classify nanopore reads directly from their electrical signals. SquiggleNet operates faster than DNA passes through the pore, allowing real-time classification and read ejection. Using 1 s of sequencing data, the classifier achieves significantly higher accuracy than base calling followed by sequence alignment. Our approach is also faster and requires an order of magnitude less memory than alignment-based approaches. SquiggleNet distinguished human from bacterial DNA with over 90% accuracy, generalized to unseen bacterial species in a human respiratory metagenome sample, and accurately classified sequences containing human long interspersed repeat elements.

Subjects

Deep Learning , Nanopore Sequencing/methods , Bacterial DNA/analysis , Humans , Long Interspersed Nucleotide Elements , Metagenome , Respiratory System/microbiology

8.

SearcHPV: A novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer.

Pinatti, Lisa M; Gu, Wenjin; Wang, Yifan; Elhossiny, Ahmed; Bhangale, Apurva D; Brummel, Collin V; Carey, Thomas E; Mills, Ryan E; Brenner, J Chad.

Cancer ; 127(19): 3531-3540, 2021 10 01.

Article in English | MEDLINE | ID: mdl-34160069

ABSTRACT

BACKGROUND: Human papillomavirus (HPV) is a well-established driver of malignant transformation at a number of sites, including head and neck, cervical, vulvar, anorectal, and penile squamous cell carcinomas; however, the impact of HPV integration into the host human genome on this process remains largely unresolved. This is due to the technical challenge of identifying HPV integration sites, which includes limitations of existing informatics approaches to discovering viral-host breakpoints from low-read-coverage sequencing data. METHODS: To overcome this limitation, the authors developed SearcHPV, a new HPV detection pipeline based on targeted capture technology, and applied the algorithm to targeted capture data. They performed an integrated analysis of SearcHPV-defined breakpoints with genome-wide linked-read sequencing to identify potential HPV-related structural variations. RESULTS: Through an analysis of HPV+ models, the authors showed that SearcHPV detected HPV-host integration sites with a higher sensitivity and specificity than 2 other commonly used HPV detection callers. SearcHPV uncovered HPV integration sites adjacent to known cancer-related genes, including TP63, MYC, and TRAF2, and near regions of large structural variation. The authors further validated the junction contig assembly feature of SearcHPV, which helped to accurately identify viral-host junction breakpoint sequences. They found that viral integration occurred through a variety of DNA repair mechanisms, including nonhomologous end joining, alternative end joining, and microhomology-mediated repair. CONCLUSIONS: In summary, SearcHPV is a new optimized tool for the accurate detection of HPV-human integration sites from targeted capture DNA sequencing data.

Subjects

Alphapapillomavirus , Squamous Cell Carcinoma , Papillomavirus Infections , Uterine Cervical Neoplasms , Alphapapillomavirus/genetics , Viral DNA/genetics , Female , Genomics , Humans , Papillomaviridae/genetics , Papillomavirus Infections/complications , Papillomavirus Infections/genetics

9.

Cas9 targeted enrichment of mobile elements using nanopore sequencing.

McDonald, Torrin L; Zhou, Weichen; Castro, Christopher P; Mumm, Camille; Switzenberg, Jessica A; Mills, Ryan E; Boyle, Alan P.

Nat Commun ; 12(1): 3586, 2021 06 11.

Article in English | MEDLINE | ID: mdl-34117247

ABSTRACT

Mobile element insertions (MEIs) are repetitive genomic sequences that contribute to genetic variation and can lead to genetic disorders. Targeted and whole-genome approaches using short-read sequencing have been developed to identify reference and non-reference MEIs; however, the read length hampers detection of these elements in complex genomic regions. Here, we pair Cas9-targeted nanopore sequencing with computational methodologies to capture active MEIs in human genomes. We demonstrate parallel enrichment for distinct classes of MEIs, averaging 44% of reads on-targeted signals and exhibiting a 13.4-54x enrichment over whole-genome approaches. We show an individual flow cell can recover most MEIs (97% L1Hs, 93% AluYb, 51% AluYa, 99% SVA_F, and 65% SVA_E). We identify seventeen non-reference MEIs in GM12878 overlooked by modern, long-read analysis pipelines, primarily in repetitive genomic regions. This work introduces the utility of nanopore sequencing for MEI enrichment and lays the foundation for rapid discovery of elusive, repetitive genetic elements.

Subjects

CRISPR-Cas Systems , Genomics , Interspersed Repetitive Sequences , Nanopore Sequencing/methods , Cell Line , DNA-Binding Proteins , Human Genome , Humans , Nucleic Acid Repetitive Sequences , Ribonucleoproteins/metabolism , DNA Sequence Analysis

10.

Expectations and blind spots for structural variation detection from long-read assemblies and short-read genome sequencing technologies.

Zhao, Xuefang; Collins, Ryan L; Lee, Wan-Ping; Weber, Alexandra M; Jun, Yukyung; Zhu, Qihui; Weisburd, Ben; Huang, Yongqing; Audano, Peter A; Wang, Harold; Walker, Mark; Lowther, Chelsea; Fu, Jack; Gerstein, Mark B; Devine, Scott E; Marschall, Tobias; Korbel, Jan O; Eichler, Evan E; Chaisson, Mark J P; Lee, Charles; Mills, Ryan E; Brand, Harrison; Talkowski, Michael E.

Am J Hum Genet ; 108(5): 919-928, 2021 05 06.

Article in English | MEDLINE | ID: mdl-33789087

ABSTRACT

Virtually all genome sequencing efforts in national biobanks, complex and Mendelian disease programs, and medical genetic initiatives are reliant upon short-read whole-genome sequencing (srWGS), which presents challenges for the detection of structural variants (SVs) relative to emerging long-read WGS (lrWGS) technologies. Given this ubiquity of srWGS in large-scale genomics initiatives, we sought to establish expectations for routine SV detection from this data type by comparison with lrWGS assembly, as well as to quantify the genomic properties and added value of SVs uniquely accessible to each technology. Analyses from the Human Genome Structural Variation Consortium (HGSVC) of three families captured ~11,000 SVs per genome from srWGS and ~25,000 SVs per genome from lrWGS assembly. Detection power and precision for SV discovery varied dramatically by genomic context and variant class: 9.7% of the current GRCh38 reference is defined by segmental duplication (SD) and simple repeat (SR), yet 91.4% of deletions that were specifically discovered by lrWGS localized to these regions. Across the remaining 90.3% of reference sequence, we observed extremely high (93.8%) concordance between technologies for deletions in these datasets. In contrast, lrWGS was superior for detection of insertions across all genomic contexts. Given that non-SD/SR sequences encompass 95.9% of currently annotated disease-associated exons, improved sensitivity from lrWGS to discover novel pathogenic deletions in these currently interpretable genomic regions is likely to be incremental. However, these analyses highlight the considerable added value of assembly-based lrWGS to create new catalogs of insertions and transposable elements, as well as disease-associated repeat expansions in genomic sequences that were previously recalcitrant to routine assessment.

Subjects

Human Genome/genetics , Genomic Structural Variation , Genomics/methods , Goals , Whole Genome Sequencing/methods , Whole Genome Sequencing/standards , DNA Copy Number Variations , Exons/genetics , Humans , Research Design , Genomic Segmental Duplications , Sequence Alignment

11.

Comprehensive identification of somatic nucleotide variants in human brain tissue.

Wang, Yifan; Bae, Taejeong; Thorpe, Jeremy; Sherman, Maxwell A; Jones, Attila G; Cho, Sean; Daily, Kenneth; Dou, Yanmei; Ganz, Javier; Galor, Alon; Lobon, Irene; Pattni, Reenal; Rosenbluh, Chaggai; Tomasi, Simone; Tomasini, Livia; Yang, Xiaoxu; Zhou, Bo; Akbarian, Schahram; Ball, Laurel L; Bizzotto, Sara; Emery, Sarah B; Doan, Ryan; Fasching, Liana; Jang, Yeongjun; Juan, David; Lizano, Esther; Luquette, Lovelace J; Moldovan, John B; Narurkar, Rujuta; Oetjens, Matthew T; Rodin, Rachel E; Sekar, Shobana; Shin, Joo Heon; Soriano, Eduardo; Straub, Richard E; Zhou, Weichen; Chess, Andrew; Gleeson, Joseph G; Marquès-Bonet, Tomas; Park, Peter J; Peters, Mette A; Pevsner, Jonathan; Walsh, Christopher A; Weinberger, Daniel R; Vaccarino, Flora M; Moran, John V; Urban, Alexander E; Kidd, Jeffrey M; Mills, Ryan E; Abyzov, Alexej.

Genome Biol ; 22(1): 92, 2021 03 29.

Article in English | MEDLINE | ID: mdl-33781308

ABSTRACT

BACKGROUND: Post-zygotic mutations incurred during DNA replication, DNA repair, and other cellular processes lead to somatic mosaicism. Somatic mosaicism is an established cause of various diseases, including cancers. However, detecting mosaic variants in DNA from non-cancerous somatic tissues poses significant challenges, particularly if the variants only are present in a small fraction of cells. RESULTS: Here, the Brain Somatic Mosaicism Network conducts a coordinated, multi-institutional study to examine the ability of existing methods to detect simulated somatic single-nucleotide variants (SNVs) in DNA mixing experiments, generate multiple replicates of whole-genome sequencing data from the dorsolateral prefrontal cortex, other brain regions, dura mater, and dural fibroblasts of a single neurotypical individual, devise strategies to discover somatic SNVs, and apply various approaches to validate somatic SNVs. These efforts lead to the identification of 43 bona fide somatic SNVs that range in variant allele fractions from ~ 0.005 to ~ 0.28. Guided by these results, we devise best practices for calling mosaic SNVs from 250× whole-genome sequencing data in the accessible portion of the human genome that achieve 90% specificity and sensitivity. Finally, we demonstrate that analysis of multiple bulk DNA samples from a single individual allows the reconstruction of early developmental cell lineage trees. CONCLUSIONS: This study provides a unified set of best practices to detect somatic SNVs in non-cancerous tissues. The data and methods are freely available to the scientific community and should serve as a guide to assess the contributions of somatic SNVs to neuropsychiatric diseases.

Subjects

Brain/metabolism , Genetic Association Studies , Genetic Variation , Alleles , Chromosome Mapping , Computational Biology/methods , Genetic Association Studies/methods , Genomics/methods , Germ Cells/metabolism , High-Throughput Nucleotide Sequencing , Humans , Organ Specificity/genetics , Single Nucleotide Polymorphism

12.

Genome diversity in Ukraine.

Oleksyk, Taras K; Wolfsberger, Walter W; Weber, Alexandra M; Shchubelka, Khrystyna; Oleksyk, Olga T; Levchuk, Olga; Patrus, Alla; Lazar, Nelya; Castro-Marquez, Stephanie O; Hasynets, Yaroslava; Boldyzhar, Patricia; Neymet, Mikhailo; Urbanovych, Alina; Stakhovska, Viktoriya; Malyar, Kateryna; Chervyakova, Svitlana; Podoroha, Olena; Kovalchuk, Natalia; Rodriguez-Flores, Juan L; Zhou, Weichen; Medley, Sarah; Battistuzzi, Fabia; Liu, Ryan; Hou, Yong; Chen, Siru; Yang, Huanming; Yeager, Meredith; Dean, Michael; Mills, Ryan E; Smolanka, Volodymyr.

Gigascience ; 10(1), 2021 01 13.

Article in English | MEDLINE | ID: mdl-33438729

ABSTRACT

BACKGROUND: The main goal of this collaborative effort is to provide genome-wide data for the previously underrepresented population in Eastern Europe, and to provide cross-validation of the data from genome sequences and genotypes of the same individuals acquired by different technologies. We collected 97 genome-grade DNA samples from individuals representing major regions of Ukraine who consented to public data release. BGISEQ-500 sequence data and genotypes by an Illumina GWAS chip were cross-validated on multiple samples and additionally referenced to 1 sample that has been resequenced by Illumina NovaSeq6000 S4 at high coverage. RESULTS: The genome data have been searched for genomic variation represented in this population, and a number of variants have been reported: large structural variants, indels, copy number variations, single-nucleotide polymorphisms, and microsatellites. To our knowledge, this study provides the largest to-date survey of genetic variation in Ukraine, creating a public reference resource aiming to provide data for medical research in a large understudied population. CONCLUSIONS: Our results indicate that the genetic diversity of the Ukrainian population is uniquely shaped by evolutionary and demographic forces and cannot be ignored in future genetic and biomedical studies. These data will contribute a wealth of new information, bringing forth many novel, endemic and medically related alleles.

Subjects

DNA Copy Number Variations , Single Nucleotide Polymorphism , Genome , Genomics , Humans , Ukraine

13.

Association of CNVs with methylation variation.

Shi, Xinghua; Radhakrishnan, Saranya; Wen, Jia; Chen, Jin Yun; Chen, Junjie; Lam, Brianna Ashlyn; Mills, Ryan E; Stranger, Barbara E; Lee, Charles; Setlur, Sunita R.

NPJ Genom Med ; 5: 41, 2020.

Article in English | MEDLINE | ID: mdl-33062306

ABSTRACT

Germline copy number variants (CNVs) and single-nucleotide polymorphisms (SNPs) form the basis of inter-individual genetic variation. Although the phenotypic effects of SNPs have been extensively investigated, the effects of CNVs are relatively less understood. To better characterize mechanisms by which CNVs affect cellular phenotype, we tested their association with variable CpG methylation in a genome-wide manner. Using paired CNV and methylation data from the 1000 genomes and HapMap projects, we identified genome-wide associations by methylation quantitative trait locus (mQTL) analysis. We found individual CNVs being associated with methylation of multiple CpGs and vice versa. CNV-associated methylation changes were correlated with gene expression. CNV-mQTLs were enriched for regulatory regions, transcription factor-binding sites (TFBSs), and were involved in long-range physical interactions with associated CpGs. Some CNV-mQTLs were associated with methylation of imprinted genes. Several CNV-mQTLs and/or associated genes were among those previously reported by genome-wide association studies (GWASs). We demonstrate that germline CNVs in the genome are associated with CpG methylation. Our findings suggest that structural variation together with methylation may affect cellular phenotype.

14.

Author Correction: A robust benchmark for detection of germline large deletions and insertions.

Zook, Justin M; Hansen, Nancy F; Olson, Nathan D; Chapman, Lesley; Mullikin, James C; Xiao, Chunlin; Sherry, Stephen; Koren, Sergey; Phillippy, Adam M; Boutros, Paul C; Sahraeian, Sayed Mohammad E; Huang, Vincent; Rouette, Alexandre; Alexander, Noah; Mason, Christopher E; Hajirasouliha, Iman; Ricketts, Camir; Lee, Joyce; Tearle, Rick; Fiddes, Ian T; Barrio, Alvaro Martinez; Wala, Jeremiah; Carroll, Andrew; Ghaffari, Noushin; Rodriguez, Oscar L; Bashir, Ali; Jackman, Shaun; Farrell, John J; Wenger, Aaron M; Alkan, Can; Soylev, Arda; Schatz, Michael C; Garg, Shilpa; Church, George; Marschall, Tobias; Chen, Ken; Fan, Xian; English, Adam C; Rosenfeld, Jeffrey A; Zhou, Weichen; Mills, Ryan E; Sage, Jay M; Davis, Jennifer R; Kaiser, Michael D; Oliver, John S; Catalano, Anthony P; Chaisson, Mark J P; Spies, Noah; Sedlazeck, Fritz J; Salit, Marc.

Nat Biotechnol ; 38(11): 1357, 2020 Nov.

Article in English | MEDLINE | ID: mdl-32699374

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

15.

A robust benchmark for detection of germline large deletions and insertions.

Zook, Justin M; Hansen, Nancy F; Olson, Nathan D; Chapman, Lesley; Mullikin, James C; Xiao, Chunlin; Sherry, Stephen; Koren, Sergey; Phillippy, Adam M; Boutros, Paul C; Sahraeian, Sayed Mohammad E; Huang, Vincent; Rouette, Alexandre; Alexander, Noah; Mason, Christopher E; Hajirasouliha, Iman; Ricketts, Camir; Lee, Joyce; Tearle, Rick; Fiddes, Ian T; Barrio, Alvaro Martinez; Wala, Jeremiah; Carroll, Andrew; Ghaffari, Noushin; Rodriguez, Oscar L; Bashir, Ali; Jackman, Shaun; Farrell, John J; Wenger, Aaron M; Alkan, Can; Soylev, Arda; Schatz, Michael C; Garg, Shilpa; Church, George; Marschall, Tobias; Chen, Ken; Fan, Xian; English, Adam C; Rosenfeld, Jeffrey A; Zhou, Weichen; Mills, Ryan E; Sage, Jay M; Davis, Jennifer R; Kaiser, Michael D; Oliver, John S; Catalano, Anthony P; Chaisson, Mark J P; Spies, Noah; Sedlazeck, Fritz J; Salit, Marc.

Nat Biotechnol ; 38(11): 1347-1355, 2020 11.

Article in English | MEDLINE | ID: mdl-32541955

ABSTRACT

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.

Subjects

Germ-Line Mutation/genetics , INDEL Mutation/genetics , Diploidy , Genomic Structural Variation , Humans , Molecular Sequence Annotation , DNA Sequence Analysis

16.

Characterization of nuclear mitochondrial insertions in the whole genomes of primates.

Dayama, Gargi; Zhou, Weichen; Prado-Martinez, Javier; Marques-Bonet, Tomas; Mills, Ryan E.

NAR Genom Bioinform ; 2(4): lqaa089, 2020 Dec.

Article in English | MEDLINE | ID: mdl-33575633

ABSTRACT

The transfer and integration of whole and partial mitochondrial genomes into the nuclear genomes of eukaryotes is an ongoing process that has facilitated the transfer of genes and contributed to the evolution of various cellular pathways. Many previous studies have explored the impact of these insertions, referred to as NumtS, but have focused primarily on older events that have become fixed and are therefore present in all individual genomes for a given species. We previously developed an approach to identify novel Numt polymorphisms from next-generation sequence data and applied it to thousands of human genomes. Here, we extend this analysis to 79 individuals of other great ape species including chimpanzee, bonobo, gorilla, orang-utan and also an old world monkey, macaque. We show that recent Numt insertions are prevalent in each species though at different apparent rates, with chimpanzees exhibiting a significant increase in both polymorphic and fixed Numt sequences as compared to other great apes. We further assessed positional effects in each species in terms of evolutionary time and rate of insertion and identified putative hotspots on chromosome 5 for Numt integration, providing insight into both recent polymorphic and older fixed reference NumtS in great apes in comparison to human events.

17.

Identification and characterization of occult human-specific LINE-1 insertions using long-read sequencing technology.

Zhou, Weichen; Emery, Sarah B; Flasch, Diane A; Wang, Yifan; Kwan, Kenneth Y; Kidd, Jeffrey M; Moran, John V; Mills, Ryan E.

Nucleic Acids Res ; 48(3): 1146-1163, 2020 02 20.

Article in English | MEDLINE | ID: mdl-31853540

ABSTRACT

Long Interspersed Element-1 (LINE-1) retrotransposition contributes to inter- and intra-individual genetic variation and occasionally can lead to human genetic disorders. Various strategies have been developed to identify human-specific LINE-1 (L1Hs) insertions from short-read whole genome sequencing (WGS) data; however, they have limitations in detecting insertions in complex repetitive genomic regions. Here, we developed a computational tool (PALMER) and used it to identify 203 non-reference L1Hs insertions in the NA12878 benchmark genome. Using PacBio long-read sequencing data, we identified L1Hs insertions that were absent in previous short-read studies (90/203). Approximately 81% (73/90) of the L1Hs insertions reside within endogenous LINE-1 sequences in the reference assembly and the analysis of unique breakpoint junction sequences revealed 63% (57/90) of these L1Hs insertions could be genotyped in 1000 Genomes Project sequences. Moreover, we observed that amplification biases encountered in single-cell WGS experiments led to a wide variation in L1Hs insertion detection rates between four individual NA12878 cells; under-amplification limited detection to 32% (65/203) of insertions, whereas over-amplification increased false positive calls. In sum, these data indicate that L1Hs insertions are often missed using standard short-read sequencing approaches and long-read sequencing approaches can significantly improve the detection of L1Hs insertions present in individual genomes.
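The percentages quoted in this abstract follow directly from the raw counts. The Python sketch below is a minimal recheck under stated assumptions: the labels and the helper `percent` are hypothetical, the counts are copied from the abstract, and the share of insertions missed by short reads (90/203) is a derived figure that the abstract gives only as a count.

```python
# Counts copied from the abstract; names are illustrative, not part of PALMER.
total_l1hs = 203            # non-reference L1Hs insertions found in NA12878
missed_by_short_read = 90   # insertions absent from previous short-read studies

fractions = {
    "missed by short reads": (missed_by_short_read, total_l1hs),
    "within reference LINE-1": (73, missed_by_short_read),
    "genotypable in 1000 Genomes": (57, missed_by_short_read),
    "detected when under-amplified": (65, total_l1hs),
}


def percent(numerator: int, denominator: int) -> int:
    """Return the ratio as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


percentages = {label: percent(n, d) for label, (n, d) in fractions.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```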

Subjects

Long Interspersed Nucleotide Elements ; Sequence Analysis, DNA/methods ; Cell Line ; Genome, Human ; Humans ; Polymorphism, Genetic ; Single-Cell Analysis ; Software ; Whole Genome Sequencing

18.

Structural variation in the sequencing era.

Ho, Steve S; Urban, Alexander E; Mills, Ryan E.

Nat Rev Genet ; 21(3): 171-189, 2020 03.

Article in English | MEDLINE | ID: mdl-31729472

ABSTRACT

Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult due to limitations inherent to available genome technologies. Detection methods that use ensemble algorithms and emerging sequencing technologies have enabled the discovery of thousands of SVs, uncovering information about their ubiquity, relationship to disease and possible effects on biological mechanisms. Given the variability in SV type and size, along with unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. Here, we review modern approaches for investigating SVs and proffer that, moving forwards, studies integrating biological information with detection will be necessary to comprehensively understand the impact of SV in the human genome.

Subjects

Genomic Structural Variation ; Sequence Analysis/methods ; Algorithms ; Genome, Human ; Humans

19.

Prognostic model for multiple myeloma progression integrating gene expression and clinical features.

Sun, Chen; Li, Hongyang; Mills, Ryan E; Guan, Yuanfang.

Gigascience ; 8(12), 2019 12 01.

Article in English | MEDLINE | ID: mdl-31886876

ABSTRACT

BACKGROUND: Multiple myeloma (MM) is a hematological cancer caused by abnormal accumulation of monoclonal plasma cells in bone marrow. With the increase in treatment options, risk-adapted therapy is becoming more and more important. Survival analysis is commonly applied to study progression or other events of interest and stratify the risk of patients. RESULTS: In this study, we present the current state-of-the-art model for MM prognosis and the molecular biomarker set for stratification: the winning algorithm in the 2017 Multiple Myeloma DREAM Challenge, Sub-Challenge 3. Specifically, we built a non-parametric complete hazard ranking model to map the right-censored data into a linear space, where commonplace machine learning techniques, such as Gaussian process regression and random forests, can play their roles. Our model integrated both the gene expression profile and clinical features to predict the progression of MM. Compared with conventional models, such as Cox model and random survival forests, our model achieved higher accuracy in 3 within-cohort predictions. In addition, it showed robust predictive power in cross-cohort validations. Key molecular signatures related to MM progression were identified from our model, which may function as the core determinants of MM progression and provide important guidance for future research and clinical practice. Functional enrichment analysis and mammalian gene-gene interaction network revealed crucial biological processes and pathways involved in MM progression. The model is dockerized and publicly available at https://www.synapse.org/#!Synapse:syn11459638. Both data and reproducible code are included in the docker. CONCLUSIONS: We present the current state-of-the-art prognostic model for MM integrating gene expression and clinical features validated in an independent test set.

Subjects

Gene Expression Profiling/methods ; Gene Regulatory Networks ; Multiple Myeloma/genetics ; Multiple Myeloma/mortality ; Aged ; Algorithms ; Cohort Studies ; Disease Progression ; Female ; Gene Expression Regulation, Neoplastic ; Humans ; Machine Learning ; Male ; Middle Aged ; Models, Statistical ; Prognosis ; Survival Analysis

20.

RNA ligation precedes the retrotransposition of U6/LINE-1 chimeric RNA.

Moldovan, John B; Wang, Yifan; Shuman, Stewart; Mills, Ryan E; Moran, John V.

Proc Natl Acad Sci U S A ; 116(41): 20612-20622, 2019 10 08.

Article in English | MEDLINE | ID: mdl-31548405

ABSTRACT

Long interspersed element-1 (LINE-1 or L1) amplifies via retrotransposition. Active L1s encode 2 proteins (ORF1p and ORF2p) that bind their encoding transcript to promote retrotransposition in cis. The L1-encoded proteins also promote the retrotransposition of small-interspersed element RNAs, noncoding RNAs, and messenger RNAs in trans. Some L1-mediated retrotransposition events consist of a copy of U6 RNA conjoined to a variably 5'-truncated L1, but how U6/L1 chimeras are formed requires elucidation. Here, we report the following: The RNA ligase RtcB can join U6 RNAs ending in a 2',3'-cyclic phosphate to L1 RNAs containing a 5'-OH in vitro; depletion of endogenous RtcB in HeLa cell extracts reduces U6/L1 RNA ligation efficiency; retrotransposition of U6/L1 RNAs leads to U6/L1 pseudogene formation; and a unique cohort of U6/L1 chimeric RNAs are present in multiple human cell lines. Thus, these data suggest that U6 small nuclear RNA (snRNA) and RtcB participate in the formation of chimeric RNAs and that retrotransposition of chimeric RNA contributes to interindividual genetic variation.

Subjects

Embryonic Stem Cells/metabolism ; Long Interspersed Nucleotide Elements/genetics ; Neoplasms/genetics ; Neural Stem Cells/metabolism ; RNA, Small Nuclear/genetics ; RNA/genetics ; Retroelements/genetics ; HeLa Cells ; Humans ; Pseudogenes ; RNA/chemistry ; RNA, Messenger/chemistry ; RNA, Messenger/genetics ; RNA, Small Nuclear/chemistry
