Search | BVS (Virtual Health Library) Infectious and Parasitic Diseases

1.

Major Impacts of Widespread Structural Variation on Gene Expression and Crop Improvement in Tomato.

Alonge, Michael; Wang, Xingang; Benoit, Matthias; Soyk, Sebastian; Pereira, Lara; Zhang, Lei; Suresh, Hamsini; Ramakrishnan, Srividya; Maumus, Florian; Ciren, Danielle; Levy, Yuval; Harel, Tom Hai; Shalev-Schlosser, Gili; Amsellem, Ziva; Razifard, Hamid; Caicedo, Ana L; Tieman, Denise M; Klee, Harry; Kirsche, Melanie; Aganezov, Sergey; Ranallo-Benavidez, T Rhyker; Lemmon, Zachary H; Kim, Jennifer; Robitaille, Gina; Kramer, Melissa; Goodwin, Sara; McCombie, W Richard; Hutton, Samuel; Van Eck, Joyce; Gillis, Jesse; Eshed, Yuval; Sedlazeck, Fritz J; van der Knaap, Esther; Schatz, Michael C; Lippman, Zachary B.

Cell ; 182(1): 145-161.e23, 2020 07 09.

Article in English | MEDLINE | ID: mdl-32553272

ABSTRACT

Structural variants (SVs) underlie important crop improvement and domestication traits. However, resolving the extent, diversity, and quantitative impact of SVs has been challenging. We used long-read nanopore sequencing to capture 238,490 SVs in 100 diverse tomato lines. This panSV genome, along with 14 new reference assemblies, revealed large-scale intermixing of diverse genotypes, as well as thousands of SVs intersecting genes and cis-regulatory regions. Hundreds of SV-gene pairs exhibit subtle and significant expression changes, which could broadly influence quantitative trait variation. By combining quantitative genetics with genome editing, we show how multiple SVs that changed gene dosage and expression levels modified fruit flavor, size, and production. In the last example, higher order epistasis among four SVs affecting three related transcription factors allowed introduction of an important harvesting trait in modern tomato. Our findings highlight the underexplored role of SVs in genotype-to-phenotype relationships and their widespread importance and utility in crop improvement.

Subjects

Crops, Agricultural/genetics, Gene Expression Regulation, Plant, Genomic Structural Variation, Solanum lycopersicum/genetics, Alleles, Cytochrome P-450 Enzyme System/genetics, Ecotype, Epistasis, Genetic, Fruit/genetics, Gene Duplication, Genome, Plant, Genotype, Inbreeding, Molecular Sequence Annotation, Phenotype, Plant Breeding, Quantitative Trait Loci/genetics

2.

High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation.

Gustafson, Jonas A; Gibson, Sophia B; Damaraju, Nikhita; Zalusky, Miranda P G; Hoekzema, Kendra; Twesigomwe, David; Yang, Lei; Snead, Anthony A; Richmond, Phillip A; De Coster, Wouter; Olson, Nathan D; Guarracino, Andrea; Li, Qiuhui; Miller, Angela L; Goffena, Joy; Anderson, Zachary B; Storz, Sophie H R; Ward, Sydney A; Sinha, Maisha; Gonzaga-Jauregui, Claudia; Clarke, Wayne E; Basile, Anna O; Corvelo, André; Reeves, Catherine; Helland, Adrienne; Musunuri, Rajeeva Lochan; Revsine, Mahler; Patterson, Karynne E; Paschal, Cate R; Zakarian, Christina; Goodwin, Sara; Jensen, Tanner D; Robb, Esther; McCombie, W Richard; Sedlazeck, Fritz J; Zook, Justin M; Montgomery, Stephen B; Garrison, Erik; Kolmogorov, Mikhail; Schatz, Michael C; McLaughlin, Richard N; Dashnow, Harriet; Zody, Michael C; Loose, Matt; Jain, Miten; Eichler, Evan E; Miller, Danny E.

Genome Res ; 2024 Oct 30.

Article in English | MEDLINE | ID: mdl-39358015

ABSTRACT

Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project (1KGP) Oxford Nanopore Technologies Sequencing Consortium aims to generate LRS data from at least 800 of the 1KGP samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37× and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.

3.

Familial long-read sequencing increases yield of de novo mutations.

Noyes, Michelle D; Harvey, William T; Porubsky, David; Sulovari, Arvis; Li, Ruiyang; Rose, Nicholas R; Audano, Peter A; Munson, Katherine M; Lewis, Alexandra P; Hoekzema, Kendra; Mantere, Tuomo; Graves-Lindsay, Tina A; Sanders, Ashley D; Goodwin, Sara; Kramer, Melissa; Mokrab, Younes; Zody, Michael C; Hoischen, Alexander; Korbel, Jan O; McCombie, W Richard; Eichler, Evan E.

Am J Hum Genet ; 109(4): 631-646, 2022 04 07.

Article in English | MEDLINE | ID: mdl-35290762

ABSTRACT

Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.

Subjects

Genomics, High-Throughput Nucleotide Sequencing, Female, Humans, Mutation/genetics, Nucleotides, Sequence Analysis, DNA, Software

4.

Comprehensive analysis of structural variants in breast cancer genomes using single-molecule sequencing.

Aganezov, Sergey; Goodwin, Sara; Sherman, Rachel M; Sedlazeck, Fritz J; Arun, Gayatri; Bhatia, Sonam; Lee, Isac; Kirsche, Melanie; Wappel, Robert; Kramer, Melissa; Kostroff, Karen; Spector, David L; Timp, Winston; McCombie, W Richard; Schatz, Michael C.

Genome Res ; 30(9): 1258-1273, 2020 09.

Article in English | MEDLINE | ID: mdl-32887686

ABSTRACT

Improved identification of structural variants (SVs) in cancer can lead to more targeted and effective treatment options as well as advance our basic understanding of the disease and its progression. We performed whole-genome sequencing of the SKBR3 breast cancer cell line and patient-derived tumor and normal organoids from two breast cancer patients using Illumina/10x Genomics, Pacific Biosciences (PacBio), and Oxford Nanopore Technologies (ONT) sequencing. We then inferred SVs and large-scale allele-specific copy number variants (CNVs) using an ensemble of methods. Our findings show that long-read sequencing allows for substantially more accurate and sensitive SV detection, with between 90% and 95% of variants supported by each long-read technology also supported by the other. We also report high accuracy for long reads even at relatively low coverage (25×-30×). Furthermore, we integrated SV and CNV data into a unifying karyotype-graph structure to present a more accurate representation of the mutated cancer genomes. We find hundreds of variants within known cancer-related genes detectable only through long-read sequencing. These findings highlight the need for long-read sequencing of cancer genomes for the precise analysis of their genetic instability.

Subjects

Breast Neoplasms/genetics, Genomic Structural Variation, Whole Genome Sequencing/methods, Cell Line, Tumor, DNA Copy Number Variations, DNA Methylation, DNA, Neoplasm, Female, Humans, Nanopores, Organoids, RNA-Seq

5.

High resolution copy number inference in cancer using short-molecule nanopore sequencing.

Baslan, Timour; Kovaka, Sam; Sedlazeck, Fritz J; Zhang, Yanming; Wappel, Robert; Tian, Sha; Lowe, Scott W; Goodwin, Sara; Schatz, Michael C.

Nucleic Acids Res ; 49(21): e124, 2021 12 02.

Article in English | MEDLINE | ID: mdl-34551429

ABSTRACT

Genome copy number is an important source of genetic variation in health and disease. In cancer, Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore sequencing devices are limited in the number of retrievable sequencing reads/molecules compared to short-read sequencing platforms, limiting CNA inference accuracy. To address this limitation, we targeted the sequencing of short-length DNA molecules loaded at optimized concentration in an effort to increase sequence read/molecule yield from a single nanopore run. We show that short-molecule nanopore sequencing reproducibly returns high read counts and allows high-quality CNA inference. We demonstrate the clinical relevance of this approach by accurately inferring CNAs in acute myeloid leukemia samples. The data show that, compared to traditional approaches such as chromosome analysis/cytogenetics, short-molecule nanopore sequencing returns more sensitive, accurate copy number information in a cost-effective and expeditious manner, including for multiplex samples. Our results provide a framework for short-molecule nanopore sequencing with applications in research and medicine, including, but not limited to, CNAs.

Subjects

DNA Copy Number Variations, DNA/analysis, Medical Oncology/methods, Nanopore Sequencing/methods, Neoplasms/genetics, Cell Line, Tumor, Humans
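
The abstract for entry 5 ties CNA accuracy to the number of reads recovered per run. As a rough illustration of why read count matters, the sketch below shows a generic read-depth calculation: reads are counted in fixed genomic bins, each bin is normalized by the sample total, and a log2 ratio against a reference sample is taken. This is not the authors' pipeline; the function name `log2_ratios`, the `pseudocount` parameter, and the toy counts are assumptions made for illustration only.

```python
import math


def log2_ratios(sample_counts, reference_counts, pseudocount=0.5):
    """Per-bin log2 copy-number ratios from binned read counts.

    Hypothetical helper: each bin's count is divided by the total count of its
    sample, so the two samples may differ in sequencing depth.
    """
    if len(sample_counts) != len(reference_counts):
        raise ValueError("both samples must use the same bins")
    sample_total = sum(sample_counts)
    reference_total = sum(reference_counts)
    if sample_total == 0 or reference_total == 0:
        raise ValueError("each sample needs at least one read")

    ratios = []
    for s, r in zip(sample_counts, reference_counts):
        # The pseudocount keeps empty bins from producing log2(0).
        s_frac = (s + pseudocount) / sample_total
        r_frac = (r + pseudocount) / reference_total
        ratios.append(math.log2(s_frac / r_frac))
    return ratios


# Toy data (invented): the third bin carries twice the reads of its neighbours.
toy_sample = [100, 100, 200, 100]
toy_reference = [100, 100, 100, 100]
for bin_index, ratio in enumerate(log2_ratios(toy_sample, toy_reference)):
    print(bin_index, round(ratio, 2))
```

Because read counts per bin behave roughly like Poisson counts, the relative noise of each bin shrinks as the count grows, which is one way to read the abstract's emphasis on maximizing read yield.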

6.

Oral famotidine versus placebo in non-hospitalised patients with COVID-19: a randomised, double-blind, data-intense, phase 2 clinical trial.

Brennan, Christina M; Nadella, Sandeep; Zhao, Xiang; Dima, Richard J; Jordan-Martin, Nicole; Demestichas, Breanna R; Kleeman, Sam O; Ferrer, Miriam; von Gablenz, Eva Carlotta; Mourikis, Nicholas; Rubin, Michael E; Adnani, Harsha; Lee, Hassal; Ha, Taehoon; Prum, Soma; Schleicher, Cheryl B; Fox, Sharon S; Ryan, Michael G; Pili, Christina; Goldberg, Gary; Crawford, James M; Goodwin, Sara; Zhang, Xiaoyue; Preall, Jonathan B; Costa, Ana S H; Conigliaro, Joseph; Masci, Joseph R; Yang, Jie; Tuveson, David A; Tracey, Kevin J; Janowitz, Tobias.

Gut ; 71(5): 879-888, 2022 05.

Article in English | MEDLINE | ID: mdl-35144974

ABSTRACT

OBJECTIVE: We assessed whether famotidine improved inflammation and symptomatic recovery in outpatients with mild to moderate COVID-19. DESIGN: Randomised, double-blind, placebo-controlled, fully remote, phase 2 clinical trial (NCT04724720) enrolling symptomatic unvaccinated adult outpatients with confirmed COVID-19 between January 2021 and April 2021 from two US centres. Patients self-administered 80 mg famotidine (n=28) or placebo (n=27) orally three times a day for 14 consecutive days. Endpoints were time to (primary) or rate of (secondary) symptom resolution, and resolution of inflammation (exploratory). RESULTS: Of 55 patients in the intention-to-treat group (median age 35 years (IQR: 20); 35 women (64%); 18 African American (33%); 14 Hispanic (26%)), 52 (95%) completed the trial, submitting 1358 electronic symptom surveys. Time to symptom resolution was not statistically improved (p=0.4). Rate of symptom resolution was improved for patients taking famotidine (p<0.0001). Estimated 50% reduction of overall baseline symptom scores were achieved at 8.2 days (95% CI: 7 to 9.8 days) for famotidine and 11.4 days (95% CI: 10.3 to 12.6 days) for placebo treated patients. Differences were independent of patient sex, race or ethnicity. Five self-limiting adverse events occurred (famotidine, n=2 (40%); placebo, n=3 (60%)). On day 7, fewer patients on famotidine had detectable interferon alpha plasma levels (p=0.04). Plasma immunoglobulin type G levels to SARS-CoV-2 nucleocapsid core protein were similar between both arms. CONCLUSIONS: Famotidine was safe and well tolerated in outpatients with mild to moderate COVID-19. Famotidine led to earlier resolution of symptoms and inflammation without reducing anti-SARS-CoV-2 immunity. Additional randomised trials are required.

Subjects

COVID-19 Drug Treatment, Famotidine, Adult, Double-Blind Method, Famotidine/therapeutic use, Female, Humans, Inflammation, SARS-CoV-2, Treatment Outcome

7.

Coming of age: ten years of next-generation sequencing technologies.

Goodwin, Sara; McPherson, John D; McCombie, W Richard.

Nat Rev Genet ; 17(6): 333-51, 2016 05 17.

Article in English | MEDLINE | ID: mdl-27184599

ABSTRACT

Since the completion of the human genome project in 2003, extraordinary progress has been made in genome sequencing technologies, which has led to a decreased cost per megabase and an increase in the number and diversity of sequenced genomes. An astonishing complexity of genome architecture has been revealed, bringing these sequencing technologies to even greater advancements. Some approaches maximize the number of bases sequenced in the least amount of time, generating a wealth of data that can be used to understand increasingly complex phenotypes. Alternatively, other approaches now aim to sequence longer contiguous pieces of DNA, which are essential for resolving structurally complex regions. These and other strategies are providing researchers and clinicians a variety of tools to probe genomes in greater depth, leading to an enhanced understanding of how genome sequence variants underlie phenotype and disease.

Subjects

Genetic Variation/genetics, Genome, Human, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, Humans, Phenotype

8.

A comparative transcriptional landscape of maize and sorghum obtained by single-molecule sequencing.

Wang, Bo; Regulski, Michael; Tseng, Elizabeth; Olson, Andrew; Goodwin, Sara; McCombie, W Richard; Ware, Doreen.

Genome Res ; 28(6): 921-932, 2018 06.

Article in English | MEDLINE | ID: mdl-29712755

ABSTRACT

Maize and sorghum are both important crops with similar overall plant architectures, but they have key differences, especially in regard to their inflorescences. To better understand these two organisms at the molecular level, we compared expression profiles of both protein-coding and noncoding transcripts in 11 matched tissues using single-molecule, long-read, deep RNA sequencing. This comparative analysis revealed large numbers of novel isoforms in both species. Evolutionarily young genes were likely to be generated in reproductive tissues and usually had fewer isoforms than old genes. We also observed similarities and differences in alternative splicing patterns and activities, both among tissues and between species. The maize subgenomes exhibited no bias in isoform generation; however, genes in the B genome were more highly expressed in pollen tissue, whereas genes in the A genome were more highly expressed in endosperm. We also identified a number of splicing events conserved between maize and sorghum. In addition, we generated comprehensive and high-resolution maps of poly(A) sites, revealing similarities and differences in mRNA cleavage between the two species. Overall, our results reveal considerable splicing and expression diversity between sorghum and maize, well beyond what was reported in previous studies, likely reflecting the differences in architecture between these two species.

Subjects

Alternative Splicing/genetics, Sorghum/genetics, Zea mays/genetics, Endosperm/genetics, Endosperm/growth & development, Gene Expression Regulation, Plant, Genome, Plant/genetics

9.

Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line.

Nattestad, Maria; Goodwin, Sara; Ng, Karen; Baslan, Timour; Sedlazeck, Fritz J; Rescheneder, Philipp; Garvin, Tyler; Fang, Han; Gurtowski, James; Hutton, Elizabeth; Tseng, Elizabeth; Chin, Chen-Shan; Beck, Timothy; Sundaravadanam, Yogi; Kramer, Melissa; Antoniou, Eric; McPherson, John D; Hicks, James; McCombie, W Richard; Schatz, Michael C.

Genome Res ; 28(8): 1126-1135, 2018 08.

Article in English | MEDLINE | ID: mdl-29954844

ABSTRACT

The SK-BR-3 cell line is one of the most important models for HER2+ breast cancers, which affect one in five breast cancer patients. SK-BR-3 is known to be highly rearranged, although much of the variation is in complex and repetitive regions that may be underreported. Addressing this, we sequenced SK-BR-3 using long-read single molecule sequencing from Pacific Biosciences and develop one of the most detailed maps of structural variations (SVs) in a cancer genome available, with nearly 20,000 variants present, most of which were missed by short-read sequencing. Surrounding the important ERBB2 oncogene (also known as HER2), we discover a complex sequence of nested duplications and translocations, suggesting a punctuated progression. Full-length transcriptome sequencing further revealed several novel gene fusions within the nested genomic variants. Combining long-read genome and transcriptome sequencing enables an in-depth analysis of how SVs disrupt the genome and sheds new light on the complex mechanisms involved in cancer genome evolution.

Subjects

Breast Neoplasms/genetics, Gene Amplification/genetics, Gene Rearrangement/genetics, Oncogenes/genetics, Breast Neoplasms/pathology, Female, Genome, Human, Genomic Structural Variation, High-Throughput Nucleotide Sequencing, Humans, MCF-7 Cells, Receptor, ErbB-2/genetics, Repetitive Sequences, Nucleic Acid/genetics, Transcriptome/genetics

10.

Oxford Nanopore sequencing, hybrid error correction, and de novo assembly of a eukaryotic genome.

Goodwin, Sara; Gurtowski, James; Ethe-Sayers, Scott; Deshpande, Panchajanya; Schatz, Michael C; McCombie, W Richard.

Genome Res ; 25(11): 1750-6, 2015 Nov.

Article in English | MEDLINE | ID: mdl-26447147

ABSTRACT

Monitoring the progress of DNA molecules through a membrane pore has been postulated as a method for sequencing DNA for several decades. Recently, a nanopore-based sequencing instrument, the Oxford Nanopore MinION, has become available, and we used this for sequencing the Saccharomyces cerevisiae genome. To make use of these data, we developed a novel open-source hybrid error correction algorithm Nanocorr specifically for Oxford Nanopore reads, because existing packages were incapable of assembling the long read lengths (5-50 kbp) at such high error rates (between ~5% and 40% error). With this new method, we were able to perform a hybrid error correction of the nanopore reads using complementary MiSeq data and produce a de novo assembly that is highly contiguous and accurate: The contig N50 length is more than ten times greater than an Illumina-only assembly (678 kbp versus 59.9 kbp) and has >99.88% consensus identity when compared to the reference. Furthermore, the assembly with the long nanopore reads presents a much more complete representation of the features of the genome and correctly assembles gene cassettes, rRNAs, transposable elements, and other genomic features that were almost entirely absent in the Illumina-only assembly.

Subjects

DNA, Fungal/isolation & purification, Nanopores, Saccharomyces cerevisiae/genetics, Sequence Analysis, DNA/methods, DNA Transposable Elements, DNA, Fungal/genetics, Escherichia coli/genetics, Genomics, Sequence Alignment
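
Several abstracts in this listing report an N50 (contig N50 in entry 10, read N50 in entries 2 and 16). For readers unfamiliar with the statistic, here is a minimal sketch of the standard definition: sort the lengths from longest to shortest and report the length at which the running sum first reaches half of the total. The function name `n50` and the toy lengths are illustrative assumptions, not values from any of the cited studies.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (invented): total is 1500, so half is 750.
# 500 + 400 = 900 is the first running sum to reach 750.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))  # 400
```

A higher N50 means more of the assembly (or read set) sits in long sequences, which is why the abstracts use it as a summary of contiguity.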

11.

Next-generation sequencing as input for chemometrics in differential sensing routines.

Goodwin, Sara; Gade, Alexandra M; Byrom, Michelle; Herrera, Baine; Spears, Camille; Anslyn, Eric V; Ellington, Andrew D.

Angew Chem Int Ed Engl ; 54(21): 6339-42, 2015 May 18.

Article in English | MEDLINE | ID: mdl-25826754

ABSTRACT

Differential sensing (DS) methods traditionally use spatially arrayed receptors and optical signals to create score plots from multivariate data which classify individual analytes or complex mixtures. Herein, a new approach is described, in which nucleic acid sequences and sequence counts are used as the multivariate data without the necessity of a spatial array. To demonstrate this approach to DS, previously selected aptamers, identified from the literature, were used as semi-specific receptors, Next-Gen DNA sequencing was used to generate data, and cell line differentiation was the test-bed application. The study of a principal component analysis loading plot revealed cross-reactivity between the aptamers. The technique generates high-dimensionality score plots, and should be applicable to any mixture of complex and subtly different analytes for which nucleic acid-based receptors exist.

Subjects

Aptamers, Nucleotide/chemistry, High-Throughput Nucleotide Sequencing/methods, Sequence Analysis, DNA/methods, Cell Line, Humans, Multivariate Analysis, Principal Component Analysis

12.

Direct sequencing of insect symbionts via nanopore adaptive sampling.

Badger, Jonathan H; Giordano, Rosanna; Zimin, Aleksey; Wappel, Robert; Eskipehlivan, Senem M; Muller, Stephanie; Donthu, Ravikiran; Soto-Adames, Felipe; Vieira, Paulo; Zasada, Inga; Goodwin, Sara.

Curr Opin Insect Sci ; 61: 101135, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37926187

ABSTRACT

Insect symbionts can alter their host phenotype and their effects can range from beneficial to pathogenic. Moreover, many insects exhibit co-infections, making their study more challenging. Less than 1% of insect species have high-quality referenced genomes available and fewer still also have their symbionts sequenced. Two methods are commonly used to sequence symbionts: whole-genome sequencing to concomitantly capture the host and bacterial genomes, or isolation of the symbiont's genome before sequencing. These methods are limited when dealing with rare or poorly characterized symbionts. Long-read technology is an important tool to generate high-quality genomes as they can overcome high levels of heterozygosity, repeat content, and transposable elements that confound short-read methods. Oxford Nanopore (ONT) adaptive sampling allows a sequencing instrument to select or reject sequences in real time. We describe a method based on ONT adaptive sampling (subtractive) approach that readily permitted the sequencing of the complete genomes of mitochondria, Buchnera and its plasmids (pLeu, pTrp), and Wolbachia genomes in two aphid species, Aphis glycines and Pentalonia nigronervosa. Adaptive sampling is able to retrieve organelles such as mitochondria and symbionts that have high representation in their hosts such as Buchnera and Wolbachia, but is less successful at retrieving symbionts in low concentrations.

Subjects

Buchnera, Nanopores, Animals, Buchnera/genetics, DNA Transposable Elements, Insects/genetics

13.

Exploring the genetic and epigenetic underpinnings of early-onset cancers: Variant prioritization for long read whole genome sequencing from family cancer pedigrees.

Kramer, Melissa; Goodwin, Sara; Wappel, Robert; Borio, Matilde; Offit, Kenneth; Feldman, Darren R; Stadler, Zsofia K; McCombie, W Richard.

bioRxiv ; 2024 Jul 02.

Article in English | MEDLINE | ID: mdl-39005350

ABSTRACT

Despite significant advances in our understanding of genetic cancer susceptibility, known inherited cancer predisposition syndromes explain at most 20% of early-onset cancers. As early-onset cancer prevalence continues to increase, the need to assess previously inaccessible areas of the human genome, harnessing a trio or quad family-based architecture for variant filtration, may reveal further insights into cancer susceptibility. To assess a broader spectrum of variation than can be ascertained by multi-gene panel sequencing, or even whole genome sequencing with short reads, we employed long read whole genome sequencing using an Oxford Nanopore Technologies (ONT) PromethION of 3 families containing an early-onset cancer proband using a trio or quad family architecture. Analysis included 2 early-onset colorectal cancer family trios and one quad consisting of two siblings with testicular cancer, all with unaffected parents. Structural variants (SVs), epigenetic profiles and single nucleotide variants (SNVs) were determined for each individual, and a filtering strategy was employed to refine and prioritize candidate variants based on the family architecture. The family architecture enabled us to focus on inapposite variants while filtering variants shared with the unaffected parents, significantly decreasing background variation that can hamper identification of potentially disease-causing differences. Candidate de novo and compound heterozygous variants were identified in this way. Gene expression, in matched neoplastic and pre-neoplastic lesions, was assessed for one trio. Our study demonstrates the feasibility of a streamlined analysis of genomic variants from long read ONT whole genome sequencing and a way to prioritize key variants for further evaluation of pathogenicity, while revealing what may be missing from panel-based analyses.
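
The family-based filtering described in entry 13 can be illustrated with plain set operations. The sketch below is a simplified assumption about how such a filter might look, not the authors' strategy: variants are represented as `(chrom, pos, ref, alt)` tuples, zygosity and genotype quality are ignored, and the names `candidate_de_novo`, `candidate_compound_het`, and `gene_of` are hypothetical.

```python
def candidate_de_novo(proband, mother, father):
    """Variants seen in the proband but in neither unaffected parent."""
    return proband - mother - father


def candidate_compound_het(proband, mother, father, gene_of):
    """Genes with at least one maternal-only and one paternal-only variant.

    gene_of maps a variant tuple to a gene name; variants without an
    annotation are skipped.
    """
    maternal_only = (proband & mother) - father
    paternal_only = (proband & father) - mother
    maternal_genes = {gene_of[v] for v in maternal_only if v in gene_of}
    paternal_genes = {gene_of[v] for v in paternal_only if v in gene_of}
    return maternal_genes & paternal_genes


# Toy trio (invented coordinates and gene names).
v1 = ("chr1", 1000, "A", "G")   # inherited from mother only, GENE_A
v2 = ("chr1", 2000, "C", "T")   # inherited from father only, GENE_A
v3 = ("chr2", 500, "G", "A")    # present in both parents
v4 = ("chr3", 750, "T", "C")    # absent from both parents

proband = {v1, v2, v3, v4}
mother = {v1, v3}
father = {v2, v3}
gene_of = {v1: "GENE_A", v2: "GENE_A", v3: "GENE_B", v4: "GENE_C"}

print(candidate_de_novo(proband, mother, father))
print(candidate_compound_het(proband, mother, father, gene_of))
```

Variants shared with both parents, such as `v3` here, drop out of both candidate lists, which is the background reduction the abstract refers to. A real analysis would also need genotype and read-depth checks before calling anything de novo.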

14.

Gapless assembly of complete human and plant chromosomes using only nanopore sequencing.

Koren, Sergey; Bao, Zhigui; Guarracino, Andrea; Ou, Shujun; Goodwin, Sara; Jenike, Katharine M; Lucas, Julian; McNulty, Brandy; Park, Jimin; Rautiainen, Mikko; Rhie, Arang; Roelofs, Dick; Schneiders, Harrie; Vrijenhoek, Ilse; Nijbroek, Koen; Ware, Doreen; Schatz, Michael C; Garrison, Erik; Huang, Sanwen; McCombie, W Richard; Miga, Karen H; Wittenberg, Alexander H J; Phillippy, Adam M.

bioRxiv ; 2024 Mar 19.

Article in English | MEDLINE | ID: mdl-38529488

ABSTRACT

The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely-studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.

15.

Mapping medically relevant RNA isoform diversity in the aged human frontal cortex with deep long-read RNA-seq.

Aguzzoli Heberle, Bernardo; Brandon, J Anthony; Page, Madeline L; Nations, Kayla A; Dikobe, Ketsile I; White, Brendan J; Gordon, Lacey A; Fox, Grant A; Wadsworth, Mark E; Doyle, Patricia H; Williams, Brittney A; Fox, Edward J; Shantaraman, Anantharaman; Ryten, Mina; Goodwin, Sara; Ghiban, Elena; Wappel, Robert; Mavruk-Eskipehlivan, Senem; Miller, Justin B; Seyfried, Nicholas T; Nelson, Peter T; Fryer, John D; Ebbert, Mark T W.

Nat Biotechnol ; 2024 May 22.

Article in English | MEDLINE | ID: mdl-38778214

ABSTRACT

Determining whether the RNA isoforms from medically relevant genes have distinct functions could facilitate direct targeting of RNA isoforms for disease treatment. Here, as a step toward this goal for neurological diseases, we sequenced 12 postmortem, aged human frontal cortices (6 Alzheimer disease cases and 6 controls; 50% female) using one Oxford Nanopore PromethION flow cell per sample. We identified 1,917 medically relevant genes expressing multiple isoforms in the frontal cortex where 1,018 had multiple isoforms with different protein-coding sequences. Of these 1,018 genes, 57 are implicated in brain-related diseases including major depression, schizophrenia, Parkinson's disease and Alzheimer disease. Our study also uncovered 53 new RNA isoforms in medically relevant genes, including several where the new isoform was one of the most highly expressed for that gene. We also reported on five mitochondrially encoded, spliced RNA isoforms. We found 99 differentially expressed RNA isoforms between cases with Alzheimer disease and controls.

16.

Nanopore sequencing of 1000 Genomes Project samples to build a comprehensive catalog of human genetic variation.

Gustafson, Jonas A; Gibson, Sophia B; Damaraju, Nikhita; Zalusky, Miranda Pg; Hoekzema, Kendra; Twesigomwe, David; Yang, Lei; Snead, Anthony A; Richmond, Phillip A; De Coster, Wouter; Olson, Nathan D; Guarracino, Andrea; Li, Qiuhui; Miller, Angela L; Goffena, Joy; Anderson, Zachery; Storz, Sophie Hr; Ward, Sydney A; Sinha, Maisha; Gonzaga-Jauregui, Claudia; Clarke, Wayne E; Basile, Anna O; Corvelo, André; Reeves, Catherine; Helland, Adrienne; Musunuri, Rajeeva Lochan; Revsine, Mahler; Patterson, Karynne E; Paschal, Cate R; Zakarian, Christina; Goodwin, Sara; Jensen, Tanner D; Robb, Esther; McCombie, W Richard; Sedlazeck, Fritz J; Zook, Justin M; Montgomery, Stephen B; Garrison, Erik; Kolmogorov, Mikhail; Schatz, Michael C; McLaughlin, Richard N; Dashnow, Harriet; Zody, Michael C; Loose, Matt; Jain, Miten; Eichler, Evan E; Miller, Danny E.

medRxiv ; 2024 Mar 07.

Article in English | MEDLINE | ID: mdl-38496498

ABSTRACT

Less than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.

17.

Era of gapless plant genomes: innovations in sequencing and mapping technologies revolutionize genomics and breeding.

Gladman, Nicholas; Goodwin, Sara; Chougule, Kapeel; McCombie, W Richard; Ware, Doreen.

Curr Opin Biotechnol ; 79: 102886, 2023 02.

Article in English | MEDLINE | ID: mdl-36640454

ABSTRACT

Whole-genome sequencing and assembly have revolutionized plant genetics and molecular biology over the last two decades. However, significant shortcomings in first- and second-generation technology resulted in imperfect reference genomes: numerous and large gaps of low quality or undeterminable sequence in areas of highly repetitive DNA along with limited chromosomal phasing restricted the ability of researchers to characterize regulatory noncoding elements and genic regions that underwent recent duplication events. Recently, advances in long-read sequencing have resulted in the first gapless, telomere-to-telomere (T2T) assemblies of plant genomes. This leap forward has the potential to increase the speed and confidence of genomics and molecular experimentation while reducing costs for the research community.

Subjects

Genomics; Plant Breeding; Sequence Analysis, DNA/methods; Genomics/methods; Genome, Plant/genetics; Plants/genetics; High-Throughput Nucleotide Sequencing/methods; Technology

18.

Long-Read Sequencing Reveals Rapid Evolution of Immunity- and Cancer-Related Genes in Bats.

Scheben, Armin; Mendivil Ramos, Olivia; Kramer, Melissa; Goodwin, Sara; Oppenheim, Sara; Becker, Daniel J; Schatz, Michael C; Simmons, Nancy B; Siepel, Adam; McCombie, W Richard.

Genome Biol Evol ; 15(9), 2023 09 04.

Article in English | MEDLINE | ID: mdl-37728212

ABSTRACT

Bats are exceptional among mammals for their powered flight, extended lifespans, and robust immune systems and therefore have been of particular interest in comparative genomics. Using the Oxford Nanopore Technologies long-read platform, we sequenced the genomes of two bat species with key phylogenetic positions, the Jamaican fruit bat (Artibeus jamaicensis) and the Mesoamerican mustached bat (Pteronotus mesoamericanus), and carried out a comprehensive comparative genomic analysis with a diverse collection of bats and other mammals. The high-quality, long-read genome assemblies revealed a contraction of interferon (IFN)-α at the immunity-related type I IFN locus in bats, resulting in a shift in relative IFN-ω and IFN-α copy numbers. Contradicting previous hypotheses of constitutive expression of IFN-α being a feature of the bat immune system, three bat species lost all IFN-α genes. This shift to IFN-ω could contribute to the increased viral tolerance that has made bats a common reservoir for viruses that can be transmitted to humans. Antiviral genes stimulated by type I IFNs also showed evidence of rapid evolution, including a lineage-specific duplication of IFN-induced transmembrane genes and positive selection in IFIT2. In addition, 33 tumor suppressors and 6 DNA-repair genes showed signs of positive selection, perhaps contributing to increased longevity and reduced cancer rates in bats. The robust immune systems of bats rely on both bat-wide and lineage-specific evolution in the immune gene repertoire, suggesting diverse immune strategies. Our study provides new genomic resources for bats and sheds new light on the extraordinary molecular evolution in this critically important group of mammals.

Subjects

Chiroptera; Neoplasms; Humans; Animals; Chiroptera/genetics; Phylogeny; Evolution, Molecular; Genomics; Longevity; Neoplasms/genetics; Neoplasms/veterinary

19.

Gene recoding by synonymous mutations creates promiscuous intragenic transcription initiation in mycobacteria.

Hegelmeyer, Nuri K; Parkin, Lia A; Previti, Mary L; Andrade, Joshua; Utama, Raditya; Sejour, Richard J; Gardin, Justin; Muller, Stephanie; Ketchum, Steven; Yurovsky, Alisa; Futcher, Bruce; Goodwin, Sara; Ueberheide, Beatrix; Seeliger, Jessica C.

mBio ; 14(5): e0084123, 2023 Oct 31.

Article in English | MEDLINE | ID: mdl-37787543

ABSTRACT

IMPORTANCE: Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis, one of the deadliest infectious diseases worldwide. Previous studies have established that synonymous recoding to introduce rare codon pairings can attenuate viral pathogens. We hypothesized that non-optimal codon pairing could be an effective strategy for attenuating gene expression to create a live vaccine for Mtb. We instead discovered that these synonymous changes enabled the transcription of functional mRNA that initiated in the middle of the open reading frame and from which many smaller protein products were expressed. To our knowledge, this is one of the first reports that synonymous recoding of a gene in any organism can create or induce intragenic transcription start sites.

Subjects

Mycobacterium; Silent Mutation; Codon; RNA, Messenger; Mycobacterium/genetics

20.

Gene recoding by synonymous mutations creates promiscuous intragenic transcription initiation in mycobacteria.

Hegelmeyer, Nuri K; Previti, Mary L; Andrade, Joshua; Utama, Raditya; Sejour, Richard J; Gardin, Justin; Muller, Stephanie; Ketchum, Steven; Yurovsky, Alisa; Futcher, Bruce; Goodwin, Sara; Ueberheide, Beatrix; Seeliger, Jessica C.

bioRxiv ; 2023 Mar 17.

Article in English | MEDLINE | ID: mdl-36993691

ABSTRACT

Each genome encodes some codons more frequently than their synonyms (codon usage bias), but codons are also arranged more frequently into specific pairs (codon pair bias). Recoding viral genomes and yeast or bacterial genes with non-optimal codon pairs has been shown to decrease gene expression. Gene expression is thus importantly regulated not only by the use of particular codons but by their proper juxtaposition. We therefore hypothesized that non-optimal codon pairing could likewise attenuate Mtb genes. We explored the role of codon pair bias by recoding Mtb genes (rpoB, mmpL3, ndh) and assessing their expression in the closely related and tractable model organism M. smegmatis. To our surprise, recoding caused the expression of multiple smaller protein isoforms from all three genes. We confirmed that these smaller proteins were not due to protein degradation, but instead issued from new transcription initiation sites positioned within the open reading frame. New transcripts gave rise to intragenic translation initiation sites, which in turn led to the expression of smaller proteins. We next identified the nucleotide changes associated with these new sites of transcription and translation. Our results demonstrated that apparently benign, synonymous changes can drastically alter gene expression in mycobacteria. More generally, our work expands our understanding of the codon-level parameters that control translation and transcription initiation. IMPORTANCE: Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis, one of the deadliest infectious diseases worldwide. Previous studies have established that synonymous recoding to introduce rare codon pairings can attenuate viral pathogens. We hypothesized that non-optimal codon pairing could be an effective strategy for attenuating gene expression to create a live vaccine for Mtb. We instead discovered that these synonymous changes enabled the transcription of functional mRNA that initiated in the middle of the open reading frame and from which many smaller protein products were expressed. To our knowledge, this is the first report that synonymous recoding of a gene in any organism can create or induce intragenic transcription start sites.
