1.
Nat Rev Genet ; 25(7): 460-475, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38366034

ABSTRACT

Short tandem repeats (STRs) are highly polymorphic sequences throughout the human genome that are composed of repeated copies of a 1-6-bp motif. Over 1 million variable STR loci are known, some of which regulate gene expression and influence complex traits, such as height. Moreover, variants in at least 60 STR loci cause genetic disorders, including Huntington disease and fragile X syndrome. Accurately identifying and genotyping STR variants is challenging, in particular mapping short reads to repetitive regions and inferring expanded repeat lengths. Recent advances in sequencing technology and computational tools for STR genotyping from sequencing data promise to help overcome this challenge and solve genetically unresolved cases and the 'missing heritability' of polygenic traits. Here, we compare STR genotyping methods, analytical tools and their applications to understand the effect of STR variation on health and disease. We identify emergent opportunities to refine genotyping and quality-control approaches as well as to integrate STRs into variant-calling workflows and large cohort analyses.


Subject(s)
Genome, Human , Microsatellite Repeats , Humans , Microsatellite Repeats/genetics , Sequence Analysis, DNA/methods , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing/methods , Genotype
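
The definition given in this abstract (repeated copies of a 1-6-bp motif) can be illustrated with a minimal Python sketch. It is not one of the genotyping tools the review compares. The function names `find_strs` and `is_primitive`, the `min_copies` threshold, and the restriction to perfect repeats in uppercase A/C/G/T sequence are all assumptions made for illustration.

```python
import re

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA')."""
    return motif not in (motif + motif)[1:-1]

def find_strs(seq, min_motif=1, max_motif=6, min_copies=3):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    Assumes seq is an uppercase string over A/C/G/T; interrupted or
    imperfect repeats are deliberately ignored in this sketch.
    """
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-bp motif followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. skip 'AA' repeats, already reported as 'A'
            copies = len(match.group(0)) // k
            hits.append((match.start(), motif, copies))
    return hits

# Five copies of CAG flanked by non-repetitive sequence.
print(find_strs("GGT" + "CAG" * 5 + "TCC"))  # [(3, 'CAG', 5)]
```

A scan of this kind only works on an assembled or reference sequence. It does not address the harder problems the abstract names, such as mapping short reads to repetitive regions or inferring expanded repeat lengths.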
2.
Nature ; 624(7992): 602-610, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38093003

ABSTRACT

Indigenous Australians harbour rich and unique genomic diversity. However, Aboriginal and Torres Strait Islander ancestries are historically under-represented in genomics research and almost completely missing from reference datasets [1-3]. Addressing this representation gap is critical, both to advance our understanding of global human genomic diversity and as a prerequisite for ensuring equitable outcomes in genomic medicine. Here we apply population-scale whole-genome long-read sequencing [4] to profile genomic structural variation across four remote Indigenous communities. We uncover an abundance of large insertion-deletion variants (20-49 bp; n = 136,797), structural variants (50 bp-50 kb; n = 159,912) and regions of variable copy number (>50 kb; n = 156). The majority of variants are composed of tandem repeat or interspersed mobile element sequences (up to 90%) and have not been previously annotated (up to 62%). A large fraction of structural variants appear to be exclusive to Indigenous Australians (12% lower-bound estimate) and most of these are found in only a single community, underscoring the need for broad and deep sampling to achieve a comprehensive catalogue of genomic structural variation across the Australian continent. Finally, we explore short tandem repeats throughout the genome to characterize allelic diversity at 50 known disease loci [5], uncover hundreds of novel repeat expansion sites within protein-coding genes, and identify unique patterns of diversity and constraint among short tandem repeat sequences. Our study sheds new light on the dimensions and dynamics of genomic structural variation within and beyond Australia.


Subject(s)
Australian Aboriginal and Torres Strait Islander Peoples , Genome, Human , Genomic Structural Variation , Humans , Alleles , Australia/ethnology , Australian Aboriginal and Torres Strait Islander Peoples/genetics , Datasets as Topic , DNA Copy Number Variations/genetics , Genetic Loci/genetics , Genetics, Medical , Genomic Structural Variation/genetics , Genomics , INDEL Mutation/genetics , Interspersed Repetitive Sequences/genetics , Microsatellite Repeats/genetics , Genome, Human/genetics
3.
Nature ; 608(7924): 757-765, 2022 08.
Article in English | MEDLINE | ID: mdl-35948641

ABSTRACT

The notion that mobile units of nucleic acid known as transposable elements can operate as genomic controlling elements was put forward over six decades ago [1,2]. However, it was not until the advancement of genomic sequencing technologies that the abundance and repertoire of transposable elements were revealed, and they are now known to constitute up to two-thirds of mammalian genomes [3,4]. The presence of DNA regulatory regions including promoters, enhancers and transcription-factor-binding sites within transposable elements [5-8] has led to the hypothesis that transposable elements have been co-opted to regulate mammalian gene expression and cell phenotype [8-14]. Mammalian transposable elements include recent acquisitions and ancient transposable elements that have been maintained in the genome over evolutionary time. The presence of ancient conserved transposable elements correlates positively with the likelihood of a regulatory function, but functional validation remains an essential step to identify transposable element insertions that have a positive effect on fitness. Here we show that CRISPR-Cas9-mediated deletion of a transposable element, namely the LINE-1 retrotransposon Lx9c11, in mice results in an exaggerated and lethal immune response to virus infection. Lx9c11 is critical for the neogenesis of a non-coding RNA (Lx9c11-RegoS) that regulates genes of the Schlafen family, reduces the hyperinflammatory phenotype and rescues lethality in virus-infected Lx9c11-/- mice. These findings provide evidence that a transposable element can control the immune system to favour host survival during virus infection.


Subject(s)
DNA Transposable Elements , Host Microbial Interactions , Immunity , Retroelements , Virus Diseases , Animals , CRISPR-Cas Systems/genetics , DNA Transposable Elements/genetics , DNA Transposable Elements/immunology , Evolution, Molecular , Host Microbial Interactions/genetics , Host Microbial Interactions/immunology , Immunity/genetics , Mice , RNA, Untranslated/genetics , Regulatory Sequences, Nucleic Acid/genetics , Retroelements/genetics , Retroelements/immunology , Virus Diseases/genetics , Virus Diseases/immunology
4.
Genome Res ; 34(5): 778-783, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38692839

ABSTRACT

In silico simulation of high-throughput sequencing data is a technique used widely in the genomics field. However, there is currently a lack of effective tools for creating simulated data from nanopore sequencing devices, which measure DNA or RNA molecules in the form of time-series current signal data. Here, we introduce Squigulator, a fast and simple tool for simulation of realistic nanopore signal data. Squigulator takes a reference genome, a transcriptome, or read sequences, and generates corresponding raw nanopore signal data. This is compatible with basecalling software from Oxford Nanopore Technologies (ONT) and other third-party tools, thereby providing a useful substrate for development, testing, debugging, validation, and optimization at every stage of a nanopore analysis workflow. The user may generate data with preset parameters emulating specific ONT protocols or noise-free "ideal" data, or they may deterministically modify a range of experimental variables and/or noise parameters to shape the data to their needs. We present a brief example of Squigulator's use, creating simulated data to model the degree to which different parameters impact the accuracy of ONT basecalling and downstream variant detection. This analysis reveals new insights into the nature of ONT data and basecalling algorithms. We provide Squigulator as an open-source tool for the nanopore community.


Subject(s)
Nanopore Sequencing , Software , Nanopore Sequencing/methods , Computer Simulation , High-Throughput Nucleotide Sequencing/methods , Nanopores , Humans , Genomics/methods , Sequence Analysis, DNA/methods , Algorithms
5.
Proc Natl Acad Sci U S A ; 119(4)2022 01 25.
Article in English | MEDLINE | ID: mdl-35074916

ABSTRACT

Pogona vitticeps has female heterogamety (ZZ/ZW), but the master sex-determining gene is unknown, as it is for all reptiles. We show that nr5a1 (Nuclear Receptor Subfamily 5 Group A Member 1), a gene that is essential in mammalian sex determination, has alleles on the Z and W chromosomes (Z-nr5a1 and W-nr5a1), which are both expressed and can recombine. Three transcript isoforms of Z-nr5a1 were detected in gonads of adult ZZ males, two of which encode a functional protein. However, ZW females produced 16 isoforms, most of which contained premature stop codons. The array of transcripts produced by the W-borne allele (W-nr5a1) is likely to produce truncated polypeptides that contain a structurally normal DNA-binding domain and could act as a competitive inhibitor to the full-length intact protein. We hypothesize that an altered configuration of the W chromosome affects the conformation of the primary transcript generating inhibitory W-borne isoforms that suppress testis determination. Under this hypothesis, the genetic sex determination (GSD) system of P. vitticeps is a W-borne dominant female-determining gene that may be controlled epigenetically.


Subject(s)
Alleles , Chromosomes/genetics , RNA Splicing , Sex Determination Processes , Steroidogenic Factor 1/genetics , Amino Acid Sequence , Animals , Chromosomes/chemistry , Female , Gene Dosage , Lizards , Male , Models, Molecular , Molecular Conformation , Protein Conformation , Reptiles , Sex Chromosomes , Sex Factors , Steroidogenic Factor 1/chemistry , Structure-Activity Relationship
6.
J Virol ; 97(11): e0070523, 2023 Nov 30.
Article in English | MEDLINE | ID: mdl-37843370

ABSTRACT

IMPORTANCE: The lack of a reliable method to accurately detect when replication-competent HIV has been cleared is a major challenge in developing a cure. This study introduces a new approach called the HIVepsilon-seq (HIVε-seq) assay, which uses long-read sequencing technology and bioinformatics to scrutinize the HIV genome at the nucleotide level, distinguishing between defective and intact HIV. This study included 30 participants on antiretroviral therapy, including 17 women, and was able to discriminate between defective and genetically intact viruses at the single DNA strand level. The HIVε-seq assay is an improvement over previous methods, as it requires minimal sample, less specialized lab equipment, and offers a shorter turnaround time. The HIVε-seq assay offers a promising new tool for researchers to measure the intact HIV reservoir, advancing efforts towards finding a cure for this devastating disease.


Subject(s)
HIV Infections , HIV , Proviruses , Female , Humans , CD4-Positive T-Lymphocytes , DNA, Viral/genetics , HIV Infections/drug therapy , HIV Infections/epidemiology , HIV Infections/virology , Nucleotides , Proviruses/genetics , Viral Load , Sequence Analysis, DNA , Male , Sex Factors , HIV/genetics
7.
Bioinformatics ; 39(6)2023 06 01.
Article in English | MEDLINE | ID: mdl-37252813

ABSTRACT

MOTIVATION: Nanopore sequencing is emerging as a key pillar in the genomic technology landscape but computational constraints limiting its scalability remain to be overcome. The translation of raw current signal data into DNA or RNA sequence reads, known as 'basecalling', is a major friction in any nanopore sequencing workflow. Here, we exploit the advantages of the recently developed signal data format 'SLOW5' to streamline and accelerate nanopore basecalling on high-performance computing (HPC) and cloud environments. RESULTS: SLOW5 permits highly efficient sequential data access, eliminating a potential analysis bottleneck. To take advantage of this, we introduce Buttery-eel, an open-source wrapper for Oxford Nanopore's Guppy basecaller that enables SLOW5 data access, resulting in performance improvements that are essential for scalable, affordable basecalling. AVAILABILITY AND IMPLEMENTATION: Buttery-eel is available at https://github.com/Psy-Fer/buttery-eel.


Subject(s)
Nanopores , Software , Sequence Analysis, DNA/methods , Genome , Genomics , High-Throughput Nucleotide Sequencing
8.
Genet Med ; 26(5): 101076, 2024 05.
Article in English | MEDLINE | ID: mdl-38258669

ABSTRACT

PURPOSE: Genome sequencing (GS)-specific diagnostic rates in prospective tightly ascertained exome sequencing (ES)-negative intellectual disability (ID) cohorts have not been reported extensively. METHODS: ES, GS, epigenetic signatures, and long-read sequencing diagnoses were assessed in 74 trios with at least moderate ID. RESULTS: The ES diagnostic yield was 42 of 74 (57%). GS diagnoses were made in 9 of 32 (28%) ES-unresolved families. Repeated ES with a contemporary pipeline on the GS-diagnosed families identified 8 of 9 single-nucleotide variations/copy-number variations undetected in older ES, confirming a GS-unique diagnostic rate of 1 in 32 (3%). Episignatures contributed diagnostic information in 9% with GS corroboration in 1 of 32 (3%) and diagnostic clues in 2 of 32 (6%). A genetic etiology for ID was detected in 51 of 74 (69%) families. Twelve candidate disease genes were identified. Contemporary ES followed by GS cost US$4976 (95% CI: $3704; $6969) per diagnosis, and first-line GS cost $7062 (95% CI: $6210; $8475) per diagnosis. CONCLUSION: Performing GS only in ID trios would be cost equivalent to ES if GS were available at $2435, about a 60% reduction from current prices. This study demonstrates that first-line GS achieves a higher diagnostic rate than contemporary ES but at a higher cost.


Subject(s)
Exome Sequencing , Exome , Intellectual Disability , Humans , Intellectual Disability/genetics , Intellectual Disability/diagnosis , Male , Female , Exome/genetics , Exome Sequencing/economics , Cohort Studies , Genetic Testing/economics , Genetic Testing/methods , Whole Genome Sequencing/economics , Child , Genome, Human/genetics , DNA Copy Number Variations/genetics , Polymorphism, Single Nucleotide/genetics , Child, Preschool
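
The percentages reported in the abstract above can be re-derived from the counts it states. The following Python sketch is only an arithmetic check. The helper name `pct` and the variable names are hypothetical, and rounding to whole percentages is an assumption about how the figures were reported.

```python
from fractions import Fraction

def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number, computed exactly."""
    return round(100 * Fraction(numerator, denominator))

trios = 74
es_diagnosed = 42
es_unresolved = trios - es_diagnosed            # families left after ES
gs_diagnosed = 9                                # GS diagnoses among ES-unresolved
total_diagnosed = es_diagnosed + gs_diagnosed   # families with a genetic etiology

print(es_unresolved, total_diagnosed)           # 32 51
print(pct(es_diagnosed, trios))                 # 57  (ES diagnostic yield)
print(pct(gs_diagnosed, es_unresolved))         # 28  (GS diagnoses after ES)
print(pct(1, es_unresolved))                    # 3   (GS-unique diagnostic rate)
print(pct(total_diagnosed, trios))              # 69  (overall diagnostic rate)
```

The counts are internally consistent: 42 ES diagnoses plus 9 GS diagnoses give the 51 of 74 families reported with a genetic etiology.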
9.
Cerebellum ; 2024 May 18.
Article in English | MEDLINE | ID: mdl-38760634

ABSTRACT

The hereditary cerebellar ataxias (HCAs) are rare, progressive neurologic disorders caused by variants in many different genes. Inheritance may follow autosomal dominant, autosomal recessive, X-linked or mitochondrial patterns. The list of genes associated with adult-onset cerebellar ataxia is continuously growing, with several new genes discovered in the last few years. These include short tandem repeat (STR) expansions in RFC1, causing cerebellar ataxia, neuropathy, vestibular areflexia syndrome (CANVAS), FGF14-GAA causing spinocerebellar ataxia type 27B (SCA27B), and THAP11. In addition, the genetic basis for SCA4 has recently been identified as an STR expansion in ZFHX3. Given the large and growing number of genes, and different gene variant types, the approach to diagnostic testing for adult-onset HCA can be complex. Testing methods include targeted evaluation of STR expansions (e.g. SCAs, Friedreich ataxia, fragile X-associated tremor/ataxia syndrome, dentatorubral-pallidoluysian atrophy), next generation sequencing for conventional variants, which may include targeted gene panels, whole exome, or whole genome sequencing, followed by various potential additional tests. This review proposes a diagnostic approach for clinical testing, highlights the challenges with current testing technologies, and discusses future advances which may overcome these limitations. Implementing long-read sequencing has the potential to transform the diagnostic approach in HCA, with the overall aim to improve the diagnostic yield.

10.
J Peripher Nerv Syst ; 29(2): 262-274, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38860315

ABSTRACT

BACKGROUND: Loss-of-function variants in MME (membrane metalloendopeptidase) are a known cause of recessive Charcot-Marie-Tooth Neuropathy (CMT). A deep intronic variant, MME c.1188+428A>G (NM_000902.5), was identified through whole genome sequencing (WGS) of two Australian families with recessive inheritance of axonal CMT using the seqr platform. MME c.1188+428A>G was detected in a homozygous state in Family 1, and in a compound heterozygous state with a known pathogenic MME variant (c.467del; p.Pro156Leufs*14) in Family 2. AIMS: We aimed to determine the pathogenicity of the MME c.1188+428A>G variant through segregation and splicing analysis. METHODS: The splicing impact of the deep intronic MME variant c.1188+428A>G was assessed using an in vitro exon-trapping assay. RESULTS: The exon-trapping assay demonstrated that the MME c.1188+428A>G variant created a novel splice donor site resulting in the inclusion of an 83 bp pseudoexon between MME exons 12 and 13. The incorporation of the pseudoexon into MME transcript is predicted to lead to a coding frameshift and premature termination codon (PTC) in MME exon 14 (p.Ala397ProfsTer47). This PTC is likely to result in nonsense mediated decay (NMD) of MME transcript leading to a pathogenic loss-of-function. INTERPRETATION: To our knowledge, this is the first report of a pathogenic deep intronic MME variant causing CMT. This is of significance as deep intronic variants are missed using whole exome sequencing screening methods. Individuals with CMT should be reassessed for deep intronic variants, with splicing impacts being considered in relation to the potential pathogenicity of variants.


Subject(s)
Charcot-Marie-Tooth Disease , Introns , Pedigree , RNA Splicing , Humans , Charcot-Marie-Tooth Disease/genetics , Male , Female , RNA Splicing/genetics , Introns/genetics , Metalloendopeptidases/genetics , Adult , Mutation
11.
PLoS Genet ; 17(4): e1009465, 2021 04.
Article in English | MEDLINE | ID: mdl-33857129

ABSTRACT

How temperature determines sex remains unknown. A recent hypothesis proposes that conserved cellular mechanisms (calcium and redox; 'CaRe' status) sense temperature and identify genes and regulatory pathways likely to be involved in driving sexual development. We take advantage of the unique sex determining system of the model organism, Pogona vitticeps, to assess predictions of this hypothesis. P. vitticeps has ZZ male: ZW female sex chromosomes whose influence can be overridden in genetic males by high temperatures, causing male-to-female sex reversal. We compare a developmental transcriptome series of ZWf females and temperature sex reversed ZZf females. We demonstrate that early developmental cascades differ dramatically between genetically driven and thermally driven females, later converging to produce a common outcome (ovaries). We show that genes proposed as regulators of thermosensitive sex determination play a role in temperature sex reversal. Our study greatly advances the search for the mechanisms by which temperature determines sex.


Subject(s)
Lizards/genetics , Sex Chromosomes/genetics , Sex Determination Processes/genetics , Transcriptome/genetics , Animals , Female , Lizards/growth & development , Male , Sex Determination Analysis/methods , Temperature , Transcription, Genetic/genetics
12.
BMC Biol ; 21(1): 284, 2023 12 08.
Article in English | MEDLINE | ID: mdl-38066641

ABSTRACT

BACKGROUND: Sea snakes underwent a complete transition from land to sea within the last ~15 million years, yet they remain a conspicuous gap in molecular studies of marine adaptation in vertebrates. RESULTS: Here, we generate four new annotated sea snake genomes, three of these at chromosome-scale (Hydrophis major, H. ornatus and H. curtus), and perform detailed comparative genomic analyses of sea snakes and their closest terrestrial relatives. Phylogenomic analyses highlight the possibility of near-simultaneous speciation at the root of Hydrophis, and synteny maps show intra-chromosomal variations that will be important targets for future adaptation and speciation genomic studies of this system. We then used a strict screen for positive selection in sea snakes (against a background of seven terrestrial snake genomes) to identify genes over-represented in hypoxia adaptation, sensory perception, immune response and morphological development. CONCLUSIONS: We provide the best reference genomes currently available for the prolific and medically important elapid snake radiation. Our analyses highlight the phylogenetic complexity and conserved genome structure within Hydrophis. Positively selected marine-associated genes provide promising candidates for future, functional studies linking genetic signatures to the marine phenotypes of sea snakes and other vertebrates.


Subject(s)
Elapidae , Hydrophiidae , Animals , Elapidae/genetics , Hydrophiidae/genetics , Phylogeny , Chromosomes/genetics
13.
BMC Genomics ; 24(1): 243, 2023 May 05.
Article in English | MEDLINE | ID: mdl-37147622

ABSTRACT

BACKGROUND: Sex determination is the process whereby the bipotential embryonic gonads become committed to differentiate into testes or ovaries. In genetic sex determination (GSD), the sex determining trigger is encoded by a gene on the sex chromosomes, which activates a network of downstream genes; in mammals these include SOX9, AMH and DMRT1 in the male pathway, and FOXL2 in the female pathway. Although mammalian and avian GSD systems have been well studied, few data are available for reptilian GSD systems. RESULTS: We conducted an unbiased transcriptome-wide analysis of gonad development throughout differentiation in central bearded dragon (Pogona vitticeps) embryos with GSD. We found that sex differentiation of transcriptomic profiles occurs at a very early stage, before the gonad consolidates as a body distinct from the gonad-kidney complex. The male pathway genes dmrt1 and amh and the female pathway gene foxl2 play a key role in early sex differentiation in P. vitticeps, but the central player of the mammalian male trajectory, sox9, is not differentially expressed in P. vitticeps at the bipotential stage. The most striking difference from GSD systems of other amniotes is the high expression of the male pathway genes amh and sox9 in female gonads during development. We propose that a default male trajectory progresses if not repressed by a W-linked dominant gene that tips the balance of gene expression towards the female trajectory. Further, weighted gene expression correlation network analysis revealed novel candidates for male and female sex differentiation. CONCLUSION: Our data reveal that interpretation of putative mechanisms of GSD in reptiles cannot solely depend on lessons drawn from mammals.


Subject(s)
Reptiles , Sex Determination Processes , Sex Differentiation , Animals , Female , Male , Gene Expression , Gene Expression Regulation, Developmental , Gonads/metabolism , Reptiles/genetics , Sex Determination Processes/genetics , Sex Differentiation/genetics , SOX9 Transcription Factor/genetics
14.
Bioinformatics ; 38(5): 1443-1446, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34908106

ABSTRACT

MOTIVATION: InterARTIC is an interactive web application for the analysis of viral whole-genome sequencing (WGS) data generated on Oxford Nanopore Technologies (ONT) devices. A graphical interface enables users with no bioinformatics expertise to analyze WGS experiments and reconstruct consensus genome sequences from individual isolates of viruses, such as SARS-CoV-2. InterARTIC is intended to facilitate widespread adoption and standardization of ONT sequencing for viral surveillance and molecular epidemiology. RESULTS: We demonstrate the use of InterARTIC for the analysis of ONT viral WGS data from SARS-CoV-2 and Ebola virus, using a laptop computer or the internal computer on an ONT GridION sequencing device. We showcase the intuitive graphical interface, workflow customization capabilities and job-scheduling system that facilitate execution of small- and large-scale WGS projects on any common virus. AVAILABILITY AND IMPLEMENTATION: InterARTIC is a free, open-source web application implemented in Python that executes best-practice command line workflows from the ARTIC network. The application can be downloaded as a set of pre-compiled binaries that are compatible with all common Linux distributions, Windows with Linux subsystems, MacOSX and ARM systems. All code can be found on GitHub at https://github.com/Psy-Fer/interARTIC/ and documentation can be found at https://github.com/Psy-Fer/interARTIC/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
COVID-19 , Nanopore Sequencing , Nanopores , Humans , SARS-CoV-2/genetics , Software , Genome, Viral
15.
Nat Rev Genet ; 18(8): 473-484, 2017 08.
Article in English | MEDLINE | ID: mdl-28626224

ABSTRACT

Next-generation sequencing (NGS) provides a broad investigation of the genome, and it is being readily applied for the diagnosis of disease-associated genetic features. However, the interpretation of NGS data remains challenging owing to the size and complexity of the genome and the technical errors that are introduced during sample preparation, sequencing and analysis. These errors can be understood and mitigated through the use of reference standards - well-characterized genetic materials or synthetic spike-in controls that help to calibrate NGS measurements and to evaluate diagnostic performance. The informed use of reference standards, and associated statistical principles, ensures rigorous analysis of NGS data and is essential for its future clinical use.


Subject(s)
High-Throughput Nucleotide Sequencing/standards , Sequence Analysis, DNA/standards , Animals , Humans , Reference Standards
16.
J Peripher Nerv Syst ; 27(2): 120-126, 2022 06.
Article in English | MEDLINE | ID: mdl-35224818

ABSTRACT

Biallelic mutations in sorbitol dehydrogenase (SORD) have been recently identified as a common cause of recessive axonal Charcot-Marie-Tooth neuropathy (CMT2). We aimed to assess a novel long-read sequencing approach to overcome current limitations in SORD neuropathy diagnostics due to the SORD2P pseudogene and the phasing of biallelic mutations in recessive disease. We conducted a screen of our Australian whole exome sequencing (WES) CMT cohort to identify individuals with homozygous or compound heterozygous SORD variants. Individuals detected with SORD mutations then underwent long-read sequencing, clinical assessment, and serum sorbitol analysis. An individual was detected with compound heterozygous truncating mutations in SORD exon 7, NM_003104.5:c.625C>T (p.Arg209Ter) and NM_003104.5:c.757del (p.Ala253GlnfsTer27). Subsequent Oxford Nanopore Tech (ONT) long-read sequencing was used to successfully differentiate SORD from the highly homologous non-functional SORD2P pseudogene and confirmed that the mutations were biallelic through haplotype-resolved analysis. The patient presented with axonal sensorimotor polyneuropathy (CMT2) and ulnar neuropathy without compression at the elbow. Burning neuropathic pain in the forearms and feet was also reported and was exacerbated by alcohol consumption and improved with alcohol cessation. UPLC-tandem mass spectrometry confirmed that the patient had elevated serum sorbitol levels (12.0 mg/L) consistent with levels previously observed in patients with biallelic SORD mutations. This represents a novel clinical presentation and expands the phenotype associated with biallelic SORD mutations causing CMT2. Our study is the first report of long-read sequencing for an individual with CMT and demonstrates the utility of this approach for clinical genomics.


Subject(s)
Charcot-Marie-Tooth Disease , L-Iditol 2-Dehydrogenase , Australia , Charcot-Marie-Tooth Disease/diagnosis , Charcot-Marie-Tooth Disease/genetics , Humans , L-Iditol 2-Dehydrogenase/genetics , Mutation , Pedigree , Phenotype , Sorbitol , Exome Sequencing
17.
Trends Genet ; 33(7): 464-478, 2017 07.
Article in English | MEDLINE | ID: mdl-28535931

ABSTRACT

The combination of pervasive transcription and prolific alternative splicing produces a mammalian transcriptome of great breadth and diversity. The majority of transcribed genomic bases are intronic, antisense, or intergenic to protein-coding genes, yielding a plethora of short and long non-protein-coding regulatory RNAs. Long noncoding RNAs (lncRNAs) share most aspects of their biogenesis, processing, and regulation with mRNAs. However, lncRNAs are typically expressed in more restricted patterns, frequently from enhancers, and exhibit almost universal alternative splicing. These features are consistent with their role as modular epigenetic regulators. We describe here the key studies and technological advances that have shaped our understanding of the dimensions, dynamics, and biological relevance of the mammalian noncoding transcriptome.


Subject(s)
RNA, Untranslated/genetics , Transcriptome , Alternative Splicing , Animals , Exons , Humans
18.
Nat Methods ; 13(9): 784-91, 2016 09.
Article in English | MEDLINE | ID: mdl-27502217

ABSTRACT

The identification of genetic variation with next-generation sequencing is confounded by the complexity of the human genome sequence and by biases that arise during library preparation, sequencing and analysis. We have developed a set of synthetic DNA standards, termed 'sequins', that emulate human genetic features and constitute qualitative and quantitative spike-in controls for genome sequencing. Sequencing reads derived from sequins align exclusively to an artificial in silico reference chromosome, rather than the human reference genome, which allows them to be partitioned for parallel analysis. Here we use this approach to represent common and clinically relevant genetic variation, ranging from single nucleotide variants to large structural rearrangements and copy-number variation. We validate the design and performance of sequin standards by comparison to examples in the NA12878 reference genome, and we demonstrate their utility during the detection and quantification of variants. We provide sequins as a standardized, quantitative resource against which human genetic variation can be measured and diagnostic performance assessed.


Subject(s)
DNA Copy Number Variations , DNA/genetics , Genome, Human , Genomics/methods , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Chromosomes, Artificial/chemistry , Chromosomes, Artificial/genetics , DNA/chemical synthesis , DNA/chemistry , Humans , Reference Standards , Sequence Analysis, DNA/standards
19.
Nat Methods ; 13(9): 792-8, 2016 09.
Article in English | MEDLINE | ID: mdl-27502218

ABSTRACT

RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome.


Subject(s)
Gene Expression Profiling/standards , Genes, Synthetic , RNA Splicing , RNA, Messenger/genetics , Sequence Analysis, RNA/standards , Chromosomes, Artificial , Humans , Quality Control , RNA Splicing/genetics , RNA, Messenger/chemical synthesis , RNA, Messenger/chemistry , Reference Standards , Sequence Analysis, RNA/methods