Results 1 - 11 of 11
1.
Bioinformatics ; 35(22): 4754-4756, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31134279

ABSTRACT

SUMMARY: We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci. AVAILABILITY AND IMPLEMENTATION: ExpansionHunter is implemented in C++ and is available under the Apache License Version 2.0. The source code, documentation, and Linux/macOS binaries are available at https://github.com/Illumina/ExpansionHunter/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Microsatellite Repeats , Software , Genotype
2.
Nature ; 489(7414): 101-8, 2012 Sep 06.
Article in English | MEDLINE | ID: mdl-22955620

ABSTRACT

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.


Subjects
DNA/genetics , Encyclopedias as Topic , Human Genome/genetics , Molecular Sequence Annotation , Nucleic Acid Regulatory Sequences/genetics , Genetic Transcription/genetics , Transcriptome/genetics , Alleles , Cell Line , Intergenic DNA/genetics , Genetic Enhancer Elements , Exons/genetics , Gene Expression Profiling , Genes/genetics , Genomics , Humans , Polyadenylation/genetics , Protein Isoforms/genetics , RNA/biosynthesis , RNA/genetics , RNA Editing/genetics , RNA Splicing/genetics , Nucleic Acid Repetitive Sequences/genetics , RNA Sequence Analysis
3.
Bioinformatics ; 32(8): 1220-2, 2016 04 15.
Article in English | MEDLINE | ID: mdl-26647377

ABSTRACT

We describe Manta, a method to discover structural variants and indels from next generation sequencing data. Manta is optimized for rapid germline and somatic analysis, calling structural variants, medium-sized indels and large insertions on standard compute hardware in less than a tenth of the time that comparable methods require to identify only subsets of these variant types: for example NA12878 at 50× genomic coverage is analyzed in less than 20 min. Manta can discover and score variants based on supporting paired and split-read evidence, with scoring models optimized for germline analysis of diploid individuals and somatic analysis of tumor-normal sample pairs. Call quality is similar to or better than comparable methods, as determined by pedigree consistency of germline calls and comparison of somatic calls to COSMIC database variants. Manta consistently assembles a higher fraction of its calls to base-pair resolution, allowing for improved downstream annotation and analysis of clinical significance. We provide Manta as a community resource to facilitate practical and routine structural variant analysis in clinical and research sequencing scenarios. AVAILABILITY AND IMPLEMENTATION: Manta is released under the open-source GPLv3 license. Source code, documentation and Linux binaries are available from https://github.com/Illumina/manta. CONTACT: csaunders@illumina.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing , INDEL Mutation , Neoplasms/genetics , Neoplasm DNA , Genome , Genomics , Humans , Software
4.
Genome Res ; 23(10): 1601-14, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23811145

ABSTRACT

Deep sequencing of mammalian DNA methylomes has uncovered a previously unpredicted number of discrete hypomethylated regions in intergenic space (iHMRs). Here, we combined whole-genome bisulfite sequencing data with extensive gene expression and chromatin-state data to define functional classes of iHMRs, and to reconstruct the dynamics of their establishment in a developmental setting. Comparing HMR profiles in embryonic stem and primary blood cells, we show that iHMRs mark an exclusive subset of active DNase hypersensitive sites (DHS), and that both developmentally constitutive and cell-type-specific iHMRs display chromatin states typical of distinct regulatory elements. We also observe that iHMR changes are more predictive of nearby gene activity than the promoter HMR itself, and that expression of noncoding RNAs within the iHMR accompanies full activation and complete demethylation of mature B cell enhancers. Conserved sequence features corresponding to iHMR transcript start sites, including a discernible TATA motif, suggest a conserved, functional role for transcription in these regions. Similarly, we explored both primate-specific and human population variation at iHMRs, finding that while enhancer iHMRs are more variable in sequence and methylation status than any other functional class, conservation of the TATA box is highly predictive of iHMR maintenance, reflecting the impact of sequence plasticity and transcriptional signals on iHMR establishment. Overall, our analysis allowed us to construct a three-step timeline in which (1) intergenic DHS are pre-established in the stem cell, (2) partial demethylation of blood-specific intergenic DHSs occurs in blood progenitors, and (3) complete iHMR formation and transcription coincide with enhancer activation in lymphoid-specified cells.


Subjects
Chromatin/genetics , DNA Methylation , Intergenic DNA/chemistry , Untranslated RNA/genetics , Transcriptional Regulatory Elements , Animals , B-Lymphocytes/cytology , B-Lymphocytes/physiology , Cell Differentiation , Cell Line , Chromatin/metabolism , CpG Islands , Genetic Enhancer Elements , Molecular Evolution , Female , Gene Expression Profiling , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/physiology , High-Throughput Nucleotide Sequencing , Humans , Lymphopoiesis , Pan troglodytes , Phylogeny , Genetic Promoter Regions , DNA Sequence Analysis , Genetic Transcription Initiation
5.
Genome Res ; 21(9): 1543-51, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21816910

ABSTRACT

High-throughput sequencing of cDNA (RNA-seq) is a widely deployed transcriptome profiling and annotation technique, but questions about the performance of different protocols and platforms remain. We used a newly developed pool of 96 synthetic RNAs of various lengths and GC content, covering a 2^20 concentration range, as spike-in controls to measure sensitivity, accuracy, and biases in RNA-seq experiments as well as to derive standard curves for quantifying the abundance of transcripts. We observed linearity between read density and RNA input over the entire detection range and excellent agreement between replicates, but we observed significantly larger imprecision than expected under pure Poisson sampling errors. We use the control RNAs to directly measure reproducible protocol-dependent biases due to GC content and transcript length as well as stereotypic heterogeneity in coverage across transcripts correlated with position relative to RNA termini and priming sequence bias. These effects lead to biased quantification for short transcripts and individual exons, which is a serious problem for measurements of isoform abundances but can be partially corrected using appropriate models of bias. By using the control RNAs, we derive limits for the discovery and detection of rare transcripts in RNA-seq experiments. By using data collected as part of the model organism and human Encyclopedia of DNA Elements projects (ENCODE and modENCODE), we demonstrate that external RNA controls are a useful resource for evaluating sensitivity and accuracy of RNA-seq experiments for transcriptome discovery and quantification. These quality metrics facilitate comparable analysis across different samples, protocols, and platforms.
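The standard-curve idea in this abstract can be illustrated with a minimal sketch. It assumes a simple least-squares fit in log2-log2 space; the function names and the input values are hypothetical and made up for illustration, and they are not taken from the paper or its data.

```python
import math


def fit_standard_curve(concentrations, read_densities):
    """Least-squares fit of log2(density) = slope * log2(concentration) + intercept."""
    xs = [math.log2(c) for c in concentrations]
    ys = [math.log2(d) for d in read_densities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(read_density, slope, intercept):
    """Invert the fitted curve to estimate input concentration from read density."""
    return 2 ** ((math.log2(read_density) - intercept) / slope)


# Hypothetical spike-in inputs spanning a 2^20 range (arbitrary units).
spike_concentrations = [2 ** k for k in range(0, 21, 4)]
# Toy read densities that are exactly proportional to input (no noise, no bias).
spike_densities = [3.0 * c for c in spike_concentrations]

slope, intercept = fit_standard_curve(spike_concentrations, spike_densities)
estimated = estimate_concentration(3.0 * 1024, slope, intercept)
```

A slope near 1 in log-log space corresponds to the linearity between read density and RNA input that the abstract reports; real data would scatter around the line, and by more than Poisson sampling alone would predict.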


Subjects
RNA/chemistry , RNA Sequence Analysis/standards , Animals , Bias , Gene Expression Profiling , Gene Library , High-Throughput Nucleotide Sequencing/standards , Humans , Quality Control , Reproducibility of Results , Sensitivity and Specificity
6.
Bioinformatics ; 29(1): 15-21, 2013 Jan 01.
Article in English | MEDLINE | ID: mdl-23104886

ABSTRACT

MOTIVATION: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. RESULTS: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. AVAILABILITY AND IMPLEMENTATION: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.
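To make "sequential maximum mappable seed search in uncompressed suffix arrays" more concrete, here is a toy sketch of the general idea. It is not STAR's implementation: the function names, the naive suffix-array construction, and the example sequences are all assumptions chosen for readability, and the clustering and stitching steps are omitted.

```python
from bisect import bisect_left


def build_suffix_array(genome):
    """Suffix start positions in lexicographic order of their suffixes (naive, toy inputs only)."""
    return sorted(range(len(genome)), key=lambda i: genome[i:])


def common_prefix_len(a, b):
    """Length of the longest common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def maximal_mappable_prefix(query, suffixes, suffix_array):
    """Longest prefix of `query` found in the genome, as (length, genome_start).

    The best-matching suffix is always adjacent to the query's insertion
    point in the sorted suffix list, so only two neighbours are checked.
    """
    pos = bisect_left(suffixes, query)
    best_len, best_start = 0, -1
    for j in (pos - 1, pos):
        if 0 <= j < len(suffixes):
            length = common_prefix_len(query, suffixes[j])
            if length > best_len:
                best_len, best_start = length, suffix_array[j]
    return best_len, best_start


def sequential_seed_search(read, genome):
    """Split a read into consecutive seeds, each a maximal mappable prefix of the remainder."""
    suffix_array = build_suffix_array(genome)
    # Materializing every suffix is quadratic in memory; acceptable only for a toy example.
    suffixes = [genome[i:] for i in suffix_array]
    seeds = []
    offset = 0
    while offset < len(read):
        length, start = maximal_mappable_prefix(read[offset:], suffixes, suffix_array)
        if length == 0:
            offset += 1  # base absent from the genome: skip it
            continue
        seeds.append((offset, start, length))  # (read offset, genome start, seed length)
        offset += length
    return seeds


# Hypothetical genome: two 8-base "exons" separated by a 7-base "intron".
toy_genome = "ACGTTGCA" + "GGGGGGG" + "TCATCGAT"
spliced_read = "ACGTTGCA" + "TCATCGAT"
seeds = sequential_seed_search(spliced_read, toy_genome)
# seeds == [(0, 0, 8), (8, 15, 8)]
```

The first seed stops where the read diverges from the genome at the exon boundary, and the search restarts from the unmapped remainder, which is how a splice junction appears without any prior junction annotation.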


Subjects
Sequence Alignment/methods , Software , Algorithms , Cluster Analysis , Gene Expression Profiling , Human Genome , Humans , RNA Splicing , RNA Sequence Analysis/methods
8.
Genome Biol ; 20(1): 291, 2019 12 19.
Article in English | MEDLINE | ID: mdl-31856913

ABSTRACT

Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry. Our analysis shows that Paragraph has better accuracy than other existing genotypers and can be applied to population-scale studies.


Subjects
Genomic Structural Variation , Genotyping Techniques , Human Genome , Humans
9.
Gastrointest Tumors ; 6(1-2): 11-27, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31602373

ABSTRACT

BACKGROUND: Hepatocellular carcinoma (HCC) is now the second-highest cause of cancer death worldwide. Recent studies have discovered a wide range of somatic mutations in HCC. These mutations involve various vital signaling pathways such as Wnt/β-catenin, p53, telomerase reverse transcriptase (TERT), chromatin remodeling, RAS/MAPK signaling, and oxidative stress. However, fusion transcripts have not been broadly explored in HCC. METHODS: To identify novel fusion transcripts in HCC, in the first phase of our study, we performed targeted RNA sequencing (in HCC and paired non-HCC tissues) on 6 patients with a diagnosis of HCC undergoing liver transplantation. RESULTS: As a result of these studies, we discovered the novel fusion transcript, VTI1A-CFAP46. In the second phase of our study, we measured the expression of wild-type VTI1A in 21 HCC specimens, which showed that 10 of 21 exhibited upregulation of wild-type VTI1A in their tumors. VTI1A (Vesicle Transport via Interaction with t-SNARE homolog 1A) is a member of the Soluble N-ethylmaleimide-Sensitive Factor (NSF) attachment protein receptor (SNARE) gene family, which is essential for membrane trafficking and functions in endocytosis, autophagy, and Golgi transport. Notably, it is known that autophagy is involved in HCC. CONCLUSIONS: The link between novel fusion transcript VTI1A-CFAP46 and autophagy as a potential therapeutic target in HCC patients deserves further investigation. Moreover, this study shows that fusion transcripts are worthy of additional exploration in HCC.

10.
Forensic Sci Int Genet ; 28: 52-70, 2017 05.
Article in English | MEDLINE | ID: mdl-28171784

ABSTRACT

Human DNA profiling using PCR at polymorphic short tandem repeat (STR) loci followed by capillary electrophoresis (CE) size separation and length-based allele typing has been the standard in the forensic community for over 20 years. Over the last decade, Next-Generation Sequencing (NGS) matured rapidly, bringing modern advantages to forensic DNA analysis. The MiSeq FGx™ Forensic Genomics System, comprised of the ForenSeq™ DNA Signature Prep Kit, MiSeq FGx™ Reagent Kit, MiSeq FGx™ instrument and ForenSeq™ Universal Analysis Software, uses PCR to simultaneously amplify up to 231 forensic loci in a single multiplex reaction. Targeted loci include Amelogenin, 27 common, forensic autosomal STRs, 24 Y-STRs, 7 X-STRs and three classes of single nucleotide polymorphisms (SNPs). The ForenSeq™ kit includes two primer sets: Amelogenin, 58 STRs and 94 identity informative SNPs (iiSNPs) are amplified using DNA Primer Mix A (DPMA; 153 loci); if a laboratory chooses to generate investigative leads using DNA Primer Mix B, amplification is targeted to the 153 loci in DPMA plus 22 phenotypic informative (piSNPs) and 56 biogeographical ancestry SNPs (aiSNPs). High-resolution genotypes, including detection of intra-STR sequence variants, are semi-automatically generated with the ForenSeq™ software. This system was subjected to developmental validation studies according to the 2012 Revised SWGDAM Validation Guidelines. A two-step PCR first amplifies the target forensic STR and SNP loci (PCR1); unique, sample-specific indexed adapters or "barcodes" are attached in PCR2. Approximately 1736 ForenSeq™ reactions were analyzed. Studies include DNA substrate testing (cotton swabs, FTA cards, filter paper), species studies from a range of nonhuman organisms, DNA input sensitivity studies from 1 ng down to 7.8 pg, two-person human DNA mixture testing with three genotype combinations, stability analysis of partially degraded DNA, and effects of five commonly encountered PCR inhibitors. Calculations from ForenSeq™ STR and SNP repeatability and reproducibility studies (1 ng template) indicate 100.0% accuracy of the MiSeq FGx™ System in allele calling relative to CE for STRs (1260 samples), and >99.1% accuracy relative to bead array typing for SNPs (1260 samples for iiSNPs, 310 samples for aiSNPs and piSNPs), with >99.0% and >97.8% precision, respectively. Call rates of >99.0% were observed for all STRs and SNPs amplified with both ForenSeq™ primer mixes. Limitations of the MiSeq FGx™ System are discussed. Results described here demonstrate that the MiSeq FGx™ System meets forensic DNA quality assurance guidelines with robust, reliable, and reproducible performance on samples of various quantities and qualities.
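The locus counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The variable names below are illustrative only; the numbers are the ones stated in the abstract.

```python
# Locus counts as stated in the abstract.
str_counts = {"autosomal": 27, "Y": 24, "X": 7}
amelogenin = 1
identity_snps = 94    # iiSNPs
phenotypic_snps = 22  # piSNPs
ancestry_snps = 56    # aiSNPs

total_strs = sum(str_counts.values())
primer_set_a_loci = amelogenin + total_strs + identity_snps
primer_set_b_loci = primer_set_a_loci + phenotypic_snps + ancestry_snps
```

The totals agree with the 58 STRs, 153 loci for the first primer set, and the "up to 231 forensic loci" figure given for the full multiplex.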


Subjects
DNA Fingerprinting , High-Throughput Nucleotide Sequencing/instrumentation , Amelogenin/genetics , Animals , Female , Genotype , Humans , Male , Microsatellite Repeats , Polymerase Chain Reaction , Single Nucleotide Polymorphism , Reproducibility of Results , Species Specificity
11.
Nat Genet ; 43(12): 1179-85, 2011 Oct 23.
Article in English | MEDLINE | ID: mdl-22019781

ABSTRACT

Many animal species use a chromosome-based mechanism of sex determination, which has led to the coordinate evolution of dosage-compensation systems. Dosage compensation not only corrects the imbalance in the number of X chromosomes between the sexes but also is hypothesized to correct dosage imbalance within cells that is due to monoallelic X-linked expression and biallelic autosomal expression, by upregulating X-linked genes twofold (termed 'Ohno's hypothesis'). Although this hypothesis is well supported by expression analyses of individual X-linked genes and by microarray-based transcriptome analyses, it was challenged by a recent study using RNA sequencing and proteomics. We obtained new, independent RNA-seq data, measured RNA polymerase distribution and reanalyzed published expression data in mammals, C. elegans and Drosophila. Our analyses, which take into account the skewed gene content of the X chromosome, support the hypothesis of upregulation of expressed X-linked genes to balance expression of the genome.


Subjects
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Gene Expression Regulation , X-Linked Genes , Animals , Cell Line , Genetic Dosage Compensation , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Female , Gene Expression Profiling , Humans , Male , Mice , Oligonucleotide Array Sequence Analysis , Organ Specificity , Ovary/metabolism , RNA Polymerase II/metabolism , Testis/metabolism , Genetic Transcription , Up-Regulation , X Chromosome/genetics