Results 1 - 15 of 15
1.
Genome Res ; 24(1): 14-24, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24092820

ABSTRACT

Understanding the consequences of regulatory variation in the human genome remains a major challenge, with important implications for understanding gene regulation and interpreting the many disease-risk variants that fall outside of protein-coding regions. Here, we provide a direct window into the regulatory consequences of genetic variation by sequencing RNA from 922 genotyped individuals. We present a comprehensive description of the distribution of regulatory variation--by the specific expression phenotypes altered, the properties of affected genes, and the genomic characteristics of regulatory variants. We detect variants influencing expression of over ten thousand genes, and through the enhanced resolution offered by RNA-sequencing, for the first time we identify thousands of variants associated with specific phenotypes including splicing and allelic expression. Evaluating the effects of both long-range intra-chromosomal and trans (cross-chromosomal) regulation, we observe modularity in the regulatory network, with three-dimensional chromosomal configuration playing a particular role in regulatory modules within each chromosome. We also observe a significant depletion of regulatory variants affecting central and critical genes, along with a trend of reduced effect sizes as variant frequency increases, providing evidence that purifying selection and buffering have limited the deleterious impact of regulatory variation on the cell. Further, generalizing beyond observed variants, we have analyzed the genomic properties of variants associated with expression and splicing and developed a Bayesian model to predict regulatory consequences of genetic variants, applicable to the interpretation of individual genomes and disease studies. Together, these results represent a critical step toward characterizing the complete landscape of human regulatory variation.


Subjects
Genetic Variation, Quantitative Trait Loci, RNA Sequence Analysis, Transcriptome, Bayes Theorem, Human Chromosomes, Human Genome, Genotype, Humans, Phenotype, Single Nucleotide Polymorphism, Ribonucleic Acid Regulatory Sequences
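
As a rough illustration of the kind of genotype-expression association that underlies an expression QTL, the sketch below fits a least-squares slope of expression on alternate-allele dosage. The function name `eqtl_slope` and the toy data are hypothetical; the abstract does not describe the study's actual association test, so this should be read as a generic example rather than the authors' method.

```python
from statistics import mean

def eqtl_slope(dosages, expression):
    """Least-squares slope of expression on alternate-allele dosage (0, 1 or 2)."""
    if len(dosages) != len(expression):
        raise ValueError("one expression value is needed per genotype")
    d_mean, e_mean = mean(dosages), mean(expression)
    # Variance and covariance sums (unnormalised; the factors cancel in the ratio).
    var = sum((d - d_mean) ** 2 for d in dosages)
    if var == 0:
        raise ValueError("genotype is monomorphic in this sample")
    cov = sum((d - d_mean) * (e - e_mean) for d, e in zip(dosages, expression))
    return cov / var

# Hypothetical data: six individuals, expression rising with allele dosage.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope = eqtl_slope(dosages, expression)
```

The slope is the estimated change in expression per copy of the alternate allele; a real analysis would also need a significance test and correction for covariates and multiple testing.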
2.
Nature ; 463(7278): 191-6, 2010 Jan 14.
Article in English | MEDLINE | ID: mdl-20016485

ABSTRACT

All cancers carry somatic mutations. A subset of these somatic alterations, termed driver mutations, confer selective growth advantage and are implicated in cancer development, whereas the remainder are passengers. Here we have sequenced the genomes of a malignant melanoma and a lymphoblastoid cell line from the same person, providing the first comprehensive catalogue of somatic mutations from an individual cancer. The catalogue provides remarkable insights into the forces that have shaped this cancer genome. The dominant mutational signature reflects DNA damage due to ultraviolet light exposure, a known risk factor for malignant melanoma, whereas the uneven distribution of mutations across the genome, with a lower prevalence in gene footprints, indicates that DNA repair has been preferentially deployed towards transcribed regions. The results illustrate the power of a cancer genome sequence to reveal traces of the DNA damage, repair, mutation and selection processes that were operative years before the cancer became symptomatic.


Subjects
Neoplasm Genes/genetics, Human Genome/genetics, Mutation/genetics, Neoplasms/genetics, Adult, Tumor Cell Line, DNA Damage/genetics, DNA Mutational Analysis, DNA Repair/genetics, Gene Dosage/genetics, Humans, Loss of Heterozygosity/genetics, Male, Melanoma/etiology, Melanoma/genetics, MicroRNAs/genetics, Insertional Mutagenesis/genetics, Neoplasms/etiology, Single Nucleotide Polymorphism/genetics, Precision Medicine, Sequence Deletion/genetics, Ultraviolet Rays
3.
Nature ; 452(7184): 215-9, 2008 Mar 13.
Article in English | MEDLINE | ID: mdl-18278030

ABSTRACT

Cytosine DNA methylation is important in regulating gene expression and in silencing transposons and other repetitive sequences. Recent genomic studies in Arabidopsis thaliana have revealed that many endogenous genes are methylated either within their promoters or within their transcribed regions, and that gene methylation is highly correlated with transcription levels. However, plants have different types of methylation controlled by different genetic pathways, and detailed information on the methylation status of each cytosine in any given genome is lacking. To this end, we generated a map at single-base-pair resolution of methylated cytosines for Arabidopsis, by combining bisulphite treatment of genomic DNA with ultra-high-throughput sequencing using the Illumina 1G Genome Analyser and Solexa sequencing technology. This approach, termed BS-Seq, unlike previous microarray-based methods, allows one to sensitively measure cytosine methylation on a genome-wide scale within specific sequence contexts. Here we describe methylation on previously inaccessible components of the genome and analyse the DNA methylation sequence composition and distribution. We also describe the effect of various DNA methylation mutants on genome-wide methylation patterns, and demonstrate that our newly developed library construction and computational methods can be applied to large genomes such as that of mouse.


Subjects
Arabidopsis/genetics, DNA Methylation, Plant Genome/genetics, DNA Sequence Analysis/methods, Sulfites/metabolism, 5-Methylcytosine/metabolism, Animals, Base Sequence, Computational Biology, Cytosine/metabolism, Plant Gene Expression Regulation/genetics, Gene Library, Mice, Mutation/genetics, Genetic Promoter Regions/genetics, Reproducibility of Results, Uracil/metabolism
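
The idea of measuring methylation per cytosine within a sequence context can be sketched as follows. Bisulphite treatment converts unmethylated cytosine to uracil, which is read as T, while methylated cytosine is still read as C, so the methylation level at a site can be estimated as the fraction of reads retaining C. The CG/CHG/CHH context classes (H = A, C or T) are the usual plant convention and are an assumption here, as are the function names; the abstract does not spell out the authors' computational pipeline.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at index i of a forward-strand sequence as CG, CHG or CHH."""
    if seq[i] != "C":
        raise ValueError("position is not a cytosine")
    nxt = seq[i + 1 : i + 3]  # up to two downstream bases
    if len(nxt) < 2 and not nxt.startswith("G"):
        return None  # too close to the sequence end to classify
    if nxt[0] == "G":
        return "CG"
    if nxt[1] == "G":
        return "CHG"
    return "CHH"

def methylation_level(c_reads, t_reads):
    """Fraction of reads retaining C at a cytosine after bisulphite conversion."""
    total = c_reads + t_reads
    return c_reads / total if total else None

# Hypothetical site: 8 reads show C (methylated), 2 show T (converted).
level = methylation_level(8, 2)
context = cytosine_context("ACGT", 1)
```

This sketch ignores the reverse strand, ambiguous bases and incomplete conversion, all of which a real analysis would have to handle.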
4.
Nat Methods ; 5(3): 247-52, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18297082

ABSTRACT

High-density single-nucleotide polymorphism (SNP) arrays have revolutionized the ability of genome-wide association studies to detect genomic regions harboring sequence variants that affect complex traits. Extensive numbers of validated SNPs with known allele frequencies are essential to construct genotyping assays with broad utility. We describe an economical, efficient, single-step method for SNP discovery, validation and characterization that uses deep sequencing of reduced representation libraries (RRLs) from specified target populations. Using nearly 50 million sequences generated on an Illumina Genome Analyzer from DNA of 66 cattle representing three populations, we identified 62,042 putative SNPs and predicted their allele frequencies. Genotype data for these 66 individuals validated 92% of 23,357 selected genome-wide SNPs, with a genotypic and sequence allele frequency correlation of r = 0.67. This approach for simultaneous de novo discovery of high-quality SNPs and population characterization of allele frequencies may be applied to any species with at least a partially sequenced genome.


Subjects
Computational Biology/methods, Gene Frequency, Single Nucleotide Polymorphism, DNA Sequence Analysis/methods, Animals, Cattle, Genomic Library, Genotype
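
The comparison of sequence-derived and genotype-derived allele frequencies reported in the abstract above (r = 0.67) can be illustrated with a minimal sketch: estimate an allele frequency from pooled read counts, then compute a Pearson correlation against genotype-based frequencies. The function names and the toy numbers are hypothetical, and the abstract does not state exactly how the authors predicted frequencies, so the read-fraction estimator is an assumption.

```python
from math import sqrt

def allele_frequency(alt_reads, total_reads):
    """Naive allele-frequency estimate: fraction of reads carrying the alternate allele."""
    if total_reads <= 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total_reads

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical SNPs: (alternate reads, total reads) from pooled sequencing.
read_counts = [(3, 12), (10, 20), (9, 10)]
sequence_freqs = [allele_frequency(a, t) for a, t in read_counts]
# Hypothetical frequencies for the same SNPs from individual genotypes.
genotype_freqs = [0.30, 0.45, 0.80]
r = pearson_r(sequence_freqs, genotype_freqs)
```

With shallow coverage per site the read fraction is a noisy estimator, which is one plausible reason a correlation well below 1 would be observed.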
5.
Methods Mol Biol ; 354: 105-19, 2007.
Article in English | MEDLINE | ID: mdl-17172749

ABSTRACT

Massively parallel signature sequencing is a sequencing-based method that provides quantitative gene expression data for nearly all transcripts in a particular ribonucleic acid sample. Although the sequencing technology is practiced as a service by a California-based company, we have developed methods for the handling and analysis of these data. This chapter describes the steps involved in obtaining data from massively parallel signature sequencing, aligning the signatures to genomic sequence, identifying novel transcripts, and performing quantitative analyses of genes expressed under conditions such as disease treatments.


Subjects
Arabidopsis/genetics, Arabidopsis/immunology, Plant Gene Expression Regulation, Plant Genes/genetics, Plant Diseases/genetics, DNA Sequence Analysis/methods, Genetic Databases, Gene Library, Plant Diseases/immunology, Messenger RNA/analysis, Messenger RNA/genetics, Plant RNA/analysis, Plant RNA/genetics, User-Computer Interface
6.
Nat Biotechnol ; 22(8): 1006-11, 2004 Aug.
Article in English | MEDLINE | ID: mdl-15247925

ABSTRACT

Large-scale sequencing of short mRNA-derived tags can establish the qualitative and quantitative characteristics of a complex transcriptome. We sequenced 12,304,362 tags from five diverse libraries of Arabidopsis thaliana using massively parallel signature sequencing (MPSS). A total of 48,572 distinct signatures, each representing a different transcript, were expressed at significant levels. These signatures were compared to the annotation of the A. thaliana genomic sequence; in the five libraries, this comparison yielded between 17,353 and 18,361 genes with sense expression, and between 5,487 and 8,729 genes with antisense expression. An additional 6,691 MPSS signatures mapped to unannotated regions of the genome. Expression was demonstrated for 1,168 genes for which expression data were previously unknown. Alternative polyadenylation was observed for more than 25% of A. thaliana genes transcribed in these libraries. The MPSS expression data suggest that the A. thaliana transcriptome is complex and contains many as-yet uncharacterized variants of normal coding transcripts.


Subjects
Arabidopsis Proteins/genetics, Arabidopsis Proteins/metabolism, Arabidopsis/genetics, Arabidopsis/metabolism, Sequence Alignment/methods, RNA Sequence Analysis/methods, Genetic Transcription/genetics, Computing Methodologies, Expressed Sequence Tags, Gene Expression Profiling/methods, Plant Gene Expression Regulation/genetics, Plant Genome, Peptide Library
7.
BMC Genomics ; 7: 310, 2006 Dec 08.
Article in English | MEDLINE | ID: mdl-17156450

ABSTRACT

BACKGROUND: Rice blast, caused by the fungal pathogen Magnaporthe grisea, is a devastating disease causing tremendous yield loss in rice production. The public availability of the complete genome sequence of M. grisea provides ample opportunities to understand the molecular mechanism of its pathogenesis on rice plants at the transcriptome level. To identify all the expressed genes encoded in the fungal genome, we have analyzed the mycelium and appressorium transcriptomes using massively parallel signature sequencing (MPSS), robust-long serial analysis of gene expression (RL-SAGE) and oligoarray methods. RESULTS: The MPSS analyses identified 12,531 and 12,927 distinct significant tags from mycelia and appressoria, respectively, while the RL-SAGE analysis identified 16,580 distinct significant tags from the mycelial library. When matching these 12,531 mycelial and 12,927 appressorial significant tags to the annotated CDS, 500 bp upstream and 500 bp downstream of CDS, 6,735 unique genes in mycelia and 7,686 unique genes in appressoria were identified. A total of 7,135 mycelium-specific and 7,531 appressorium-specific significant MPSS tags were identified, which correspond to 2,088 and 1,784 annotated genes, respectively, when matching to the same set of reference sequences. Nearly 85% of the significant MPSS tags from mycelia and appressoria and 65% of the significant tags from the RL-SAGE mycelium library matched to the M. grisea genome. MPSS and RL-SAGE methods supported the expression of more than 9,000 genes, representing over 80% of the predicted genes in M. grisea. About 40% of the MPSS tags and 55% of the RL-SAGE tags represent novel transcripts since they had no matches in the existing M. grisea EST collections. Over 19% of the annotated genes were found to produce both sense and antisense tags in the protein-coding region. The oligoarray analysis identified the expression of 3,793 mycelium-specific and 4,652 appressorium-specific genes. A total of 2,430 mycelial genes and 1,886 appressorial genes were identified by both MPSS and oligoarray. CONCLUSION: The comprehensive and deep transcriptome analysis by MPSS and RL-SAGE methods identified many novel sense and antisense transcripts in the M. grisea genome at two important growth stages. The differentially expressed transcripts that were identified, especially those specifically expressed in appressoria, represent a genomic resource useful for gaining a better understanding of the molecular basis of M. grisea pathogenicity. Further analysis of the novel antisense transcripts will provide new insights into the regulation and function of these genes in fungal growth, development and pathogenesis in the host plants.


Subjects
Fungal Gene Expression Regulation, Magnaporthe/genetics, Oligonucleotide Array Sequence Analysis, Genetic Transcription, Fungal DNA/genetics, Expressed Sequence Tags, Genetic Techniques, Magnaporthe/pathogenicity, Mycelium/genetics, Antisense RNA/genetics
8.
Methods Mol Biol ; 331: 285-311, 2006.
Article in English | MEDLINE | ID: mdl-16881523

ABSTRACT

Massively parallel signature sequencing is an ultra-high throughput sequencing technology. It can simultaneously sequence millions of sequence tags, and, therefore, is ideal for whole genome analysis. When applied to expression profiling, it reveals almost every transcript in the sample and provides its accurate expression level. This chapter describes the technology and its application in establishing stem cell transcriptome databases.


Subjects
Genetic Databases, Gene Expression Profiling/methods, Genomics/methods, Pluripotent Stem Cells/physiology, Genetic Transcription, Cell Culture Techniques/methods, Gene Library, Human Genome, Humans, Pluripotent Stem Cells/cytology, DNA Sequence Analysis/methods
9.
Nat Genet ; 44(7): 751-9, 2012 Jun 10.
Article in English | MEDLINE | ID: mdl-22683710

ABSTRACT

The molecular pathogenesis of renal cell carcinoma (RCC) is poorly understood. Whole-genome and exome sequencing followed by innovative tumorgraft analyses (to accurately determine mutant allele ratios) identified several putative two-hit tumor suppressor genes, including BAP1. The BAP1 protein, a nuclear deubiquitinase, is inactivated in 15% of clear cell RCCs. BAP1 cofractionates with and binds to HCF-1 in tumorgrafts. Mutations disrupting the HCF-1 binding motif impair BAP1-mediated suppression of cell proliferation but not deubiquitination of monoubiquitinated histone 2A lysine 119 (H2AK119ub1). BAP1 loss sensitizes RCC cells in vitro to genotoxic stress. Notably, mutations in BAP1 and PBRM1 anticorrelate in tumors (P = 3 × 10⁻⁵) [corrected], and combined loss of BAP1 and PBRM1 in a few RCCs was associated with rhabdoid features (q = 0.0007). BAP1 and PBRM1 regulate seemingly different gene expression programs, and BAP1 loss was associated with high tumor grade (q = 0.0005). Our results establish the foundation for an integrated pathological and molecular genetic classification of RCC, paving the way for subtype-specific treatments exploiting genetic vulnerabilities.


Subjects
Renal Cell Carcinoma/genetics, Renal Cell Carcinoma/pathology, Kidney Neoplasms/genetics, Kidney Neoplasms/pathology, Tumor Suppressor Proteins/deficiency, Tumor Suppressor Proteins/genetics, Ubiquitin Thiolesterase/deficiency, Ubiquitin Thiolesterase/genetics, Aged, Renal Cell Carcinoma/metabolism, Cell Growth Processes/physiology, Cultured Cells, DNA-Binding Proteins, Exome, Female, Gene Expression/genetics, Host Cell Factor C1/genetics, Host Cell Factor C1/metabolism, Humans, Kidney Neoplasms/metabolism, Male, Middle Aged, Mutation, Nuclear Proteins/genetics, Nuclear Proteins/metabolism, Protein Interaction Domains and Motifs, Transcription Factors/genetics, Transcription Factors/metabolism, Tumor Suppressor Proteins/metabolism, Ubiquitin Thiolesterase/metabolism
10.
Genome Biol ; 11(10): R102, 2010.
Article in English | MEDLINE | ID: mdl-20961407

ABSTRACT

BACKGROUND: A comprehensive transcriptome survey, or gene atlas, provides information essential for a complete understanding of the genomic biology of an organism. We present an atlas of RNA abundance for 92 adult, juvenile and fetal cattle tissues and three cattle cell lines. RESULTS: The Bovine Gene Atlas was generated from 7.2 million unique digital gene expression tag sequences (300.2 million total raw tag sequences), from which 1.59 million unique tag sequences were identified that mapped to the draft bovine genome accounting for 85% of the total raw tag abundance. Filtering these tags yielded 87,764 unique tag sequences that unambiguously mapped to 16,517 annotated protein-coding loci in the draft genome accounting for 45% of the total raw tag abundance. Clustering of tissues based on tag abundance profiles generally confirmed ontology classification based on anatomy. There were 5,429 constitutively expressed loci and 3,445 constitutively expressed unique tag sequences mapping outside annotated gene boundaries that represent a resource for enhancing current gene models. Physical measures such as inferred transcript length or antisense tag abundance identified tissues with atypical transcriptional tag profiles. We report for the first time the tissue-specific variation in the proportion of mitochondrial transcriptional tag abundance. CONCLUSIONS: The Bovine Gene Atlas is the deepest and broadest transcriptome survey of any livestock genome to date. Commonalities and variation in sense and antisense transcript tag profiles identified in different tissues facilitate the examination of the relationship between gene expression, tissue, and gene function.


Subjects
Cattle/genetics, Expressed Sequence Tags, Genome, Molecular Sequence Annotation, Animals, Cattle/classification, Cell Line, Chromosome Mapping, Female, Gene Expression, Gene Expression Profiling, Mitochondrial Genes, Male, Molecular Sequence Annotation/methods, Proteomics
11.
Proc Natl Acad Sci U S A ; 104(7): 2313-8, 2007 Feb 13.
Article in English | MEDLINE | ID: mdl-17277080

ABSTRACT

Compared with understanding of biological shape and form, knowledge is sparse regarding what regulates growth and body size of a species. For example, the genetic and physiological causes of heterosis (hybrid vigor) have remained elusive for nearly a century. Here, we investigate gene-expression patterns underlying growth heterosis in the Pacific oyster (Crassostrea gigas) in two partially inbred (f = 0.375) and two hybrid larval populations produced by a reciprocal cross between the two inbred families. We cloned cDNA and generated 4.5 M sequence tags with massively parallel signature sequencing. The sequences contain 23,274 distinct signatures that are expressed at statistically nonzero levels and show a highly positively skewed distribution with median and modal counts of 9.25 and 3 transcripts per million, respectively. For nearly half of these signatures, expression level depends on genotype and is predominantly nonadditive (hybrids deviate from the inbred average). Statistical contrasts suggest approximately 350 candidate genes for growth heterosis that exhibit concordant nonadditive expression in reciprocal hybrids; this represents only approximately 1.5% of the >20,000 transcripts. Patterns of gene expression, which include dominance for low expression and even underdominance of expression, are more complex than predicted from classical dominant or overdominant explanations of heterosis. Preliminary identification of ribosomal proteins among candidate genes supports the suggestion from previous studies that efficiency of protein metabolism plays a role in growth heterosis.


Subjects
Crassostrea/genetics, Gene Expression Regulation/physiology, Growth/genetics, Hybrid Vigor, Larva/genetics, Messenger RNA/analysis, Animals, Genome, Molecular Sequence Data, Ribosomal Proteins/analysis, Ribosomal Proteins/genetics
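
Two quantities in the abstract above lend themselves to a small worked example: expression in transcripts per million, and nonadditive expression, meaning a hybrid that deviates from the average of the two inbred parents. The sketch below assumes raw signature counts per library and uses hypothetical function names and values; it is not the statistical contrast the authors applied.

```python
def transcripts_per_million(counts):
    """Scale raw signature counts so that a library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {sig: c * 1_000_000 / total for sig, c in counts.items()}

def midparent_deviation(hybrid, parent_a, parent_b):
    """Hybrid expression minus the inbred (mid-parent) average; zero means additive."""
    return hybrid - (parent_a + parent_b) / 2

# Hypothetical library with two signatures.
tpm = transcripts_per_million({"sig_a": 1, "sig_b": 3})
# Hypothetical expression values (in TPM) for one signature in a hybrid and both parents.
deviation = midparent_deviation(10.0, 4.0, 8.0)
```

A positive or negative deviation by itself does not establish nonadditivity; that call would need replicate libraries and a test against sampling noise.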
12.
Proc Natl Acad Sci U S A ; 104(41): 16245-50, 2007 Oct 09.
Article in English | MEDLINE | ID: mdl-17913878

ABSTRACT

Transcription factors play a key role in integrating and modulating biological information. In this study, we comprehensively measured the changing abundances of mRNAs over a time course of activation of human peripheral-blood-derived mononuclear cells ("macrophages") with lipopolysaccharide. Global and dynamic analysis of transcription factors in response to a physiological stimulus has yet to be achieved in a human system, and our efforts significantly advanced this goal. We used multiple global high-throughput technologies for measuring mRNA levels, including massively parallel signature sequencing and GeneChip microarrays. We identified 92 of 1,288 known human transcription factors as having significantly measurable changes during our 24-h time course. At least 42 of these changes were previously unidentified in this system. Our data demonstrate that some transcription factors operate in a functional range below 10 transcripts per cell, whereas others operate in a range three orders of magnitude greater. The highly reproducible response of many mRNAs indicates feedback control. A broad range of activation kinetics was observed; thus, combinatorial regulation by small subsets of transcription factors would permit almost any timing input to cis-regulatory elements controlling gene transcription.


Subjects
Mononuclear Leukocytes/drug effects, Mononuclear Leukocytes/metabolism, Lipopolysaccharides/pharmacology, Transcription Factors/genetics, Gene Expression/drug effects, Humans, In Vitro Techniques, Macrophages/drug effects, Macrophages/metabolism, Oligonucleotide Array Sequence Analysis, Messenger RNA/genetics, Systems Biology
13.
Science ; 309(5740): 1567-9, 2005 Sep 02.
Article in English | MEDLINE | ID: mdl-16141074

ABSTRACT

Small RNAs play important regulatory roles in most eukaryotes, but only a small proportion of these molecules have been identified. We sequenced more than two million small RNAs from seedlings and the inflorescence of the model plant Arabidopsis thaliana. Known and new microRNAs (miRNAs) were among the most abundant of the nonredundant set of more than 75,000 sequences, whereas more than half represented lower abundance small interfering RNAs (siRNAs) that match repetitive sequences, intergenic regions, and genes. Individual or clusters of highly regulated small RNAs were readily observed. Targets of antisense RNA or miRNA did not appear to be preferentially associated with siRNAs. Many genomic regions previously considered featureless were found to be sites of numerous small RNAs.


Subjects
Arabidopsis/genetics, Plant Genome, MicroRNAs/biosynthesis, Plant RNA/biosynthesis, Small Interfering RNA/biosynthesis, Arabidopsis/metabolism, Chromosome Mapping, Plant Gene Expression Regulation, MicroRNAs/chemistry, MicroRNAs/genetics, Plant RNA/chemistry, Plant RNA/genetics, Small Interfering RNA/chemistry, Small Interfering RNA/genetics, RNA Sequence Analysis, Genetic Transcription
14.
Genome Res ; 15(7): 1007-14, 2005 Jul.
Article in English | MEDLINE | ID: mdl-15998913

ABSTRACT

We have used massively parallel signature sequencing (MPSS) to sample the transcriptomes of 32 normal human tissues to an unprecedented depth, thus documenting the patterns of expression of almost 20,000 genes with high sensitivity and specificity. The data confirm the widely held belief that differences in gene expression between cell and tissue types are largely determined by transcripts derived from a limited number of tissue-specific genes, rather than by combinations of more promiscuously expressed genes. Expression of a little more than half of all known human genes seems to account for both the common requirements and the specific functions of the tissues sampled. A classification of tissues based on patterns of gene expression largely reproduces classifications based on anatomical and biochemical properties. The unbiased sampling of the human transcriptome achieved by MPSS supports the idea that most human genes have been mapped, if not functionally characterized. This data set should prove useful for the identification of tissue-specific genes, for the study of global changes induced by pathological conditions, and for the definition of a minimal set of genes necessary for basic cell maintenance. The data are available on the Web at http://mpss.licr.org and http://sgb.lynxgen.com.


Subjects
Gene Expression, Algorithms, Expressed Sequence Tags, Gene Expression Profiling/methods, Genetic Techniques, Humans, Organ Specificity, Messenger RNA/genetics
15.
Genome Res ; 14(8): 1641-53, 2004 Aug.
Article in English | MEDLINE | ID: mdl-15289482

ABSTRACT

We have generated 36,991,173 17-base sequence "signatures" representing transcripts from the model plant Arabidopsis. These data were derived by massively parallel signature sequencing (MPSS) from 14 libraries and comprised 268,132 distinct sequences. Comparable data were also obtained with 20-base signatures. We developed a method for handling these data and for comparing these signatures to the annotated Arabidopsis genome. As part of this procedure, 858,019 potential or "genomic" signatures were extracted from the Arabidopsis genome and classified based on the position and orientation of the signatures relative to annotated genes. A comparison of genomic and expressed signatures matched 67,735 signatures predicted to be derived from distinct transcripts and expressed at significant levels. Expressed signatures were derived from the sense strand of at least 19,088 of 29,084 annotated genes. A comparison of the genomic and expression signatures demonstrated that approximately 7.7% of genomic signatures were underrepresented in the expression data. These genomic signatures contained one of 20 four-base words that were consistently associated with reduced MPSS abundances. More than 89% of the sum of the expressed signature abundances matched the Arabidopsis genome, and many of the unmatched signatures found in high abundances were predicted to match to previously uncharacterized transcripts.


Subjects
Arabidopsis/genetics, Gene Expression Profiling/methods, Plant Genome, Genetic Transcription, Base Sequence, Computational Biology, Expressed Sequence Tags, Genomic Library, Messenger RNA/genetics
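
The extraction of potential "genomic" signatures described in the abstract above can be sketched as a scan of both strands for an anchoring site, keeping the fixed-length sequence that starts there together with its position and orientation. The anchoring site `GATC` is an assumption (the abstract does not name the site), and the function names are hypothetical; only the 17-base length comes from the abstract.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def genomic_signatures(genome, site="GATC", length=17):
    """Return (position, strand, signature) for every full-length signature on both strands.

    The default site is an assumed anchoring sequence, not taken from the abstract.
    Positions are 0-based leftmost coordinates on the forward strand.
    """
    found = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        start = seq.find(site)
        while start != -1:
            sig = seq[start : start + length]
            if len(sig) == length:  # skip sites too close to the sequence end
                pos = start if strand == "+" else len(genome) - start - length
                found.append((pos, strand, sig))
            start = seq.find(site, start + 1)
    return found

# Hypothetical 21-base sequence containing one site with a full-length signature.
toy_genome = "TTGATC" + "A" * 13 + "GG"
signatures = genomic_signatures(toy_genome)
```

Classifying each signature relative to annotated genes, as the abstract describes, would then be a matter of comparing `position` and `strand` against gene coordinates.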