Results 1 - 12 of 12
1.
Nature ; 470(7332): 59-65, 2011 Feb 03.
Article in English | MEDLINE | ID: mdl-21293372

ABSTRACT

Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.


Subjects
DNA Copy Number Variations/genetics; Genetics, Population; Genome, Human/genetics; Genomics; Gene Duplication/genetics; Genetic Predisposition to Disease/genetics; Genotype; Humans; Mutagenesis, Insertional/genetics; Reproducibility of Results; Sequence Analysis, DNA; Sequence Deletion/genetics
2.
Nature ; 463(7278): 184-90, 2010 Jan 14.
Article in English | MEDLINE | ID: mdl-20016488

ABSTRACT

Cancer is driven by mutation. Worldwide, tobacco smoking is the principal lifestyle exposure that causes cancer, exerting carcinogenicity through >60 chemicals that bind and mutate DNA. Using massively parallel sequencing technology, we sequenced a small-cell lung cancer cell line, NCI-H209, to explore the mutational burden associated with tobacco smoking. A total of 22,910 somatic substitutions were identified, including 134 in coding exons. Multiple mutation signatures testify to the cocktail of carcinogens in tobacco smoke and their proclivities for particular bases and surrounding sequence context. Effects of transcription-coupled repair and a second, more general, expression-linked repair pathway were evident. We identified a tandem duplication that duplicates exons 3-8 of CHD7 in frame, and another two lines carrying PVT1-CHD7 fusion genes, indicating that CHD7 may be recurrently rearranged in this disease. These findings illustrate the potential for next-generation sequencing to provide unprecedented insights into mutational processes, cellular repair pathways and gene networks associated with cancer.


Subjects
Lung Neoplasms/etiology; Lung Neoplasms/genetics; Mutation/genetics; Nicotiana/adverse effects; Small Cell Lung Carcinoma/etiology; Small Cell Lung Carcinoma/genetics; Smoking/adverse effects; Carcinogens/toxicity; Cell Line, Tumor; DNA Copy Number Variations/drug effects; DNA Copy Number Variations/genetics; DNA Damage/genetics; DNA Helicases/genetics; DNA Mutational Analysis; DNA Repair/genetics; DNA-Binding Proteins/genetics; Exons/genetics; Gene Expression Regulation, Neoplastic/drug effects; Genome, Human/drug effects; Genome, Human/genetics; Humans; Mutagenesis, Insertional/drug effects; Mutagenesis, Insertional/genetics; Mutation/drug effects; Promoter Regions, Genetic/genetics; Sequence Deletion/genetics
3.
Am J Hum Genet ; 91(4): 660-71, 2012 Oct 05.
Article in English | MEDLINE | ID: mdl-23040495

ABSTRACT

Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago.


Subjects
Genome, Human; Haplotypes/genetics; Population/genetics; Racial Groups/genetics; Genetics, Population/methods; Heterozygote; Humans; Polymorphism, Single Nucleotide
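As a quick check of the arithmetic in the abstract above, the reported 1.75-fold range follows from the two extreme heterozygosity values it quotes (one heterozygous site per kilobase versus 0.57 sites per kilobase). A minimal sketch; the variable names are illustrative and do not come from the paper:

```python
# Heterozygous sites per kilobase, as quoted in the abstract.
high_het_per_kb = 1.0   # west African genomes
low_het_per_kb = 0.57   # segments with diploid Native American ancestry

# Ratio of the highest to the lowest heterozygosity.
fold_range = high_het_per_kb / low_het_per_kb
print(f"fold range = {fold_range:.2f}")
```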
4.
Genome Res ; 20(7): 972-80, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20488932

ABSTRACT

Abnormalities of genomic methylation patterns are lethal or cause disease, but the cues that normally designate CpG dinucleotides for methylation are poorly understood. We have developed a new method of methylation profiling that has single-CpG resolution and can address the methylation status of repeated sequences. We have used this method to determine the methylation status of >275 million CpG sites in human and mouse DNA from breast and brain tissues. Methylation density at most sequences was found to increase linearly with CpG density and to fall sharply at very high CpG densities, but transposons remained densely methylated even at higher CpG densities. The presence of histone H2A.Z and histone H3 di- or trimethylated at lysine 4 correlated strongly with unmethylated DNA and occurred primarily at promoter regions. We conclude that methylation is the default state of most CpG dinucleotides in the mammalian genome and that a combination of local dinucleotide frequencies, the interaction of repeated sequences, and the presence or absence of histone variants or modifications shields a population of CpG sites (most of which are in and around promoters) from DNA methyltransferases that lack intrinsic sequence specificity.


Subjects
Base Sequence/physiology; Chromatin/chemistry; Chromatin/physiology; DNA Methylation; Animals; Brain/metabolism; Breast/metabolism; Chromatin/genetics; Chromosome Mapping; CpG Islands/genetics; Female; Genome; Histones/metabolism; Humans; Mice; Sequence Analysis, DNA; Validation Studies as Topic
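The abstract above relates methylation density to local CpG density. As an illustration only, CpG density for a sequence window could be counted as below. The per-100-bp normalization, the function name `cpg_density`, and the toy sequence are assumptions made for this sketch; they are not the paper's profiling method.

```python
def cpg_density(seq):
    """Return the number of CpG dinucleotides per 100 bp of seq."""
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every CpG.
    return 100.0 * seq.count("CG") / len(seq)


# Hypothetical 20 bp window containing four CpG dinucleotides.
toy_window = "ACGTCGCGTTAACCGGTTAA"
print(cpg_density(toy_window))
```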
5.
Genome Res ; 19(9): 1527-41, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19546169

ABSTRACT

We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.


Subjects
Base Pairing; Computational Biology/methods; Genetic Variation; Genome, Human; Ligases; Sequence Analysis, DNA/methods; Africa; Base Sequence; Genomics; Genotype; Heterozygote; Homozygote; Humans; Polymorphism, Single Nucleotide; Reference Standards
6.
Nat Methods ; 5(7): 613-9, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18516046

ABSTRACT

We developed a massive-scale RNA sequencing protocol, short quantitative random RNA libraries or SQRL, to survey the complexity, dynamics and sequence content of transcriptomes in a near-complete fashion. This method generates directional, random-primed, linear cDNA libraries that are optimized for next-generation short-tag sequencing. We surveyed the poly(A)+ transcriptomes of undifferentiated mouse embryonic stem cells (ESCs) and embryoid bodies (EBs) at an unprecedented depth (10 Gb), using the Applied Biosystems SOLiD technology. These libraries capture the genomic landscape of expression, state-specific expression, single-nucleotide polymorphisms (SNPs), the transcriptional activity of repeat elements, and both known and new alternative splicing events. We investigated the impact of transcriptional complexity on current models of key signaling pathways controlling ESC pluripotency and differentiation, highlighting how SQRL can be used to characterize transcriptome content and dynamics in a quantitative and reproducible manner, and suggesting that our understanding of transcriptional complexity is far from complete.


Subjects
Embryonic Stem Cells/metabolism; Gene Expression Profiling/methods; RNA, Messenger/genetics; Sequence Analysis, RNA/methods; Animals; Cell Differentiation; Embryonic Stem Cells/cytology; Expressed Sequence Tags; Gene Expression Profiling/statistics & numerical data; Gene Library; Mice; Pluripotent Stem Cells/cytology; Pluripotent Stem Cells/metabolism; Polymorphism, Single Nucleotide; Sensitivity and Specificity; Signal Transduction
7.
J Thorac Oncol ; 9(4): 563-6, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24736082

ABSTRACT

INTRODUCTION: Anaplastic lymphoma kinase (ALK) fusion is the most common mechanism for overexpression and activation in non-small-cell lung carcinoma. Several fusion partners of ALK have been reported, including echinoderm microtubule-associated protein-like 4, TRK-fused gene, kinesin family member 5B, kinesin light chain 1 (KLC1), protein tyrosine phosphatase and nonreceptor type 3, and huntingtin interacting protein 1 (HIP1). METHODS AND RESULTS: A 60-year-old Korean man had a lung mass which was a poorly differentiated adenocarcinoma with ALK overexpression. By using an Anchored Multiplex polymerase chain reaction assay and sequencing, we found that the tumor had a novel translocated promoter region (TPR)-ALK fusion. The fusion transcript was generated from an intact, in-frame fusion of TPR exon 15 and ALK exon 20 (t(1;2)(q31.1;p23)). The TPR-ALK fusion encodes a predicted protein of 1192 amino acids with a coiled-coil domain encoded by the 5'-end of the TPR and juxtamembrane and kinase domains encoded by the 3'-end of the ALK. CONCLUSIONS: The novel fusion gene and its protein TPR-ALK, harboring coiled-coil and kinase domains, could possess transforming potential and responses to treatment with ALK inhibitors. This case is the first report of TPR-ALK fusion transcript in clinical tumor samples and could provide a novel diagnostic and therapeutic candidate target for patients with cancer, including non-small-cell lung carcinoma.


Subjects
Adenocarcinoma/genetics; Gene Rearrangement; Lung Neoplasms/genetics; Nuclear Pore Complex Proteins/genetics; Oncogene Proteins, Fusion/genetics; Proto-Oncogene Proteins/genetics; Receptor Protein-Tyrosine Kinases/genetics; Translocation, Genetic/genetics; Adenocarcinoma/drug therapy; Adenocarcinoma/pathology; Anaplastic Lymphoma Kinase; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Exons/genetics; Humans; Kinesins; Lung Neoplasms/drug therapy; Lung Neoplasms/pathology; Male; Middle Aged; Polymerase Chain Reaction; Prognosis
8.
Biopolymers ; 95(4): 254-69, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21280021

ABSTRACT

The growing numbers of very well resolved nucleic-acid crystal structures with anisotropic displacement parameters provide an unprecedented opportunity to learn about the natural motions of DNA and RNA. Here we report a new Monte-Carlo approach that takes direct account of this information to extract the distortions of covalent structure, base pairing, and dinucleotide geometry intrinsic to regularly organized double-helical molecules. We present new methods to test the validity of the anisotropic parameters and examine the apparent deformability of a variety of structures, including several A, B, and Z DNA duplexes, an AB helical intermediate, an RNA, a ligand-DNA complex, and an enzyme-bound DNA. The rigid-body parameters characterizing the positions of the bases in the structures mirror the mean parameters found when atomic motion is taken into account. The base-pair fluctuations intrinsic to a single structure, however, differ from those extracted from collections of nucleic-acid structures, although selected base-pair steps undergo conformational excursions along routes suggested by the ensembles. The computations reveal surprising new molecular insights, such as the stiffening of DNA and concomitant separation of motions of contacted nucleotides on opposite strands by the binding of Escherichia coli endonuclease VIII, which suggest how the protein may direct enzymatic action.


Subjects
DNA, A-Form/chemistry; DNA, Z-Form/chemistry; DNA/chemistry; Nucleic Acid Conformation; RNA/chemistry; Anisotropy; Base Pairing; Computer Simulation; DNA-Binding Proteins/chemistry; Models, Molecular; Motion; Nucleotides/chemistry; Protein Binding
9.
Sci Transl Med ; 3(65): 65ra4, 2011 Jan 12.
Article in English | MEDLINE | ID: mdl-21228398

ABSTRACT

Of 7028 disorders with suspected Mendelian inheritance, 1139 are recessive and have an established molecular basis. Although individually uncommon, Mendelian diseases collectively account for ~20% of infant mortality and ~10% of pediatric hospitalizations. Preconception screening, together with genetic counseling of carriers, has resulted in remarkable declines in the incidence of several severe recessive diseases including Tay-Sachs disease and cystic fibrosis. However, extension of preconception screening to most severe disease genes has hitherto been impractical. Here, we report a preconception carrier screen for 448 severe recessive childhood diseases. Rather than costly, complete sequencing of the human genome, 7717 regions from 437 target genes were enriched by hybrid capture or microdroplet polymerase chain reaction, sequenced by next-generation sequencing (NGS) to a depth of up to 2.7 gigabases, and assessed with stringent bioinformatic filters. At a resultant 160x average target coverage, 93% of nucleotides had at least 20x coverage, and mutation detection/genotyping had ~95% sensitivity and ~100% specificity for substitution, insertion/deletion, splicing, and gross deletion mutations and single-nucleotide polymorphisms. In 104 unrelated DNA samples, the average genomic carrier burden for severe pediatric recessive mutations was 2.8 and ranged from 0 to 7. The distribution of mutations among sequenced samples appeared random. Twenty-seven percent of mutations cited in the literature were found to be common polymorphisms or misannotated, underscoring the need for better mutation databases as part of a comprehensive carrier testing strategy. Given the magnitude of carrier burden and the lower cost of testing compared to treating these conditions, carrier screening by NGS made available to the general population may be an economical way to reduce the incidence of and ameliorate suffering associated with severe recessive childhood disorders.


Subjects
Genes, Recessive/genetics; Genetic Carrier Screening/methods; Genetic Testing/methods; Sequence Analysis, DNA/methods; Base Sequence; Child; Databases, Genetic; Female; Genetic Testing/economics; Genome, Human; Heterozygote; Humans; Molecular Sequence Data; Mutation; Pregnancy; Prenatal Diagnosis; Sequence Alignment; Sequence Analysis, DNA/economics
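The abstract above reports coverage in two ways: an average target coverage (160x) and the share of nucleotides covered at least 20x (93%). A minimal sketch of how both figures could be derived from per-base depths follows. The function name `coverage_summary` and the toy depth values are hypothetical and are not data from the study.

```python
def coverage_summary(depths, min_depth=20):
    """Return (mean depth, fraction of positions with depth >= min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    breadth = sum(1 for d in depths if d >= min_depth) / len(depths)
    return mean_depth, breadth


# Hypothetical per-base read depths for eight target positions.
toy_depths = [0, 5, 20, 35, 40, 100, 160, 240]
mean_depth, breadth = coverage_summary(toy_depths)
print(f"mean depth = {mean_depth:.1f}x, fraction >= 20x = {breadth:.2f}")
```

The two numbers answer different questions: a high mean depth can coexist with poorly covered positions, which is why the fraction above a threshold is reported alongside it.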
10.
PLoS One ; 5(2): e9320, 2010 Feb 22.
Article in English | MEDLINE | ID: mdl-20179767

ABSTRACT

Methylation, the addition of methyl groups to cytosine (C), plays an important role in the regulation of gene expression in both normal and dysfunctional cells. During bisulfite conversion and subsequent PCR amplification, unmethylated Cs are converted into thymine (T), while methylated Cs will not be converted. Sequencing of this bisulfite-treated DNA permits the detection of methylation at specific sites. Through the introduction of next-generation sequencing technologies (NGS) simultaneous analysis of methylation motifs in multiple regions provides the opportunity for hypothesis-free study of the entire methylome. Here we present a whole methylome sequencing study that compares two different bisulfite conversion methods (in solution versus in gel), utilizing the high throughput of the SOLiD System. Advantages and disadvantages of the two different bisulfite conversion methods for constructing sequencing libraries are discussed. Furthermore, the application of the SOLiD bisulfite sequencing to larger and more complex genomes is shown with preliminary in silico created bisulfite converted reads.


Subjects
DNA Methylation; Genome, Human/genetics; Sequence Analysis, DNA/methods; Base Sequence; Binding Sites/genetics; DNA/chemistry; DNA/genetics; Electrophoresis, Polyacrylamide Gel/methods; Genomic Library; Humans; Molecular Sequence Data; Polymerase Chain Reaction; Sequence Homology, Nucleic Acid; Sulfites/chemistry
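The conversion rule stated in the abstract above (unmethylated C reads as T after bisulfite treatment and PCR, while methylated C stays C) can be sketched in silico as follows. This is a simplified illustration that assumes a single strand, complete conversion, and no sequencing errors, and it does not model SOLiD color-space reads. The function names and the toy sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the read expected after bisulfite treatment and PCR.

    seq: DNA string (one strand only, for simplicity).
    methylated_positions: set of 0-based indices of methylated cytosines.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")   # unmethylated C is read as T
        else:
            out.append(base)  # methylated C and all other bases are unchanged
    return "".join(out)


def call_methylation(reference, converted):
    """Map each reference cytosine index to True if it appears methylated."""
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference.upper(), converted)):
        if ref == "C":
            calls[i] = (obs == "C")
    return calls


reference = "ACGTCCGA"
converted = bisulfite_convert(reference, methylated_positions={1})
print(converted)
print(call_methylation(reference, converted))
```

Comparing the converted read with the untreated reference is what allows methylation to be detected at specific sites: a C that survives conversion is called methylated, and a C that reads as T is called unmethylated.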
11.
Genome Res ; 18(10): 1638-42, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18775913

ABSTRACT

Forward genetic mutational studies, adaptive evolution, and phenotypic screening are powerful tools for creating new variant organisms with desirable traits. However, mutations generated in the process cannot be easily identified with traditional genetic tools. We show that new high-throughput, massively parallel sequencing technologies can completely and accurately characterize a mutant genome relative to a previously sequenced parental (reference) strain. We studied a mutant strain of Pichia stipitis, a yeast capable of converting xylose to ethanol. This unusually efficient mutant strain was developed through repeated rounds of chemical mutagenesis, strain selection, transformation, and genetic manipulation over a period of seven years. We resequenced this strain on three different sequencing platforms. Surprisingly, we found fewer than a dozen mutations in open reading frames. All three sequencing technologies were able to identify each single nucleotide mutation given at least 10-15-fold nominal sequence coverage. Our results show that detecting mutations in evolved and engineered organisms is rapid and cost-effective at the whole-genome level using new sequencing technologies. Identification of specific mutations in strains with altered phenotypes will add insight into specific gene functions and guide further metabolic engineering efforts.


Subjects
DNA Mutational Analysis/methods; Genome, Fungal; Mutation; Pichia/genetics; Sequence Alignment; Sequence Analysis, DNA
12.
Genome Res ; 17(8): 1170-7, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17620451

ABSTRACT

Although histones can form nucleosomes on virtually any genomic sequence, DNA sequences show considerable variability in their binding affinity. We have used DNA sequences of Saccharomyces cerevisiae whose nucleosome binding affinities have been experimentally determined (Yuan et al. 2005) to train a support vector machine to identify the nucleosome formation potential of any given sequence of DNA. The DNA sequences whose nucleosome formation potential is most accurately predicted are those that contain strong nucleosome forming or inhibiting signals and are found within nucleosome length stretches of genomic DNA with continuous nucleosome formation or inhibition signals. We have accurately predicted the experimentally determined nucleosome positions across a well-characterized promoter region of S. cerevisiae and identified strong periodicity within 199 center-aligned mononucleosomes studied recently (Segal et al. 2006) despite there being no periodicity information used to train the support vector machine. Our analysis suggests that only a subset of nucleosomes are likely to be positioned by intrinsic sequence signals. This observation is consistent with the available experimental data and is inconsistent with the proposal of a nucleosome positioning code. Finally, we show that intrinsic nucleosome positioning signals are both more inhibitory and more variable in promoter regions than in open reading frames in S. cerevisiae.


Subjects
DNA, Fungal/chemistry; Genome, Fungal; Nucleosomes/genetics; DNA, Fungal/metabolism; Markov Chains; Nucleosomes/metabolism; Promoter Regions, Genetic; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism