ABSTRACT
Cryptococcus neoformans is a pathogenic basidiomycetous yeast responsible for more than 600,000 deaths each year. It occurs as two serotypes (A and D) representing two varieties (i.e. grubii and neoformans, respectively). Here, we sequenced the genome and performed an RNA-Seq-based analysis of the C. neoformans var. grubii transcriptome structure. We determined the chromosomal locations, analyzed the sequence/structural features of the centromeres, and identified origins of replication. The genome was annotated based on automated and manual curation. More than 40,000 introns populating more than 99% of the expressed genes were identified. Although most of these introns are located in the coding DNA sequences (CDS), over 2,000 introns in the untranslated regions (UTRs) were also identified. Poly(A)-containing reads were employed to locate the polyadenylation sites of more than 80% of the genes. Examination of the sequences around these sites revealed a new poly(A)-site-associated motif (AUGHAH). In addition, 1,197 miscRNAs were identified. These miscRNAs can be spliced and/or polyadenylated, but do not appear to have obvious coding capacities. Finally, this genome sequence enabled a comparative analysis of strain H99 variants obtained after laboratory passage. The spectrum of mutations identified provides insights into the genetics underlying the micro-evolution of a laboratory strain, and identifies mutations involved in stress responses, mating efficiency, and virulence.
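The poly(A)-site-associated motif reported above, AUGHAH, is written with IUPAC nucleotide codes, in which H stands for A, C, or U (any base except G). As a minimal sketch of how such a degenerate motif could be located in a transcript, the following translates the motif into a regular expression. The function names and the example sequence are hypothetical and are not taken from the study.

```python
import re

# IUPAC nucleotide codes, written for RNA (U instead of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "S": "[CG]", "W": "[AU]",
    "K": "[GU]", "M": "[AC]", "B": "[CGU]", "D": "[AGU]",
    "H": "[ACU]", "V": "[ACG]", "N": "[ACGU]",
}

def motif_to_regex(motif):
    """Translate an IUPAC motif such as AUGHAH into a compiled regex."""
    body = "".join(IUPAC_RNA[base] for base in motif.upper())
    # The lookahead lets finditer report overlapping occurrences too.
    return re.compile("(?=(" + body + "))")

def find_motif(sequence, motif):
    """Return (0-based position, matched text) for every occurrence."""
    rna = sequence.upper().replace("T", "U")  # accept DNA or RNA input
    return [(m.start(), m.group(1)) for m in motif_to_regex(motif).finditer(rna)]

# Hypothetical sequence, for illustration only.
example = "CCAUGUAAGGAUGGAUCAUGCACUU"
hits = find_motif(example, "AUGHAH")
```

In this toy sequence the AUG at position 10 is followed by G, which H excludes, so only the occurrences at positions 2 and 17 are reported.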
Subjects
Cryptococcus neoformans/genetics , Fungal Genome/genetics , Fungal RNA/genetics , Transcriptome/genetics , Virulence/genetics , Fungal Chromosomes/genetics , Fungal DNA/genetics , Introns/genetics
ABSTRACT
BACKGROUND: Anaplastic thyroid carcinoma is the most undifferentiated form of thyroid cancer and one of the deadliest of all adult solid malignancies. Here we report the first genomic and transcriptomic profile of anaplastic thyroid cancer including those of several unique cell lines and outline novel potential drivers of malignancy and targets of therapy. METHODS: We describe whole genomic and transcriptomic profiles of 1 primary anaplastic thyroid tumor and 3 authenticated cell lines. Those profiles augmented by the transcriptomes of 4 additional and unique cell lines were compared to 58 pairs of papillary thyroid carcinoma and matched normal tissue transcriptomes from The Cancer Genome Atlas study. RESULTS: The most prevalent mutations were those of TP53 and BRAF; repeated alterations of the epigenetic machinery such as frame-shift deletions of HDAC10 and EP300, loss of SMARCA2 and fusions of MECP2, BCL11A and SS18 were observed. Sequence data displayed aneuploidy and large regions of copy loss and gain in all genomes. Common regions of gain were however evident encompassing chromosomes 5p and 20q. We found novel anaplastic gene fusions including MKRN1-BRAF, FGFR2-OGDH and SS18-SLC5A11, all expressed in-frame fusions involving a known proto-oncogene. Comparison of the anaplastic thyroid cancer expression datasets with the papillary thyroid cancer and normal thyroid tissue transcriptomes suggested several known drug targets such as FGFRs, VEGFRs, KIT and RET to have lower expression levels in anaplastic specimens compared with both papillary thyroid cancers and normal tissues, confirming the observed lack of response to therapies targeting these pathways. Further integrative data analysis identified the mTOR signaling pathway as a potential therapeutic target in this disease. 
CONCLUSIONS: Anaplastic thyroid carcinoma possessed heterogeneous and unique profiles, revealing the significance of detailed molecular profiling of individual tumors and of treating each as a unique entity; the cell line sequence data promise to facilitate more accurate and intentional drug screening studies for anaplastic thyroid cancer.
Subjects
Carcinoma/genetics , Gene Expression Profiling/methods , Genomics/methods , Anaplastic Thyroid Carcinoma/genetics , Thyroid Neoplasms/genetics , Carcinoma/drug therapy , Papillary Carcinoma , Tumor Cell Line , Genetic Epigenesis , Neoplastic Gene Expression Regulation , Genetic Heterogeneity , Genetic Variation , Humans , Male , Middle Aged , Molecular Targeted Therapy , Mutation , Proto-Oncogene Mas , DNA Sequence Analysis , Papillary Thyroid Cancer , Anaplastic Thyroid Carcinoma/drug therapy , Thyroid Neoplasms/drug therapy
ABSTRACT
Extraordinary advancements in sequencing technology have made what was once a decade-long multi-institutional endeavor into a methodology with the potential for practical use in a clinical setting. We therefore set out to examine the clinical value of next-generation sequencing by enrolling patients with incurable or ambiguous tumors into the Personalized OncoGenomics initiative at the British Columbia Cancer Agency whereby whole genome and transcriptome analyses of tumor/normal tissue pairs are completed with the ultimate goal of directing therapeutics. First, we established that the sequencing, analysis, and communication with oncologists could be completed in less than 5 weeks. Second, we found that cancer diagnostics is an area that can greatly benefit from the comprehensiveness of a whole genome analysis. Here, we present a scenario in which a metastasized sphenoid mass, which was initially thought of as an undifferentiated squamous cell carcinoma, was rediagnosed as an SMARCB1-negative rhabdoid tumor based on the newly acquired finding of homozygous SMARCB1 deletion. The new diagnosis led to a change in chemotherapy and a complete nodal response in the patient. This study also provides additional insight into the mutational landscape of an adult SMARCB1-negative tumor that has not been explored at a whole genome and transcriptome level.
Subjects
Squamous Cell Carcinoma/genetics , Non-Histone Chromosomal Proteins/genetics , DNA-Binding Proteins/genetics , High-Throughput Nucleotide Sequencing , Rhabdoid Tumor/genetics , Transcription Factors/genetics , Adult , Tumor Biomarkers/genetics , Squamous Cell Carcinoma/drug therapy , Squamous Cell Carcinoma/enzymology , DNA Mutational Analysis , Gene Expression Profiling , Human Genome , Humans , Male , Rhabdoid Tumor/drug therapy , Rhabdoid Tumor/pathology , SMARCB1 Protein
ABSTRACT
Rust fungi are some of the most devastating pathogens of crop plants. They are obligate biotrophs, which extract nutrients only from living plant tissues and cannot grow apart from their hosts. Their lifestyle has slowed the dissection of molecular mechanisms underlying host invasion and avoidance or suppression of plant innate immunity. We sequenced the 101-Mb genome of Melampsora larici-populina, the causal agent of poplar leaf rust, and the 89-Mb genome of Puccinia graminis f. sp. tritici, the causal agent of wheat and barley stem rust. We then compared the 16,399 predicted proteins of M. larici-populina with the 17,773 predicted proteins of P. graminis f. sp. tritici. Genomic features related to their obligate biotrophic lifestyle include expanded lineage-specific gene families, a large repertoire of effector-like small secreted proteins, impaired nitrogen and sulfur assimilation pathways, and expanded families of amino acid and oligopeptide membrane transporters. The dramatic up-regulation of transcripts coding for small secreted proteins, secreted hydrolytic enzymes, and transporters in planta suggests that they play a role in host infection and nutrient acquisition. Some of these genomic hallmarks are mirrored in the genomes of other microbial eukaryotes that have independently evolved to infect plants, indicating convergent adaptation to a biotrophic existence inside plant cells.
Subjects
Basidiomycota/genetics , Fungi/genetics , Triticum/microbiology , Gene Expression Profiling , Fungal Genes , Genome , Fungal Genome , Genetic Models , Nitrates/chemistry , Oligonucleotide Array Sequence Analysis , Phylogeny , Plant Diseases/microbiology , Plant Leaves/microbiology , DNA Sequence Analysis , Sulfates/chemistry
ABSTRACT
We report a high-quality draft of the genome sequence of the grey, short-tailed opossum (Monodelphis domestica). As the first metatherian ('marsupial') species to be sequenced, the opossum provides a unique perspective on the organization and evolution of mammalian genomes. Distinctive features of the opossum chromosomes provide support for recent theories about genome evolution and function, including a strong influence of biased gene conversion on nucleotide sequence composition, and a relationship between chromosomal characteristics and X chromosome inactivation. Comparison of opossum and eutherian genomes also reveals a sharp difference in evolutionary innovation between protein-coding and non-coding functional elements. True innovation in protein-coding genes seems to be relatively rare, with lineage-specific differences being largely due to diversification and rapid turnover in gene families involved in environmental interactions. In contrast, about 20% of eutherian conserved non-coding elements (CNEs) are recent inventions that postdate the divergence of Eutheria and Metatheria. A substantial proportion of these eutherian-specific CNEs arose from sequence inserted by transposable elements, pointing to transposons as a major creative force in the evolution of mammalian gene regulation.
Subjects
Molecular Evolution , Genome/genetics , Genomics , Opossums/genetics , Animals , Base Composition , Conserved Sequence/genetics , DNA Transposable Elements/genetics , Humans , Single Nucleotide Polymorphism/genetics , Protein Biosynthesis , Synteny/genetics , X Chromosome Inactivation/genetics
ABSTRACT
MOTIVATION: Whole transcriptome shotgun sequencing data from non-normalized samples offer unique opportunities to study the metabolic states of organisms. One can deduce gene expression levels using sequence coverage as a surrogate, identify coding changes or discover novel isoforms or transcripts. Especially for discovery of novel events, de novo assembly of transcriptomes is desirable. RESULTS: Transcriptome from tumor tissue of a patient with follicular lymphoma was sequenced with 36 base pair (bp) single- and paired-end reads on the Illumina Genome Analyzer II platform. We assembled approximately 194 million reads using ABySS into 66 921 contigs 100 bp or longer, with a maximum contig length of 10 951 bp, representing over 30 million base pairs of unique transcriptome sequence, or roughly 1% of the genome. AVAILABILITY AND IMPLEMENTATION: Source code and binaries of ABySS are freely available for download at http://www.bcgsc.ca/platform/bioinfo/software/abyss. Assembler tool is implemented in C++. The parallel version uses Open MPI. ABySS-Explorer tool is implemented in Java using the Java universal network/graph framework. CONTACT: ibirol@bcgsc.ca.
Subjects
Computational Biology/methods , Gene Expression Profiling , Software , Genetic Databases , Genome , DNA Sequence Analysis
ABSTRACT
We report on a 6-year-old boy referred for cytogenetics study. A few non-specific features were observed in the newborn: hypotonia, failure to thrive, seizures, pre-auricular skin tags. Cat-like cry was not identified. No remarkable facial dysmorphism, gastrointestinal, respiratory or cardiac abnormalities were identified. At age 4 years, speech and motor skill delays were apparent. Karyotyping and FISH analysis revealed a de novo rearranged chromosome 5p, with subtelomeric deletion of 5p and a duplication of the cri-du-chat critical region. Array CGH using sub-megabase resolution tiling-set (SMRT) array followed by FISH analysis with labeled BACs showed a deletion of 5pter to 5p15.31 (0-6.9 Mb) and an inverted duplication of the greater part of 5p15.31 to the distal end of 5p14.3 (6.9-19.9 Mb). Although very rare, inverted duplications with terminal deletion (inv dup del) have been reported at different chromosomal ends. Our finding adds a second patient of inv dup del 5p to this growing list, and the potential causative mechanisms for this rearrangement are discussed. Review of the mapping information of cri-du-chat patients and the comparison with a previously reported patient suggested that the critical region for cat-like cry is located within a 0.6 Mb region.
Subjects
Chromosome Aberrations , Chromosome Deletion , Chromosome Inversion , Human Chromosome Pair 5/genetics , Child , Bacterial Artificial Chromosomes , Craniofacial Abnormalities/genetics , Cri-du-Chat Syndrome/genetics , Developmental Disabilities/genetics , Genotype , Humans , Fluorescence In Situ Hybridization , Karyotyping , Male , Phenotype
ABSTRACT
BACKGROUND: Prostate cancer is the most frequently diagnosed cancer in American men, and few effective treatment options are available to patients who develop hormone-refractory prostate cancer. The molecular changes that occur to allow prostate cells to proliferate in the absence of androgens are not fully understood. RESULTS: Subtractive hybridization experiments performed with samples from an in vivo model of hormonal progression identified 25 expressed sequences representing novel human transcripts. Intriguingly, these 25 sequences have small open-reading frames and are not highly conserved through evolution, suggesting many of these novel expressed sequences may be derived from untranslated regions of novel transcripts or from non-coding transcripts. Examination of a large metalibrary of human Serial Analysis of Gene Expression (SAGE) tags demonstrated that only three of these novel sequences had been previously detected. RT-PCR experiments confirmed that the 6 sequences tested were expressed in specific human tissues, as well as in clinical samples of prostate cancer. Further RT-PCR experiments for five of these fragments indicated they originated from large untranslated regions of unannotated transcripts. CONCLUSION: This study underlines the value of using complementary techniques in the annotation of the human genome. The tissue-specific expression of 4 of the 6 clones tested indicates the expression of these novel transcripts is tightly regulated, and future work will determine the possible role(s) these novel transcripts may play in the progression of prostate cancer.
Subjects
Antineoplastic Drug Resistance/genetics , Neoplasm Genes , Prostatic Neoplasms/metabolism , Androgens/pharmacology , Animals , Neoplastic Gene Expression Regulation , Gene Library , Humans , Male , Mice , Inbred BALB C Mice , Nude Mice , Molecular Sequence Data , Prostatic Neoplasms/genetics , Messenger RNA/metabolism , DNA Sequence Analysis , Cultured Tumor Cells , Untranslated Regions
ABSTRACT
The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found strong evidence for 1,300 new C. elegans genes. In addition, comparisons of the two genomes will help to understand the evolutionary forces that mold nematode genomes.
Subjects
Caenorhabditis elegans/genetics , Caenorhabditis/genetics , Genome , Genomics/methods , Animals , Biological Evolution , Chromosome Mapping , Bacterial Artificial Chromosomes , Cluster Analysis , Codon , Conserved Sequence , Molecular Evolution , Exons , Gene Library , Interspersed Repetitive Sequences , Introns , MicroRNAs/genetics , Genetic Models , Statistical Models , Molecular Sequence Data , Multigene Family , Open Reading Frames , Physical Chromosome Mapping , Plasmids/metabolism , Tertiary Protein Structure , Proteins/chemistry , RNA/chemistry , Ribosomal RNA/genetics , Spliced Leader RNA , Transfer RNA/genetics , DNA Sequence Analysis , Species Specificity
ABSTRACT
Purpose: Recent studies have identified mutation signatures of homologous recombination deficiency (HRD) in over 20% of breast cancers, as well as pancreatic, ovarian, and gastric cancers. There is an urgent need to understand the clinical implications of HRD signatures. Whereas BRCA1/2 mutations confer sensitivity to platinum-based chemotherapies, it is not yet clear whether mutation signatures can independently predict platinum response. Experimental Design: In this observational study, we sequenced tumor whole genomes (100× depth) and matched normals (60×) of 93 advanced-stage breast cancers (33 platinum-treated). We computed a published metric called HRDetect, independently trained to predict BRCA1/2 status, and assessed its capacity to predict outcomes on platinum-based chemotherapies. Clinical endpoints were overall survival (OS), total duration on platinum-based therapy (TDT), and radiographic evidence of clinical improvement (CI). Results: HRDetect predicted BRCA1/2 status with an area under the curve (AUC) of 0.94 and optimal threshold of 0.7. Elevated HRDetect was also significantly associated with CI on platinum-based therapy (AUC = 0.89; P = 0.006) with the same optimal threshold, even after adjusting for BRCA1/2 mutation status and treatment timing. HRDetect scores over 0.7 were associated with a 3-month extended median TDT (P = 0.0003) and 1.3-year extended median OS (P = 0.04). Conclusions: Our findings not only independently validate HRDetect, but also provide the first evidence of its association with platinum response in advanced breast cancer. We demonstrate that HRD mutation signatures may offer clinically relevant information independently of BRCA1/2 mutation status and hope this work will guide the development of clinical trials. Clin Cancer Res; 23(24); 7521-30. ©2017 AACR.
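The AUC values quoted above summarize how well a continuous score separates two groups. As a rough illustration of the quantity itself, and not of the study's statistical pipeline, the sketch below computes the AUC as the fraction of positive/negative pairs in which the positive case has the higher score (ties count as half), and applies the 0.7 cut-off mentioned in the abstract. The scores and labels are invented for illustration and the function names are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def above_threshold(scores, threshold=0.7):
    """Flag scores strictly over the threshold (0.7 in the abstract)."""
    return [s > threshold for s in scores]

# Invented scores and outcome labels (1 = responder), for illustration only.
scores = [0.95, 0.80, 0.60, 0.75, 0.10, 0.30]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)      # 8 of 9 pairs ranked correctly
calls = above_threshold(scores)
```

The pairwise form is quadratic in the number of cases, which is fine for a cohort of this size; a rank-based formula would be the usual choice for large datasets.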
Subjects
BRCA1 Protein/genetics , BRCA2 Protein/genetics , Homologous Recombination/genetics , Triple Negative Breast Neoplasms/genetics , Disease-Free Survival , Female , Homologous Recombination/drug effects , Humans , Middle Aged , Mutation , Neoplasm Staging , Platinum/administration & dosage , Treatment Outcome , Triple Negative Breast Neoplasms/drug therapy , Triple Negative Breast Neoplasms/pathology , Whole Genome Sequencing
ABSTRACT
Spatial heterogeneity of transcriptional and genetic markers between physically isolated biopsies of a single tumor poses major barriers to the identification of biomarkers and the development of targeted therapies that will be effective against the entire tumor. We analyzed the spatial heterogeneity of multiregional biopsies from 35 patients, using a combination of transcriptomic and genomic profiles. Medulloblastomas (MBs), but not high-grade gliomas (HGGs), demonstrated spatially homogeneous transcriptomes, which allowed for accurate subgrouping of tumors from a single biopsy. Conversely, somatic mutations that affect genes suitable for targeted therapeutics demonstrated high levels of spatial heterogeneity in MB, malignant glioma, and renal cell carcinoma (RCC). Actionable targets found in a single MB biopsy were seldom clonal across the entire tumor, which brings the efficacy of monotherapies against a single target into question. Clinical trials of targeted therapies for MB should first ensure the spatially ubiquitous nature of the target mutation.
Subjects
Cerebellar Neoplasms/genetics , Neoplastic Gene Expression Regulation , Medulloblastoma/genetics , Transcriptome , Adult , Aged , Aged 80 and over , Cerebellar Neoplasms/pathology , Child , Preschool Child , Cluster Analysis , DNA Copy Number Variations , Female , Gene Expression Profiling/methods , Genetic Heterogeneity , Genome-Wide Association Study , Humans , INDEL Mutation , Male , Medulloblastoma/pathology , Middle Aged , Mutation , Single Nucleotide Polymorphism , Principal Component Analysis , Reverse Transcriptase Polymerase Chain Reaction
ABSTRACT
We describe an efficient high-throughput method for accurate DNA sequencing of entire cDNA clones. Developed as part of our involvement in the Mammalian Gene Collection full-length cDNA sequencing initiative, the method has been used and refined in our laboratory since September 2000. Amenable to large scale projects, we have used the method to generate >7 Mb of accurate sequence from 3695 candidate full-length cDNAs. Sequencing is accomplished through the insertion of Mu transposon into cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are not performed with individual cDNAs but rather on pools of up to 96 clones. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, reducing the costs and enhancing the efficiency of the transposon library construction procedure. Sequences generated using transposon-specific sequencing primers are assembled to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. Although analysis of the many thousands (22 785) of sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion, we observed insertion of the Mu transposon into 1015 of the possible 1024 5mer candidate insertion sites.
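The figure of 1024 candidate sites follows from the four bases taken five at a time (4^5 = 1024), so 1015 observed 5-mers correspond to roughly 99% of the possible sites. A minimal sketch of how that coverage could be tallied from a list of insertion-site 5-mers is given below. The helper names and the toy input are hypothetical, and how the study extracted the 5-mer at each insertion is not described here.

```python
from itertools import product

def all_kmers(k, alphabet="ACGT"):
    """Every possible k-mer over the alphabet (4**k of them for DNA)."""
    return {"".join(letters) for letters in product(alphabet, repeat=k)}

def kmer_coverage(observed_sites, k=5):
    """Count how many of the possible k-mers appear among observed sites.

    Returns (number seen, number possible, sorted list of unseen k-mers).
    Entries that are not valid k-mers are ignored.
    """
    possible = all_kmers(k)
    seen = {site.upper() for site in observed_sites} & possible
    return len(seen), len(possible), sorted(possible - seen)

# Toy input, for illustration only: three sites, one of them repeated.
seen_count, possible_count, unseen = kmer_coverage(
    ["ACGTA", "acgta", "TTTTT", "GGCCA"]
)

# The proportion reported in the abstract.
reported_fraction = 1015 / 1024
```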
Subjects
Bacteriophage mu/genetics , DNA Transposable Elements/genetics , Complementary DNA/genetics , Insertional Mutagenesis/genetics , Genetic Recombination/genetics , DNA Sequence Analysis/methods , Base Composition , Molecular Cloning , DNA Primers/genetics , Gene Library , Genetic Vectors/genetics , Monte Carlo Method , Physical Chromosome Mapping/methods , Sensitivity and Specificity , DNA Sequence Analysis/economics , Substrate Specificity , Time Factors
ABSTRACT
Medullary thyroid cancer (MTC) is a malignancy of the calcitonin-producing parafollicular cells of the thyroid gland. Surgery is the only curative treatment for this cancer. External beam radiation therapy is reserved for adjuvant treatment of MTC with aggressive features. The targeted therapeutics vandetanib and cabozantinib are approved for the treatment of aggressive and metastatic tumors that are not amenable to surgery. The use of these multikinase inhibitors is supported by the observed overactivation of the RET oncoprotein in a large subpopulation of MTCs. However, not all patients carry oncogenic alterations of this kinase. Hence, there is still a need for comprehensive molecular characterization of MTC utilizing whole-genome and transcriptome-sequencing methodologies with the aim of identifying targetable mutations. Here, we describe the genomic profiles of two medullary thyroid cancers and report the presence of a putative oncogenic BRAF fusion in one. Such alterations, previously observed in other malignancies and known targets of available drugs, can benefit patients who currently have no treatment options.
ABSTRACT
In an attempt to assess potential treatment options, whole-genome and transcriptome sequencing were performed on a patient with an unclassifiable small lymphoproliferative disorder. Variants from genome sequencing were prioritized using a combination of comparative variant distributions in a spectrum of lymphomas, and meta-analyses of gene expression profiling. In this patient, the molecular variants that we believe to be most relevant to the disease presentation most strongly resemble a diffuse large B-cell lymphoma (DLBCL), whereas the gene expression data are most consistent with a low-grade chronic lymphocytic leukemia (CLL). The variant of greatest interest was a predicted NOTCH2-truncating mutation, which has been recently reported in various lymphomas.
ABSTRACT
CONTEXT AND OBJECTIVE: Oncocytic thyroid carcinoma, also known as Hürthle cell thyroid carcinoma, accounts for only a small percentage of all thyroid cancers. However, this malignancy often presents at an advanced stage and poses unique challenges to patients and clinicians. Surgical resection of the tumor accompanied in some cases by radioactive iodine treatment, radiation, and chemotherapy are the established modes of therapy. Knowledge of the perturbed oncogenic pathways can provide better understanding of the mechanism of disease and thus opportunities for more effective clinical management. DESIGN AND PATIENTS: Initially, two oncocytic thyroid carcinomas and their matched normal tissues were profiled using whole genome sequencing. Subsequently, 72 oncocytic thyroid carcinomas, one cell line, and five Hürthle cell adenomas were examined by targeted sequencing for the presence of mutations in the multiple endocrine neoplasia I (MEN1) gene. RESULTS: Here we report the identification of MEN1 loss-of-function mutations in 4% of patients diagnosed with oncocytic thyroid carcinoma. Whole genome sequence data also revealed large regions of copy number variation encompassing nearly the entire genomes of these tumors. CONCLUSION: Menin, a ubiquitously expressed nuclear protein, is a well-characterized tumor suppressor whose loss is the cause of MEN1 syndrome. Menin is involved in several major cellular pathways such as regulation of transcription, control of cell cycle, apoptosis, and DNA damage repair pathways. Mutations of this gene in a subset of Hürthle cell tumors point to a potential role for this protein and its associated pathways in thyroid tumorigenesis.
Subjects
Mutation , Proto-Oncogene Proteins/genetics , Thyroid Neoplasms/genetics , Oxyphilic Adenoma , Tumor Cell Line , Neoplastic Cell Transformation/genetics , Cohort Studies , DNA Mutational Analysis , Gene Dosage , Humans , Lymphatic Metastasis , Matched-Pair Analysis , Thyroid Gland/pathology , Thyroid Neoplasms/pathology
ABSTRACT
Widespread adoption of massively parallel deoxyribonucleic acid (DNA) sequencing instruments has prompted the recent development of de novo short read assembly algorithms. A common shortcoming of the available tools is their inability to efficiently assemble vast amounts of data generated from large-scale sequencing projects, such as the sequencing of individual human genomes to catalog natural genetic variation. To address this limitation, we developed ABySS (Assembly By Short Sequences), a parallelized sequence assembler. As a demonstration of the capability of our software, we assembled 3.5 billion paired-end reads from the genome of an African male publicly released by Illumina, Inc. Approximately 2.76 million contigs ≥100 base pairs (bp) in length were created with an N50 size of 1499 bp, representing 68% of the reference human genome. Analysis of these contigs identified polymorphic and novel sequences not present in the human reference assembly, which were validated by alignment to alternate human assemblies and to other primate genomes.
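The N50 statistic quoted above is commonly defined as the contig length L such that contigs of length L or longer together contain at least half of the assembled bases. The sketch below follows that common definition and applies the 100 bp minimum length used in the abstract. It is an illustration only, not code from ABySS, and the contig lengths are invented.

```python
def n50(contig_lengths, min_length=100):
    """Return the N50 of the contigs that are at least min_length long."""
    kept = sorted((n for n in contig_lengths if n >= min_length), reverse=True)
    if not kept:
        raise ValueError("no contigs pass the length filter")
    half_total = sum(kept) / 2
    running = 0
    # Walk from the longest contig down until half of the bases are covered.
    for length in kept:
        running += length
        if running >= half_total:
            return length

# Invented contig lengths in bp, for illustration only.
toy_lengths = [80, 100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
```

In the toy data the 80 bp contig is filtered out, the remaining contigs total 1500 bp, and the two longest (500 + 400 = 900 bp) already exceed half of that, so the N50 is 400 bp.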
Subjects
Algorithms , Computational Biology/methods , Software , Animals , Contig Mapping , Escherichia coli K12/genetics , Genetic Variation , Human Genome , Humans , Genetic Polymorphism , Reproducibility of Results , DNA Sequence Analysis/methods
ABSTRACT
Restriction digest fingerprinting is a common method for characterizing large insert genomic clones, e.g., bacterial artificial chromosome (BAC), P1 artificial chromosome (PAC) and Fosmid clones. This clone fingerprinting method has been widely applied in the construction of clone-based physical maps, which have been used as positional cloning resources as well as to support directed and genome-wide sequencing efforts. This unit describes a robust, large-scale procedure for generation of agarose gel-based clone fingerprints from BAC clones.
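A restriction fingerprint is, in essence, the list of fragment sizes produced when a clone is cut at every occurrence of an enzyme's recognition site. The laboratory procedure measures those sizes on an agarose gel. The sketch below only imitates the idea in silico, under several assumptions that do not come from this abstract: the enzyme is HindIII (recognition site AAGCTT, cutting after the first A), the sequence is treated as linear, and the example sequence is invented.

```python
def digest(sequence, site="AAGCTT", cut_offset=1):
    """Cut a linear sequence at every occurrence of a recognition site.

    cut_offset is the position of the cut within the site; 1 corresponds
    to HindIII, which cuts A^AGCTT (assumed enzyme, see the text above).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]

def fingerprint(sequence, site="AAGCTT", cut_offset=1):
    """Sorted fragment sizes, the in silico analogue of a gel fingerprint."""
    return sorted(len(fragment) for fragment in digest(sequence, site, cut_offset))

# Invented 22-base sequence containing two HindIII sites.
toy_clone = "GGGAAGCTTCCCCCAAGCTTTT"
toy_fragments = digest(toy_clone)
toy_fingerprint = fingerprint(toy_clone)
```

Two clones that overlap would be expected to share a subset of fragment sizes, which is the property that map-building software exploits when ordering clones into contigs.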
Subjects
Bacterial Artificial Chromosomes , DNA Fingerprinting/methods , DNA Restriction Enzymes , Agar Gel Electrophoresis , Methods
ABSTRACT
As part of a larger project to sequence the Populus genome and generate genomic resources for this emerging model tree, we constructed a physical map of the Populus genome, representing one of the few such maps of an undomesticated, highly heterozygous plant species. The physical map, consisting of 2802 contigs, was constructed from fingerprinted bacterial artificial chromosome (BAC) clones. The map represents approximately 9.4-fold coverage of the Populus genome, which has been estimated from the genome sequence assembly to be 485 ± 10 Mb in size. BAC ends were sequenced to assist long-range assembly of whole-genome shotgun sequence scaffolds and to anchor the physical map to the genome sequence. Simple sequence repeat-based markers were derived from the end sequences and used to initiate integration of the BAC and genetic maps. A total of 2411 physical map contigs, representing 97% of all clones assigned to contigs, were aligned to the sequence assembly (JGI Populus trichocarpa, version 1.0). These alignments represent a total coverage of 384 Mb (79%) of the entire poplar sequence assembly and 295 Mb (96%) of linkage group sequence assemblies. A striking result of the physical map contig alignments to the sequence assembly was the co-localization of multiple contigs across numerous regions of the 19 linkage groups. Targeted sequencing of BAC clones and genetic analysis in a small number of representative regions showed that these co-aligning contigs represent distinct haplotypes in the heterozygous individual sequenced, and revealed the nature of these haplotype sequence differences.
Subjects
Plant Genome , Physical Chromosome Mapping , Populus/genetics , Bacterial Artificial Chromosomes , Haplotypes , Minisatellite Repeats , Genetic Polymorphism , Sequence Alignment , DNA Sequence Analysis
ABSTRACT
BACKGROUND: Cattle are important agriculturally and relevant as a model organism. Previously described genetic and radiation hybrid (RH) maps of the bovine genome have been used to identify genomic regions and genes affecting specific traits. Application of these maps to identify influential genetic polymorphisms will be enhanced by integration with each other and with bacterial artificial chromosome (BAC) libraries. The BAC libraries and clone maps are essential for the hybrid clone-by-clone/whole-genome shotgun sequencing approach taken by the bovine genome sequencing project. RESULTS: A bovine BAC map was constructed with HindIII restriction digest fragments of 290,797 BAC clones from animals of three different breeds. Comparative mapping of 422,522 BAC end sequences assisted with BAC map ordering and assembly. Genotypes and pedigree from two genetic maps and marker scores from three whole-genome RH panels were consolidated on a 17,254-marker composite map. Sequence similarity allowed integrating the BAC and composite maps with the bovine draft assembly (Btau3.1), establishing a comprehensive resource describing the bovine genome. Agreement between the marker and BAC maps and the draft assembly is high, although discrepancies exist. The composite and BAC maps are more similar than either is to the draft assembly. CONCLUSION: Further refinement of the maps and greater integration into the genome assembly process may contribute to a high quality assembly. The maps provide resources to associate phenotypic variation with underlying genomic variation, and are crucial resources for understanding the biology underpinning this important ruminant species so closely associated with humans.