ABSTRACT
The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations.
Subjects
Facial Neoplasms/veterinary , Genomic Instability , Marsupialia/genetics , Mutation , Animals , Clonal Evolution , Endangered Species , Facial Neoplasms/epidemiology , Facial Neoplasms/genetics , Facial Neoplasms/pathology , Female , Genome-Wide Association Study , Male , Molecular Sequence Data , Tasmania/epidemiology
ABSTRACT
Improvement of variant calling in next-generation sequence data requires a comprehensive, genome-wide catalog of high-confidence variants called in a set of genomes for use as a benchmark. We generated deep, whole-genome sequence data of 17 individuals in a three-generation pedigree and called variants in each genome using a range of currently available algorithms. We used haplotype transmission information to create a phased "Platinum" variant catalog of 4.7 million single-nucleotide variants (SNVs) plus 0.7 million small (1-50 bp) insertions and deletions (indels) that are consistent with the pattern of inheritance in the parents and 11 children of this pedigree. Platinum genotypes are highly concordant with the current catalog of the National Institute of Standards and Technology for both SNVs (>99.99%) and indels (99.92%) and add a validated truth catalog that has 26% more SNVs and 45% more indels. Analysis of 334,652 SNVs that were consistent between informatics pipelines yet inconsistent with haplotype transmission ("nonplatinum") revealed that the majority of these variants are de novo and cell-line mutations or reside within previously unidentified duplications and deletions. The reference materials from this study are a resource for objective assessment of the accuracy of variant calls throughout genomes.
Subjects
Genome, Human/genetics , Genomics , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Databases, Genetic , Exome/genetics , Genotype , Humans , INDEL Mutation/genetics , Pedigree , Polymorphism, Single Nucleotide , Software
ABSTRACT
Identifying large expansions of short tandem repeats (STRs), such as those that cause amyotrophic lateral sclerosis (ALS) and fragile X syndrome, is challenging for short-read whole-genome sequencing (WGS) data. A solution to this problem is an important step toward integrating WGS into precision medicine. We developed a software tool called ExpansionHunter that, using PCR-free WGS short-read data, can genotype repeats at the locus of interest, even if the expanded repeat is larger than the read length. We applied our algorithm to WGS data from 3001 ALS patients who have been tested for the presence of the C9orf72 repeat expansion with repeat-primed PCR (RP-PCR). Compared against this truth data, ExpansionHunter correctly classified all (212/212, 95% CI [0.98, 1.00]) of the expanded samples as either expansions (208) or potential expansions (4). Additionally, 99.9% (2786/2789, 95% CI [0.997, 1.00]) of the wild-type samples were correctly classified as wild type by this method with the remaining three samples identified as possible expansions. We further applied our algorithm to a set of 152 samples in which every sample had one of eight different pathogenic repeat expansions, including those associated with fragile X syndrome, Friedreich's ataxia, and Huntington's disease, and correctly flagged all but one of the known repeat expansions. Thus, ExpansionHunter can be used to accurately detect known pathogenic repeat expansions and provides researchers with a tool that can be used to identify new pathogenic repeat expansions.
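The classification rates and 95% confidence intervals quoted above can be approximated with a binomial interval. The abstract does not say which interval method was used; the sketch below assumes a Wilson score interval, and the function name `wilson_interval` is hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, total: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its interval method; Wilson is
    used here only as one common choice for proportions near 0 or 1.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / total
    denom = 1 + z * z / total
    centre = (p_hat + z * z / (2 * total)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Counts taken from the abstract: expanded and wild-type C9orf72 samples.
    for label, k, n in [("expanded", 212, 212), ("wild type", 2786, 2789)]:
        lo, hi = wilson_interval(k, n)
        print(f"{label}: {k}/{n} = {k / n:.4f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

Under this assumption the lower bounds should round to the values reported in the abstract (0.98 and 0.997). An exact (Clopper-Pearson) interval would be a reasonable alternative and gives a similar lower bound for the 212/212 case.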
Subjects
Amyotrophic Lateral Sclerosis/genetics , DNA Repeat Expansion , Whole Genome Sequencing/methods , Algorithms , C9orf72 Protein/genetics , Databases, Genetic , Humans , Precision Medicine , Sensitivity and Specificity , Software
ABSTRACT
Cancers acquire resistance to systemic treatment as a result of clonal evolution and selection. Repeat biopsies to study genomic evolution as a result of therapy are difficult, invasive and may be confounded by intra-tumour heterogeneity. Recent studies have shown that genomic alterations in solid cancers can be characterized by massively parallel sequencing of circulating cell-free tumour DNA released from cancer cells into plasma, representing a non-invasive liquid biopsy. Here we report sequencing of cancer exomes in serial plasma samples to track genomic evolution of metastatic cancers in response to therapy. Six patients with advanced breast, ovarian and lung cancers were followed over 1-2 years. For each case, exome sequencing was performed on 2-5 plasma samples (19 in total) spanning multiple courses of treatment, at selected time points when the allele fraction of tumour mutations in plasma was high, allowing improved sensitivity. For two cases, synchronous biopsies were also analysed, confirming genome-wide representation of the tumour genome in plasma. Quantification of allele fractions in plasma identified increased representation of mutant alleles in association with emergence of therapy resistance. These included an activating mutation in PIK3CA (phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha) following treatment with paclitaxel; a truncating mutation in RB1 (retinoblastoma 1) following treatment with cisplatin; a truncating mutation in MED1 (mediator complex subunit 1) following treatment with tamoxifen and trastuzumab, and following subsequent treatment with lapatinib, a splicing mutation in GAS6 (growth arrest-specific 6) in the same patient; and a resistance-conferring mutation in EGFR (epidermal growth factor receptor; T790M) following treatment with gefitinib. 
These results establish proof of principle that exome-wide analysis of circulating tumour DNA could complement current invasive biopsy approaches to identify mutations associated with acquired drug resistance in advanced cancers. Serial analysis of cancer genomes in plasma constitutes a new paradigm for the study of clonal evolution in human cancers.
Subjects
Antineoplastic Agents/therapeutic use , DNA, Neoplasm/analysis , DNA, Neoplasm/genetics , Drug Resistance, Neoplasm/genetics , Neoplasms/drug therapy , Neoplasms/genetics , Plasma/chemistry , Alleles , Antineoplastic Agents/pharmacology , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/pathology , Class I Phosphatidylinositol 3-Kinases , DNA Mutational Analysis , Drug Resistance, Neoplasm/drug effects , ErbB Receptors/genetics , Evolution, Molecular , Exome/genetics , Female , Genome, Human/genetics , Genomics , Humans , Intercellular Signaling Peptides and Proteins/genetics , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Mediator Complex Subunit 1/genetics , Neoplasms/pathology , Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , Phosphatidylinositol 3-Kinases/genetics , Retinoblastoma Protein/genetics
ABSTRACT
Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
Subjects
Conserved Sequence/genetics , Genome/genetics , Zebrafish/genetics , Animals , Chromosomes/genetics , Evolution, Molecular , Female , Genes/genetics , Genome, Human/genetics , Genomics , Humans , Male , Meiosis/genetics , Molecular Sequence Annotation , Pseudogenes/genetics , Reference Standards , Sex Determination Processes/genetics , Zebrafish Proteins/genetics
ABSTRACT
An X-linked condition characterized by the combination of hypomyelinating leukodystrophy and spondylometaphyseal dysplasia (H-SMD) has been observed in only four families, with linkage to Xq25-27, and recent genetic characterization in two families with a common AIFM1 mutation. In our study, 12 patients (6 families) with H-SMD were identified and underwent comprehensive assessment accompanied by whole-exome sequencing (WES). Pedigree analysis in all families was consistent with X-linked recessive inheritance. Presentation typically occurred between 12 and 36 months. In addition to the two disease-defining features of spondylometaphyseal dysplasia and hypomyelination on MRI, common clinical signs and symptoms included motor deterioration, spasticity, tremor, ataxia, dysarthria, cognitive defects, pulmonary hypertension, nystagmus, and vision loss due to retinopathy. The course of the disease was slowly progressive. All patients had maternally inherited or de novo mutations in or near exon 7 of AIFM1, within a region of 70 bp, including synonymous and intronic changes. AIFM1 mutations have previously been associated with neurologic presentations as varied as intellectual disability, hearing loss, neuropathy, and striatal necrosis, while AIFM1 mutations in this small region present with a distinct phenotype implicating bone. Analysis of cell lines derived from four patients identified significant reductions in AIFM1 mRNA and protein levels in osteoblasts. We hypothesize that AIFM1 functions in bone metabolism and myelination and is responsible for the unique phenotype in this condition.
Subjects
Apoptosis Inducing Factor/genetics , Genes, X-Linked/genetics , Genetic Predisposition to Disease , Mutation/genetics , Humans , Intellectual Disability/genetics , Male , Myelin Sheath/genetics , Myelin Sheath/metabolism , Osteochondrodysplasias/genetics , Pedigree , Phenotype , Sequence Analysis, DNA
ABSTRACT
To determine early somatic changes in high-grade serous ovarian cancer (HGSOC), we performed whole genome sequencing on a rare collection of 16 low stage HGSOCs. The majority showed extensive structural alterations (one had an ultramutated profile), exhibited high levels of p53 immunoreactivity, and harboured a TP53 mutation, deletion or inactivation. BRCA1 and BRCA2 mutations were observed in two tumors, with nine showing evidence of a homologous recombination (HR) defect. Combined analysis with The Cancer Genome Atlas (TCGA) indicated that low and late stage HGSOCs have similar mutation and copy number profiles. We also found evidence that deleterious TP53 mutations are the earliest events, followed by deletions or loss of heterozygosity (LOH) of chromosomes carrying TP53, BRCA1 or BRCA2. Inactivation of HR appears to be an early event, as 62.5% of tumours showed a LOH pattern suggestive of HR defects. Three tumours with the highest ploidy had little genome-wide LOH, yet one of these had a homozygous somatic frame-shift BRCA2 mutation, suggesting that some carcinomas begin as tetraploid then descend into diploidy accompanied by genome-wide LOH. Lastly, we found evidence that structural variants (SV) cluster in HGSOC, but are absent in one ultramutated tumor, providing insights into the pathogenesis of low stage HGSOC.
Subjects
Genes, p53 , Mutation , Ovarian Neoplasms/genetics , Recombinational DNA Repair , Tetraploidy , Carcinoma/genetics , DNA Primase/genetics , Female , Humans , Loss of Heterozygosity , Mutation Rate
ABSTRACT
BACKGROUND: The management of metastatic breast cancer requires monitoring of the tumor burden to determine the response to treatment, and improved biomarkers are needed. Biomarkers such as cancer antigen 15-3 (CA 15-3) and circulating tumor cells have been widely studied. However, circulating cell-free DNA carrying tumor-specific alterations (circulating tumor DNA) has not been extensively investigated or compared with other circulating biomarkers in breast cancer. METHODS: We compared the radiographic imaging of tumors with the assay of circulating tumor DNA, CA 15-3, and circulating tumor cells in 30 women with metastatic breast cancer who were receiving systemic therapy. We used targeted or whole-genome sequencing to identify somatic genomic alterations and designed personalized assays to quantify circulating tumor DNA in serially collected plasma specimens. CA 15-3 levels and numbers of circulating tumor cells were measured at identical time points. RESULTS: Circulating tumor DNA was successfully detected in 29 of the 30 women (97%) in whom somatic genomic alterations were identified; CA 15-3 and circulating tumor cells were detected in 21 of 27 women (78%) and 26 of 30 women (87%), respectively. Circulating tumor DNA levels showed a greater dynamic range, and greater correlation with changes in tumor burden, than did CA 15-3 or circulating tumor cells. Among the measures tested, circulating tumor DNA provided the earliest measure of treatment response in 10 of 19 women (53%). CONCLUSIONS: This proof-of-concept analysis showed that circulating tumor DNA is an informative, inherently specific, and highly sensitive biomarker of metastatic breast cancer. (Funded by Cancer Research UK and others.)
Subjects
Biomarkers, Tumor/blood , Breast Neoplasms/secondary , DNA, Neoplasm/blood , Mucin-1/blood , Neoplasm Metastasis/diagnosis , Breast Neoplasms/blood , Breast Neoplasms/genetics , Female , Genome-Wide Association Study , Humans , Mutation , Neoplasm Metastasis/diagnostic imaging , Neoplasm Metastasis/genetics , Prognosis , Radiography , Sensitivity and Specificity , Sequence Analysis, DNA/methods , Tumor Burden
ABSTRACT
All cancers carry somatic mutations. A subset of these somatic alterations, termed driver mutations, confer selective growth advantage and are implicated in cancer development, whereas the remainder are passengers. Here we have sequenced the genomes of a malignant melanoma and a lymphoblastoid cell line from the same person, providing the first comprehensive catalogue of somatic mutations from an individual cancer. The catalogue provides remarkable insights into the forces that have shaped this cancer genome. The dominant mutational signature reflects DNA damage due to ultraviolet light exposure, a known risk factor for malignant melanoma, whereas the uneven distribution of mutations across the genome, with a lower prevalence in gene footprints, indicates that DNA repair has been preferentially deployed towards transcribed regions. The results illustrate the power of a cancer genome sequence to reveal traces of the DNA damage, repair, mutation and selection processes that were operative years before the cancer became symptomatic.
Subjects
Genes, Neoplasm/genetics , Genome, Human/genetics , Mutation/genetics , Neoplasms/genetics , Adult , Cell Line, Tumor , DNA Damage/genetics , DNA Mutational Analysis , DNA Repair/genetics , Gene Dosage/genetics , Humans , Loss of Heterozygosity/genetics , Male , Melanoma/etiology , Melanoma/genetics , MicroRNAs/genetics , Mutagenesis, Insertional/genetics , Neoplasms/etiology , Polymorphism, Single Nucleotide/genetics , Precision Medicine , Sequence Deletion/genetics , Ultraviolet Rays
ABSTRACT
Chronic lymphocytic leukemia is characterized by relapse after treatment and chemotherapy resistance. As in other malignancies, leukemia cells accumulate mutations during growth, forming heterogeneous cell populations that are subject to Darwinian selection and may respond differentially to treatment. There is therefore a clinical need to monitor changes in the subclonal composition of cancers during disease progression. Here, we use whole-genome sequencing to track subclonal heterogeneity in 3 chronic lymphocytic leukemia patients subjected to repeated cycles of therapy. We reveal different somatic mutation profiles in each patient and use these to establish probable hierarchical patterns of subclonal evolution, to identify subclones that decline or expand over time, and to detect founder mutations. We show that clonal evolution patterns are heterogeneous in individual patients. We conclude that genome sequencing is a powerful and sensitive approach to monitor disease progression repeatedly at the molecular level. If applied to future clinical trials, this approach might eventually influence treatment strategies as a tool to individualize and direct cancer treatment.
Subjects
DNA, Neoplasm/genetics , Genome-Wide Association Study , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Mutation , Sequence Analysis, DNA , Alleles , Cell Transformation, Neoplastic/genetics , Clonal Deletion , Clone Cells , DNA Mutational Analysis , Disease Progression , Evolution, Molecular , Gene Frequency , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Leukemia, Lymphocytic, Chronic, B-Cell/physiopathology , Neoplasm Proteins/genetics , Selection, Genetic
ABSTRACT
DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.
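The ">30x average depth of paired 35-base reads" figure follows the usual relation between read count, read length and genome size (mean depth = reads × read length / genome size). A minimal sketch of that arithmetic is given below; the genome size constant and both function names are assumptions introduced for illustration and do not come from the abstract.

```python
import math

# Assumption: an approximate human genome size in bases, not stated in the abstract.
ASSUMED_GENOME_SIZE = 3_100_000_000


def mean_depth(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_for_depth(target_depth: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads whose total bases reach the target mean depth."""
    return math.ceil(target_depth * genome_size / read_length)


if __name__ == "__main__":
    # 35-base reads and 30x depth are the values quoted in the abstract.
    n = reads_for_depth(30, 35, ASSUMED_GENOME_SIZE)
    print(f"reads needed: {n:,} (about {n // 2:,} read pairs)")
    print(f"depth check: {mean_depth(n, 35, ASSUMED_GENOME_SIZE):.2f}x")
```

This ignores unmapped, duplicated and low-quality reads, so the number of reads actually generated in such an experiment would be higher than this idealized estimate.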
Subjects
Genome, Human/genetics , Genomics/methods , Sequence Analysis, DNA/methods , Chromosomes, Human, X/genetics , Consensus Sequence/genetics , Genomics/economics , Genotype , Humans , Male , Nigeria , Polymorphism, Single Nucleotide/genetics , Sensitivity and Specificity , Sequence Analysis, DNA/economics
ABSTRACT
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome, mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome. It is also enriched in segmental duplications, ranking third in density among the autosomes. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution, the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.
Subjects
Chromosomes, Human, Pair 17/genetics , Evolution, Molecular , Animals , Base Composition , Gene Duplication , Humans , Long Interspersed Nucleotide Elements/genetics , Mice , Sequence Analysis, DNA , Short Interspersed Nucleotide Elements/genetics , Synteny/genetics
ABSTRACT
Gibbon species have accumulated an unusually high number of chromosomal changes since diverging from the common hominoid ancestor 15-18 million years ago. The cause of this increased rate of chromosomal rearrangements is not known, nor is it known if genome architecture has a role. To address this question, we analyzed sequences spanning 57 breaks of synteny between northern white-cheeked gibbons (Nomascus l. leucogenys) and humans. We find that the breakpoint regions are enriched in segmental duplications and repeats, with Alu elements being the most abundant. Alus located near the gibbon breakpoints (<150 bp) have a higher CpG content than other Alus. Bisulphite allelic sequencing reveals that these gibbon Alus have a lower average density of methylated cytosine than their human orthologues. The finding of higher CpG content and lower average CpG methylation suggests that the gibbon Alu elements are epigenetically distinct from their human orthologues. The association between undermethylation and chromosomal rearrangement in gibbons suggests a correlation between epigenetic state and structural genome variation in evolution.
Subjects
Cytosine/metabolism , DNA Methylation , Evolution, Molecular , Hylobates/genetics , Alu Elements , Animals , Chromosome Mapping , DNA Breaks , Epigenesis, Genetic , Gene Rearrangement , Genome, Human , Humans , Hylobates/metabolism , Karyotyping , Models, Genetic , Species Specificity , Synteny
ABSTRACT
Modern genetic analysis requires the development of new resources to systematically explore gene function in vivo. Overexpression screens are a powerful method to investigate genetic pathways, but the goal of routine and comprehensive overexpression screens has been hampered by the lack of systematic libraries. Here we describe the construction of a systematic collection of the Saccharomyces cerevisiae genome in a high-copy vector and its validation in two overexpression screens.
Subjects
Gene Expression Profiling/methods , Gene Library , Genome, Fungal/genetics , Saccharomyces cerevisiae/genetics , Gene Expression Regulation, Fungal , Genes, Fungal/genetics
ABSTRACT
CpG islands (CGIs) are dense clusters of CpG sequences that punctuate the CpG-deficient human genome and associate with many gene promoters. As CGIs also differ from bulk chromosomal DNA by their frequent lack of cytosine methylation, we devised a CGI enrichment method based on nonmethylated CpG affinity chromatography. The resulting library was sequenced to define a novel human blood CGI set that includes many that are not detected by current algorithms. Approximately half of CGIs were associated with annotated gene transcription start sites, the remainder being intra- or intergenic. Using an array representing over 17,000 CGIs, we established that 6%-8% of CGIs are methylated in genomic DNA of human blood, brain, muscle, and spleen. Inter- and intragenic CGIs are preferentially susceptible to methylation. CGIs showing tissue-specific methylation were overrepresented at numerous genetic loci that are essential for development, including HOX and PAX family members. The findings enable a comprehensive analysis of the roles played by CGI methylation in normal and diseased human tissues.
Subjects
CpG Islands/physiology , DNA Methylation , Gene Expression Regulation, Developmental , Genes, Developmental , Chromosome Mapping , Female , Gene Library , Genome, Human , Humans , Male , Organ Specificity , Tissue Distribution
ABSTRACT
Aspergillus fumigatus is exceptional among microorganisms in being both a primary and opportunistic pathogen as well as a major allergen. Its conidia production is prolific, and so human respiratory tract exposure is almost constant. A. fumigatus is isolated from human habitats and vegetable compost heaps. In immunocompromised individuals, the incidence of invasive infection can be as high as 50% and the mortality rate is often about 50% (ref. 2). The interaction of A. fumigatus and other airborne fungi with the immune system is increasingly linked to severe asthma and sinusitis. Although the burden of invasive disease caused by A. fumigatus is substantial, the basic biology of the organism is mostly obscure. Here we show the complete 29.4-megabase genome sequence of the clinical isolate Af293, which consists of eight chromosomes containing 9,926 predicted genes. Microarray analysis revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype. The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus.
Subjects
Allergens/genetics , Aspergillus fumigatus/genetics , Aspergillus fumigatus/pathogenicity , Genome, Fungal , Genomics , Hypersensitivity/microbiology , Aspergillus fumigatus/immunology , Gene Expression Profiling , Gene Expression Regulation, Fungal , Genes, Fungal/genetics , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Sequence Analysis, DNA , Temperature , Virulence/genetics
ABSTRACT
Non-obese diabetic (NOD) mice spontaneously develop type 1 diabetes (T1D) due to the progressive loss of insulin-secreting beta-cells by an autoimmune driven process. NOD mice represent a valuable tool for studying the genetics of T1D and for evaluating therapeutic interventions. Here we describe the development and characterization by end-sequencing of bacterial artificial chromosome (BAC) libraries derived from NOD/MrkTac (DIL NOD) and NOD/ShiLtJ (CHORI-29), two commonly used NOD substrains. The DIL NOD library is composed of 196,032 BACs and the CHORI-29 library is composed of 110,976 BACs. The average depth of genome coverage of the DIL NOD library, estimated from mapping the BAC end-sequences to the reference mouse genome sequence, was 7.1-fold across the autosomes and 6.6-fold across the X chromosome. Clones from this library have an average insert size of 150 kb and map to over 95.6% of the reference mouse genome assembly (NCBIm37), covering 98.8% of Ensembl mouse genes. By the same metric, the CHORI-29 library has an average depth over the autosomes of 5.0-fold and 2.8-fold coverage of the X chromosome, the reduced X chromosome coverage being due to the use of a male donor for this library. Clones from this library have an average insert size of 205 kb and map to 93.9% of the reference mouse genome assembly, covering 95.7% of Ensembl genes. We have identified and validated 191,841 single nucleotide polymorphisms (SNPs) for DIL NOD and 114,380 SNPs for CHORI-29. In total we generated 229,736,133 bp of sequence for the DIL NOD and 121,963,211 bp for the CHORI-29. These BAC libraries represent a powerful resource for functional studies, such as gene targeting in NOD embryonic stem (ES) cell lines, and for sequencing and mapping experiments.
Subjects
Chromosomes, Artificial, Bacterial/genetics , Genome , Animals , Chromosomes, Artificial, Bacterial/metabolism , DNA, Complementary/metabolism , Male , Mice , Mice, Inbred NOD , Mice, Inbred Strains , Molecular Sequence Data , Sequence Analysis, DNA
ABSTRACT
To analyse the myogenic transcriptome and identify novel genes involved in muscle development in an in vivo context, we have constructed a muscle-specific cDNA library from GFP-expressing myoblasts purified by fluorescence-activated cell sorting of transgenic zebrafish embryos. We have generated 153,428 EST sequences from this library that have been clustered into consensus sequences, mapped to the genome assembly Zv6 and analysed for protein homology. Expression analysis of a randomly picked sample of clones using whole-mount in situ hybridisation identified 30 genes that are expressed specifically within the myotome, one third of which represent novel sequences. These genes have been assigned to syn-expression groups. The sequencing of the myoblast-enriched cDNA library has significantly increased the number of zebrafish ESTs, facilitating the prediction of new spliced transcripts in the genome assembly and providing a transcriptome of an in vivo myoblast cell.
Subjects
Gene Expression Profiling/methods , Gene Library , Myoblasts/metabolism , Sequence Analysis, DNA/methods , Zebrafish/genetics , Animals , Animals, Genetically Modified , Embryo, Nonmammalian , Expressed Sequence Tags , Genomics/methods , Green Fluorescent Proteins/genetics , Green Fluorescent Proteins/metabolism , Organ Specificity/genetics , Zebrafish/embryology , Zebrafish/metabolism
ABSTRACT
The neotropical butterflies Heliconius melpomene and H. erato are Müllerian mimics that display the same warningly colored wing patterns in local populations, yet show pattern diversity between geographic regions. Linkage mapping has previously shown that convergent red wing phenotypes in these species are controlled by loci on homologous chromosomes. Here, AFLP bulk segregant analysis using H. melpomene crosses identified genetic markers tightly linked to two red wing-patterning loci. These markers were used to screen a H. melpomene BAC library and a tile path was assembled spanning one locus completely and part of the second. Concurrently, a similar strategy was used to identify a BAC clone tightly linked to the locus controlling the mimetic red wing phenotypes in H. erato. A methionine-rich storage protein (MRSP) gene was identified within this BAC clone, and comparative genetic mapping shows that red wing color loci are in homologous regions of the genome of H. erato and H. melpomene. Subtle differences in these convergent phenotypes imply they evolved independently using somewhat different developmental routes, but are nonetheless regulated by the same switch locus. Genetic mapping of MRSP in a third related species, the "tiger" patterned H. numata, shows no association with wing patterning and no evidence for genomic translocation of wing-patterning loci.
Subjects
Adaptation, Biological/genetics , Butterflies/physiology , Evolution, Molecular , Genes, Insect , Genetic Variation , Wings, Animal/anatomy & histology , Amplified Fragment Length Polymorphism Analysis , Animals , Biomimetics , Body Patterning , Chromosome Mapping , Chromosomes, Artificial, Bacterial , Crosses, Genetic , Genetic Drift , Genetic Linkage , Genetic Markers , Phenotype , Predatory Behavior , Selection, Genetic , Wings, Animal/physiology
ABSTRACT
We studied whether similar developmental genetic mechanisms are involved in both convergent and divergent evolution. Mimetic insects are known for their diversity of patterns as well as their remarkable evolutionary convergence, and they have played an important role in controversies over the respective roles of selection and constraints in adaptive evolution. Here we contrast three butterfly species, all classic examples of Müllerian mimicry. We used a genetic linkage map to show that a locus, Yb, which controls the presence of a yellow band in geographic races of Heliconius melpomene, maps precisely to the same location as the locus Cr, which has very similar phenotypic effects in its co-mimic H. erato. Furthermore, the same genomic location acts as a "supergene", determining multiple sympatric morphs in a third species, H. numata. H. numata is a species with a very different phenotypic appearance, whose many forms mimic different unrelated ithomiine butterflies in the genus Melinaea. Other unlinked colour pattern loci map to a homologous linkage group in the co-mimics H. melpomene and H. erato, but they are not involved in mimetic polymorphism in H. numata. Hence, a single region from the multilocus colour pattern architecture of H. melpomene and H. erato appears to have gained control of the entire wing-pattern variability in H. numata, presumably as a result of selection for mimetic "supergene" polymorphism without intermediates. Although we cannot at this stage confirm the homology of the loci segregating in the three species, our results imply that a conserved yet relatively unconstrained mechanism underlying pattern switching can affect mimicry in radically different ways. We also show that adaptive evolution, both convergent and diversifying, can occur by the repeated involvement of the same genomic regions.