ABSTRACT
The genetic structure of the indigenous hunter-gatherer peoples of southern Africa, the oldest known lineage of modern humans, is important for understanding human diversity. Studies based on mitochondrial and small sets of nuclear markers have shown that these hunter-gatherers, known as Khoisan, San, or Bushmen, are genetically divergent from other humans. However, until now, fully sequenced human genomes have been limited to recently diverged populations. Here we present the complete genome sequences of an indigenous hunter-gatherer from the Kalahari Desert and a Bantu from southern Africa, as well as protein-coding regions from an additional three hunter-gatherers from disparate regions of the Kalahari. We characterize the extent of whole-genome and exome diversity among the five men, reporting 1.3 million novel DNA differences genome-wide, including 13,146 novel amino acid variants. In terms of nucleotide substitutions, the Bushmen seem to be, on average, more different from each other than, for example, a European and an Asian. Observed genomic differences between the hunter-gatherers and others may help to pinpoint genetic adaptations to an agricultural lifestyle. Adding the described variants to current databases will facilitate inclusion of southern Africans in medical research efforts, particularly when family and medical histories can be correlated with genome-wide data.
Subjects
Black People/genetics , Ethnicity/genetics , Human Genome/genetics , Asian People/genetics , Exons/genetics , Medical Genetics , Humans , Phylogeny , Single Nucleotide Polymorphism/genetics , South Africa/ethnology , White People/genetics
ABSTRACT
Leaf-cutter ants are one of the most important herbivorous insects in the Neotropics, harvesting vast quantities of fresh leaf material. The ants use leaves to cultivate a fungus that serves as the colony's primary food source. This obligate ant-fungus mutualism is one of the few occurrences of farming by non-humans and likely facilitated the formation of their massive colonies. Mature leaf-cutter ant colonies contain millions of workers ranging in size from small garden tenders to large soldiers, resulting in one of the most complex polymorphic caste systems within ants. To begin uncovering the genomic underpinnings of this system, we sequenced the genome of Atta cephalotes using 454 pyrosequencing. One prediction from this ant's lifestyle is that it has undergone genetic modifications that reflect its obligate dependence on the fungus for nutrients. Analysis of this genome sequence is consistent with this hypothesis, as we find evidence for reductions in genes related to nutrient acquisition. These include extensive reductions in serine proteases (which are likely unnecessary because proteolysis is not a primary mechanism used to process nutrients obtained from the fungus), a loss of genes involved in arginine biosynthesis (suggesting that this amino acid is obtained from the fungus), and the absence of a hexamerin (which sequesters amino acids during larval development in other insects). Following recent reports of genome sequences from other insects that engage in symbioses with beneficial microbes, the A. cephalotes genome provides new insights into the symbiotic lifestyle of this ant and advances our understanding of host-microbe symbioses.
Subjects
Ants/physiology , Insect Genome/genetics , Plant Leaves/physiology , Symbiosis , Animals , Ants/genetics , Arginine/genetics , Arginine/metabolism , Base Sequence , Fungi/genetics , Insect Proteins/genetics , Insect Proteins/metabolism , DNA Sequence Analysis , Serine Proteases/genetics , Serine Proteases/metabolism
ABSTRACT
Killer whales (Orcinus orca) currently comprise a single, cosmopolitan species with a diverse diet. However, studies over the last 30 yr have revealed populations of sympatric "ecotypes" with discrete prey preferences, morphology, and behaviors. Although these ecotypes avoid social interactions and are not known to interbreed, genetic studies to date have found extremely low levels of diversity in the mitochondrial control region, and few clear phylogeographic patterns worldwide. This low level of diversity is likely due to low mitochondrial mutation rates that are common to cetaceans. Using killer whales as a case study, we have developed a method to readily sequence, assemble, and analyze complete mitochondrial genomes from large numbers of samples to more accurately assess phylogeography and estimate divergence times. This represents an important tool for wildlife management, not only for killer whales but for many marine taxa. We used high-throughput sequencing to survey whole mitochondrial genome variation of 139 samples from the North Pacific, North Atlantic, and southern oceans. Phylogenetic analysis indicated that each of the known ecotypes represents a strongly supported clade with divergence times ranging from approximately 150,000 to 700,000 yr ago. We recommend that three named ecotypes be elevated to full species, and that the remaining types be recognized as subspecies pending additional data. Establishing appropriate taxonomic designations will greatly aid in understanding the ecological impacts and conservation needs of these important marine predators. We predict that phylogeographic mitogenomics will become an important tool for improved statistical phylogeography and more precise estimates of divergence times.
Subjects
Mitochondrial Genome/genetics , Killer Whale/classification , Killer Whale/genetics , Animals , Base Sequence , Genetic Speciation , Genetic Variation/physiology , Geography , Molecular Sequence Data , Oceans and Seas , Phylogeny , DNA Sequence Analysis , Nucleic Acid Sequence Homology , Species Specificity
ABSTRACT
A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (~1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest.
Subjects
Genome , Turkeys/genetics , Animals , Base Sequence , Chromosome Mapping , DNA/genetics , Single Nucleotide Polymorphism , DNA Sequence Analysis , Nucleic Acid Sequence Homology , Species Specificity
ABSTRACT
Herbivores can gain indirect access to recalcitrant carbon present in plant cell walls through symbiotic associations with lignocellulolytic microbes. A paradigmatic example is the leaf-cutter ant (Tribe: Attini), which uses fresh leaves to cultivate a fungus for food in specialized gardens. Using a combination of sugar composition analyses, metagenomics, and whole-genome sequencing, we reveal that the fungus garden microbiome of leaf-cutter ants is composed of a diverse community of bacteria with high plant biomass-degrading capacity. Comparison of this microbiome's predicted carbohydrate-degrading enzyme profile with other metagenomes shows closest similarity to the bovine rumen, indicating evolutionary convergence of plant biomass degrading potential between two important herbivorous animals. Genomic and physiological characterization of two dominant bacteria in the fungus garden microbiome provides evidence of their capacity to degrade cellulose. Given the recent interest in cellulosic biofuels, understanding how large-scale and rapid plant biomass degradation occurs in a highly evolved insect herbivore is of particular relevance for bioenergy.
Subjects
Ants/microbiology , Biomass , Feeding Behavior/physiology , Fungi/genetics , Metagenome/genetics , Plant Leaves/metabolism , Animals , Biopolymers/metabolism , Carbohydrate Metabolism/genetics , Cattle , Cluster Analysis , Molecular Sequence Data , Phylogeny
ABSTRACT
Endogenous small interfering RNAs (endo-siRNAs) regulate diverse gene expression programs in eukaryotes by either binding and cleaving mRNA targets or mediating heterochromatin formation; however, the mechanisms of endo-siRNA biogenesis, sorting, and target regulation remain poorly understood. Here we report the identification and function of a specific class of germline-generated endo-siRNAs in Caenorhabditis elegans that are 26 nt in length and contain a guanine at the first nucleotide position (i.e., 26G RNAs). 26G RNAs regulate gene expression during spermatogenesis and zygotic development, and their biogenesis requires the ERI-1 exonuclease and the RRF-3 RNA-dependent RNA polymerase (RdRP). Remarkably, we identified two nonoverlapping subclasses of 26G RNAs that sort into specific RNA-induced silencing complexes (RISCs) and differentially regulate distinct mRNA targets. Class I 26G RNAs target genes that are expressed during spermatogenesis, whereas class II 26G RNAs are maternally inherited and silence gene expression during zygotic development. These findings implicate a class of endo-siRNAs in the global regulation of transcriptional programs required for fertility and development.
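The sequence-level definition of a 26G RNA given in this abstract (26 nt long, guanine at the first position) can be expressed as a simple read filter. The sketch below is illustrative only: the function name `is_26g_rna` and the choice to accept DNA-alphabet reads (T in place of U) are assumptions, and the test captures only the length and 5' nucleotide criteria, not the biogenesis or RISC-sorting properties the study describes.

```python
def is_26g_rna(read: str) -> bool:
    """Return True if a read matches the 26G definition: 26 nt, 5' guanine.

    Hypothetical helper: sequencing reads are often reported in the DNA
    alphabet, so T is converted to U before checking.
    """
    seq = read.strip().upper().replace("T", "U")
    # Reject anything containing characters outside the RNA alphabet.
    if not seq or any(base not in "ACGU" for base in seq):
        return False
    return len(seq) == 26 and seq[0] == "G"


if __name__ == "__main__":
    example_reads = [
        "G" + "A" * 25,  # 26 nt, starts with G
        "A" + "G" * 25,  # 26 nt, wrong first nucleotide
        "G" + "A" * 21,  # starts with G, but only 22 nt
    ]
    for read in example_reads:
        print(len(read), read[0], is_26g_rna(read))
```

A filter of this kind would only be a first pass; reads that pass it are candidates, not confirmed 26G RNAs.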
Subjects
Caenorhabditis elegans/embryology , Caenorhabditis elegans/genetics , Developmental Gene Expression Regulation , Guanine/metabolism , Small Interfering RNA/metabolism , Spermatogenesis/genetics , Zygote/metabolism , Animals , Caenorhabditis elegans Proteins/metabolism , Exoribonucleases/metabolism , Gene Silencing , Germ Cells/metabolism , Male , Helminth RNA/classification , Helminth RNA/metabolism , Small Interfering RNA/biosynthesis , Small Interfering RNA/classification , DNA Sequence Analysis
ABSTRACT
Cancers arise by the gradual accumulation of mutations in multiple genes. We now use shotgun pyrosequencing to characterize RNA mutations and expression levels unique to malignant pleural mesotheliomas (MPMs) and not present in control tissues. On average, 266 Mb of cDNA were sequenced from each of four MPMs, from a control pulmonary adenocarcinoma (ADCA), and from normal lung tissue. Previously observed differences in MPM RNA expression levels were confirmed. Point mutations were identified by using criteria that require the presence of the mutation in at least four reads and in both cDNA strands and the absence of the mutation from sequence databases, normal adjacent tissues, and other controls. In the four MPMs, 15 nonsynonymous mutations were discovered: 7 were point mutations, 3 were deletions, 4 were exclusively expressed as a consequence of imputed epigenetic silencing, and 1 was putatively expressed as a consequence of RNA editing. Notably, each MPM had a different mutation profile, and no mutated gene was previously implicated in MPM. Of the seven point mutations, three were observed in at least one tumor from 49 other MPM patients. The mutations were in genes that could be causally related to cancer and included XRCC6, PDZK1IP1, ACTR1A, and AVEN.
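The point-mutation criteria stated in this abstract (at least four supporting reads, support on both cDNA strands, and absence from sequence databases and controls) amount to a conjunction of filters. The following is a minimal sketch under stated assumptions: the `CandidateVariant` record and its field names are hypothetical, and the boolean flags stand in for database and control lookups that the abstract does not describe in detail.

```python
from dataclasses import dataclass

MIN_SUPPORTING_READS = 4  # threshold taken from the abstract


@dataclass(frozen=True)
class CandidateVariant:
    """Hypothetical record for one candidate point mutation."""
    forward_reads: int           # supporting reads on one cDNA strand
    reverse_reads: int           # supporting reads on the other strand
    in_sequence_databases: bool  # already catalogued as a known variant
    in_controls: bool            # seen in normal tissue or other controls


def passes_point_mutation_criteria(v: CandidateVariant) -> bool:
    """Apply the three criteria from the abstract as one boolean test."""
    enough_reads = v.forward_reads + v.reverse_reads >= MIN_SUPPORTING_READS
    both_strands = v.forward_reads > 0 and v.reverse_reads > 0
    novel = not v.in_sequence_databases and not v.in_controls
    return enough_reads and both_strands and novel


if __name__ == "__main__":
    candidates = [
        CandidateVariant(3, 2, False, False),  # meets all criteria
        CandidateVariant(5, 0, False, False),  # one strand only
        CandidateVariant(2, 1, False, False),  # too few reads
        CandidateVariant(4, 4, True, False),   # already in databases
    ]
    for c in candidates:
        print(c, passes_point_mutation_criteria(c))
```

Requiring both strands is a common guard against strand-specific sequencing artifacts, which is presumably why the abstract lists it separately from the read-count threshold.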
Subjects
Neoplastic Gene Expression Regulation , Mesothelioma/genetics , Mutation , Neoplasm Proteins/genetics , Pleural Neoplasms/genetics , Type I Activin Receptors/genetics , Signal Transducing Adaptor Proteins/genetics , Nuclear Antigens/genetics , Apoptosis Regulatory Proteins/genetics , DNA-Binding Proteins/genetics , Gene Expression Profiling , Gene Silencing , Humans , Ku Autoantigen , Membrane Proteins/genetics , Point Mutation , RNA Editing , Neoplasm RNA , Sequence Deletion
ABSTRACT
Tuberous sclerosis complex (TSC) is an autosomal dominant neurocutaneous syndrome caused by mutations in TSC1 and TSC2. However, 10-15% of TSC patients have no mutation identified with conventional molecular diagnostic studies. We used the ultra-deep pyrosequencing technique of 454 Sequencing to search for mosaicism in 38 TSC patients who had no TSC1 or TSC2 mutation identified by conventional methods. Two TSC2 mutations were identified, each at 5.3% read frequency in different patients, consistent with mosaicism. Both mosaic mutations were confirmed by several methods. Five of 38 samples were found to have heterozygous non-mosaic mutations, which had been missed in earlier analyses. Several other possible low-frequency mosaic mutations were identified by deep sequencing, but were discarded as artifacts by secondary studies. The low frequency of detection of mosaic mutations, two (6%) of 33, suggests that in most TSC patients with no identified mutation the cause is not mosaicism but other factors, which remain to be determined. These findings indicate the ability of deep sequencing, coupled with secondary confirmatory analyses, to detect low-frequency mosaic mutations.
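The denominator of 33 in this abstract is not stated explicitly; it appears to follow from removing the five patients with missed heterozygous mutations from the 38 screened. The short sketch below reproduces that arithmetic. The function name is hypothetical, and the interpretation of the denominator is an inference from the reported numbers rather than something the abstract spells out.

```python
def mosaic_detection_rate(total_patients: int,
                          non_mosaic_found: int,
                          mosaic_found: int) -> tuple[int, float]:
    """Return (patients still unexplained before mosaic calls, mosaic rate).

    Assumes the mosaic rate is computed over patients who remained
    mutation-negative after the missed heterozygous mutations were found.
    """
    remaining = total_patients - non_mosaic_found
    return remaining, mosaic_found / remaining


if __name__ == "__main__":
    remaining, rate = mosaic_detection_rate(38, 5, 2)
    # 38 - 5 = 33 patients; 2 / 33 is about 0.061, reported as 6%.
    print(f"{mosaic_found_label(remaining, rate)}" if False else
          f"{remaining} patients, mosaic rate {rate:.1%}")
```

The two mosaic calls were at 5.3% read frequency, far below the roughly 50% expected for a germline heterozygous variant, which is what makes deep coverage necessary to see them at all.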
Subjects
Mosaicism , Mutation , Sequence Analysis/methods , Tuberous Sclerosis/genetics , Tumor Suppressor Proteins/genetics , High Pressure Liquid Chromatography , Genotype , Humans , Mass Spectrometry , Phenotype , Polymerase Chain Reaction , Tuberous Sclerosis Complex 1 Protein , Tuberous Sclerosis Complex 2 Protein
ABSTRACT
Neutralizing antibodies have become an important tool in treating infectious diseases. Recently, two separate approaches yielded successful antibody treatments for Ebola: one from genetically humanized mice and the other from a human survivor. Here, we describe parallel efforts using both humanized mice and convalescent patients to generate antibodies against the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein, which yielded a large collection of fully human antibodies that were characterized for binding, neutralization, and three-dimensional structure. On the basis of these criteria, we selected pairs of highly potent individual antibodies that simultaneously bind the receptor binding domain of the spike protein, thereby providing ideal partners for a therapeutic antibody cocktail that aims to decrease the potential for virus escape mutants that might arise in response to selective pressure from a single-antibody treatment.
Subjects
Neutralizing Antibodies/immunology , Antiviral Antibodies/immunology , Betacoronavirus/immunology , Coronavirus Infections/immunology , Viral Pneumonia/immunology , Coronavirus Spike Glycoprotein/immunology , Adolescent , Adult , Angiotensin-Converting Enzyme 2 , Animals , Neutralizing Antibodies/chemistry , Antiviral Antibodies/chemistry , Antibody Affinity , Antibody-Dependent Cell Cytotoxicity , Betacoronavirus/chemistry , Antibody Binding Sites , Broadly Neutralizing Antibodies/chemistry , Broadly Neutralizing Antibodies/immunology , COVID-19 , Cell Line , Coronavirus Infections/therapy , Cytophagocytosis , Epitopes , Humans , Passive Immunization , Mice , Middle Aged , Molecular Models , Neutralization Tests , Pandemics , Peptidyl-Dipeptidase A/metabolism , Protein Interaction Domains and Motifs , Coronavirus Receptors , Virus Receptors/metabolism , Severe Acute Respiratory Syndrome-Related Coronavirus/immunology , SARS-CoV-2 , Coronavirus Spike Glycoprotein/chemistry , Coronavirus Spike Glycoprotein/metabolism , Young Adult , COVID-19 Serotherapy
ABSTRACT
BACKGROUND: Next-generation sequencing technologies provide new options to characterize the transcriptome and to develop affordable tools for functional genomics. We describe here an innovative approach for this purpose and demonstrate its potential for non-model species as well. RESULTS: The method we developed is based on 454 sequencing of 3' cDNA fragments from a normalized library constructed from pooled RNAs to generate, through de novo read assembly, a large catalog of unique transcripts in organisms for which a comprehensive collection of transcripts or the complete genome sequence is not available. This "virtual transcriptome" provides extensive coverage depth, and can be used to set up a comprehensive microarray-based expression analysis. We evaluated the potential of this approach by monitoring gene expression during berry maturation in Vitis vinifera as if no other sequence information was available for this species. The microarray designed on the berries' transcriptome derived from half of a 454 run detected the expression of 19,609 genes, and proved to be more informative than one of the most comprehensive grape microarrays available to date, the GrapeArray 1.2 developed by the Italian-French Public Consortium for Grapevine Genome Characterization, which could detect the expression of 15,556 genes in the same samples. CONCLUSION: This approach provides a powerful method to rapidly build up an extensive catalog of unique transcripts that can be successfully used to develop a microarray for large-scale analysis of gene expression in any species, without the need for prior sequence knowledge.
Subjects
Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , DNA Sequence Analysis/methods , Base Sequence , Expressed Sequence Tags/metabolism , Fruit/genetics , Fruit/growth & development , Gene Library , Plant Genes/genetics , Molecular Sequence Data , Oligonucleotide Probes/genetics , Messenger RNA/genetics , Plant RNA/genetics , Vitis/genetics , Vitis/growth & development
ABSTRACT
BACKGROUND: With a whole genome duplication event and wealth of biological data, salmonids are excellent model organisms for studying evolutionary processes, fates of duplicated genes and genetic and physiological processes associated with complex behavioral phenotypes. It is surprising therefore, that no salmonid genome has been sequenced. Atlantic salmon (Salmo salar) is a good representative salmonid for sequencing given its importance in aquaculture and the genomic resources available. However, the size and complexity of the genome combined with the lack of a sequenced reference genome from a closely related fish makes assembly challenging. Given the cost and time limitations of Sanger sequencing as well as recent improvements to next generation sequencing technologies, we examined the feasibility of using the Genome Sequencer (GS) FLX pyrosequencing system to obtain the sequence of a salmonid genome. Eight pooled BACs belonging to a minimum tiling path covering approximately 1 Mb of the Atlantic salmon genome were sequenced by GS FLX shotgun and Long Paired End sequencing and compared with a ninth BAC sequenced by Sanger sequencing of a shotgun library. RESULTS: An initial assembly using only GS FLX shotgun sequences (average read length 248.5 bp) with approximately 30x coverage allowed gene identification, but was incomplete even when 126 Sanger-generated BAC-end sequences (approximately 0.09x coverage) were incorporated. The addition of paired end sequencing reads (additional approximately 26x coverage) produced a final assembly comprising 175 contigs assembled into four scaffolds with 171 gaps. Sanger sequencing of the ninth BAC (approximately 10.5x coverage) produced nine contigs and two scaffolds. The number of scaffolds produced by the GS FLX assembly was comparable to Sanger-generated sequencing; however, the number of gaps was much higher in the GS FLX assembly. 
CONCLUSION: These results represent the first use of GS FLX paired end reads for de novo sequence assembly. Our data demonstrated that this improved the GS FLX assemblies; however, with respect to de novo sequencing of complex genomes, the GS FLX technology is limited to gene mining and establishing a set of ordered sequence contigs. Currently, for a salmonid reference sequence, it appears that a substantial portion of sequencing should be done using Sanger technology.
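The coverage figures quoted in this abstract follow the usual relation coverage = (number of reads × mean read length) / target length. The sketch below applies that relation to the reported numbers; it assumes a target of exactly 1 Mb (the abstract says "approximately 1 Mb"), and the function names are hypothetical. The read counts and the implied Sanger read length it derives are back-of-envelope estimates, not values reported by the authors.

```python
import math

TARGET_BP = 1_000_000  # assumed: abstract gives "approximately 1 Mb"


def fold_coverage(n_reads: int, mean_read_bp: float, target_bp: int) -> float:
    """Mean fold coverage for n_reads of a given mean length."""
    return n_reads * mean_read_bp / target_bp


def reads_for_coverage(coverage: float, mean_read_bp: float, target_bp: int) -> int:
    """Smallest whole number of reads giving at least the requested coverage."""
    return math.ceil(coverage * target_bp / mean_read_bp)


def implied_read_length(coverage: float, n_reads: int, target_bp: int) -> float:
    """Mean read length implied by a reported coverage and read count."""
    return coverage * target_bp / n_reads


if __name__ == "__main__":
    # Roughly how many 248.5 bp shotgun reads give about 30x over 1 Mb?
    print(reads_for_coverage(30, 248.5, TARGET_BP))
    # What mean length do 126 BAC-end reads at about 0.09x imply?
    print(round(implied_read_length(0.09, 126, TARGET_BP), 1))
```

Under these assumptions, about 30x of shotgun data corresponds to on the order of 120,000 reads, while the 126 BAC-end sequences imply a mean length of roughly 700 bp, which is consistent with typical Sanger reads.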
Subjects
Genomics/methods , Salmo salar/genetics , DNA Sequence Analysis/methods , Animals , Bacterial Artificial Chromosomes/genetics , Molecular Evolution , Gene Duplication , Gene Library , Genome , Genomics/instrumentation , Genomics/statistics & numerical data , Salmo salar/classification , Salmonidae/classification , Salmonidae/genetics , DNA Sequence Analysis/instrumentation , DNA Sequence Analysis/statistics & numerical data
ABSTRACT
Recently, genome-wide association studies have identified loci across a segment of chromosome 8q24 (128,100,000-128,700,000) associated with the risk of breast, colon and prostate cancers. At least three regions of 8q24 have been independently associated with prostate cancer risk; the most centromeric of which appears to be population specific. Haplotypes in two contiguous but independent loci, marked by rs6983267 and rs1447295, have been identified in the Cancer Genetic Markers of Susceptibility project ( http://cgems.cancer.gov ), which genotyped more than 5,000 prostate cancer cases and 5,000 controls of European origin. The rs6983267 locus is also strongly associated with colorectal cancer. To ascertain a comprehensive catalog of common single-nucleotide polymorphisms (SNPs) across the two regions, we conducted a resequence analysis of 136 kb (chr8: 128,473,000-128,609,802) using the Roche/454 next-generation sequencing technology in 39 prostate cancer cases and 40 controls of European origin. We have characterized a comprehensive catalog of common (MAF > 1%) SNPs within this region, including 442 novel SNPs and have determined the pattern of linkage disequilibrium across the region. Our study has generated a detailed map of genetic variation across the region, which should be useful for choosing SNPs for fine mapping of association signals in 8q24 and investigations of the functional consequences of select common variants.
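Two numbers in this abstract can be checked with simple arithmetic: the size of the resequenced interval and what a minor allele frequency (MAF) above 1% means in a sample of 39 cases and 40 controls. The sketch below assumes diploid autosomal genotypes with no missing calls, so that 79 individuals contribute 158 chromosomes; the function names are hypothetical.

```python
def interval_length(start: int, end: int) -> int:
    """Length in bp of a coordinate interval, counting both endpoints."""
    return end - start + 1


def minor_allele_frequency(alt_allele_count: int, n_chromosomes: int) -> float:
    """MAF is the frequency of whichever allele is rarer."""
    p = alt_allele_count / n_chromosomes
    return min(p, 1.0 - p)


if __name__ == "__main__":
    # chr8:128,473,000-128,609,802 as given in the abstract
    print(interval_length(128_473_000, 128_609_802))

    n_chromosomes = 2 * (39 + 40)  # assumed diploid, no missing genotypes
    for copies in (1, 2, 3):
        maf = minor_allele_frequency(copies, n_chromosomes)
        print(copies, round(maf, 4), maf > 0.01)
```

With 158 chromosomes, a variant seen once has a sample MAF of about 0.6% and falls below the 1% threshold, while a variant seen twice is at about 1.3% and passes it. In a sample of this size, "common" therefore effectively means observed on at least two chromosomes.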
Subjects
Human Chromosome Pair 8 , Colonic Neoplasms/genetics , Prostatic Neoplasms/genetics , Case-Control Studies , Female , Gene Frequency , Humans , Linkage Disequilibrium , Male , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods
ABSTRACT
Bacillus thuringiensis (Bt) crystal (Cry) proteins are effective against a select number of insect pests, but improvements are needed to increase efficacy and decrease time to mortality for coleopteran pests. To gain insight into the Bt intoxication process in Coleoptera, we performed RNA-Seq on cDNA generated from the guts of Tenebrio molitor larvae that consumed either a control diet or a diet containing Cry3Aa protoxin. Approximately 134,090 and 124,287 sequence reads from the control and Cry3Aa-treated groups were assembled into 1,318 and 1,140 contigs, respectively. Enrichment analyses indicated that functions associated with mitochondrial respiration, signalling, maintenance of cell structure, membrane integrity, protein recycling/synthesis, and glycosyl hydrolases were significantly increased in Cry3Aa-treated larvae, whereas functions associated with many metabolic processes were reduced, especially glycolysis, tricarboxylic acid cycle, and fatty acid synthesis. Microarray analysis was used to evaluate temporal changes in gene expression after 6, 12 or 24 h of Cry3Aa exposure. Overall, microarray analysis indicated that transcripts related to allergens, chitin-binding proteins, glycosyl hydrolases, and tubulins were induced, and those related to immunity and metabolism were repressed in Cry3Aa-intoxicated larvae. The 24 h microarray data validated most of the RNA-Seq data. Of the three intoxication intervals, larvae demonstrated more differential expression of transcripts after 12 h exposure to Cry3Aa. Gene expression examined by three different methods in control vs. Cry3Aa-treated larvae at the 24 h time point indicated that transcripts encoding proteins with chitin-binding domain 3 were the most differentially expressed in Cry3Aa-intoxicated larvae. Overall, the data suggest that T. molitor larvae mount a complex response to Cry3Aa during the initial 24 h of intoxication. Data from this study represent the largest genetic sequence dataset for T. molitor to date. Furthermore, the methods in this study are useful for comparative analyses in organisms lacking a sequenced genome.
Subjects
Bacterial Proteins/toxicity , Biosynthetic Pathways/drug effects , Endotoxins/toxicity , Energy Metabolism/drug effects , Hemolysin Proteins/toxicity , Tenebrio/drug effects , Tenebrio/metabolism , Transcriptome/drug effects , Oral Administration , Animals , Bacillus thuringiensis Toxins , Bacterial Proteins/administration & dosage , Base Sequence , Complementary DNA/genetics , Endotoxins/administration & dosage , Gene Expression Profiling , Hemolysin Proteins/administration & dosage , Larva/drug effects , Larva/metabolism , Microarray Analysis , Molecular Sequence Data , DNA Sequence Analysis , Tenebrio/genetics , Time Factors
ABSTRACT
Tuberous sclerosis complex (TSC) is an often severe neurocutaneous syndrome. Cortical tubers are the predominant neuropathological finding in TSC, and their number and location have been shown to correlate roughly with the severity of neurologic features in TSC. Past studies have shown that genomic deletion events in TSC1 or TSC2 are very rare in tubers, and suggested the potential involvement of the MAPK pathway in their pathogenesis. We used deep sequencing to assess all coding exons of TSC1 and TSC2, and the activating mutation hot spots within KRAS, in 46 tubers from TSC patients. Germline heterozygous mutations were identified in 81% of tubers. The same secondary mutation in TSC2 was identified in six tuber samples from one individual. Further study showed that this second hit mutation was widely distributed in the cortex from one cerebral hemisphere of this individual at frequencies up to 10%. No other secondary mutations were found in the other 40 tubers analyzed. These data indicate that small second hit mutations in any of these three genes are very rare in TSC tubers. However, in one TSC individual, a second hit TSC2 point mutation occurred early during brain development, and likely contributed to tuber formation.
Subjects
Mutation/genetics , Proto-Oncogene Proteins/genetics , Tuberous Sclerosis/genetics , Tuberous Sclerosis/pathology , Tumor Suppressor Proteins/genetics , ras Proteins/genetics , Brain/pathology , Germ-Line Mutation , High-Throughput Nucleotide Sequencing/methods , Humans , Immunoprecipitation/methods , Mass Spectrometry/methods , Mitogen-Activated Protein Kinase Kinases/genetics , Proto-Oncogene Proteins p21(ras) , Sequence Deletion , Signal Transduction/physiology , Tuberous Sclerosis Complex 1 Protein , Tuberous Sclerosis Complex 2 Protein
ABSTRACT
Three-prime untranslated regions (3'UTRs) of metazoan messenger RNAs (mRNAs) contain numerous regulatory elements, yet remain largely uncharacterized. Using polyA capture, 3' rapid amplification of complementary DNA (cDNA) ends, full-length cDNAs, and RNA-seq, we defined approximately 26,000 distinct 3'UTRs in Caenorhabditis elegans for approximately 85% of the 18,328 experimentally supported protein-coding genes and revised approximately 40% of gene models. Alternative 3'UTR isoforms are frequent, often differentially expressed during development. Average 3'UTR length decreases with animal age. Surprisingly, no polyadenylation signal (PAS) was detected for 13% of polyadenylation sites, predominantly among shorter alternative isoforms. Trans-spliced (versus non-trans-spliced) mRNAs possess longer 3'UTRs and frequently contain no PAS or variant PAS. We identified conserved 3'UTR motifs, isoform-specific predicted microRNA target sites, and polyadenylation of most histone genes. Our data reveal a rich complexity of 3'UTRs, both genome-wide and throughout development.
Subjects
3' Untranslated Regions , Caenorhabditis elegans/genetics , Helminth Genes , Helminth RNA/genetics , Animals , Binding Sites , Caenorhabditis elegans/embryology , Caenorhabditis elegans/growth & development , Computational Biology , Conserved Sequence , Disorders of Sex Development , Developmental Gene Expression Regulation , Gene Library , Helminth Proteins/genetics , Histones/genetics , Male , MicroRNAs/metabolism , Operon , Poly A/metabolism , Polyadenylation , Messenger RNA/genetics , Trans-Splicing
ABSTRACT
Major histocompatibility complex (MHC) genetics dictate adaptive cellular immune responses, making robust MHC genotyping methods essential for studies of infectious disease, vaccine development and transplantation. Nonhuman primates provide essential preclinical models for these areas of biomedical research. Unfortunately, given the unparalleled complexity of macaque MHCs, existing methodologies are inadequate for MHC typing of these key model animals. Here we use pyrosequencing of complementary DNA-PCR amplicons as a general approach to determine comprehensive MHC class I genotypes in nonhuman primates. More than 500 unique MHC class I sequences were resolved by sequence-based typing of rhesus, cynomolgus and pig-tailed macaques, nearly half of which have not been reported previously. The remarkable sensitivity of this approach in macaques demonstrates that pyrosequencing is viable for ultra-high-throughput MHC genotyping of primates, including humans.