ABSTRACT
Algal lipids are important molecules for storing energy in algae and transferring energy through the marine food chain, and are potential raw materials for high-value nutraceuticals (e.g., omega-3 fatty acids) or biofuel production. However, how lipid biosynthesis is regulated is not well understood in many species, including Eutreptiella from the phylum Euglenozoa. Here, we characterized the fatty acid (FA) profile of a Eutreptiella species isolated from Long Island Sound, USA, using gas chromatography-tandem mass spectrometry (GC/MS/MS) and investigated the FA biosynthesis pathways by transcriptome sequencing. We discovered 24 types of FAs, including a relatively high proportion of long-chain unsaturated FAs. The abundances of C16, C18, and saturated FAs decreased when phosphate in the culture medium was depleted. Among the 24 FAs, docosahexaenoic acid (C22:6∆4,7,10,13,16,19) was the most abundant, suggesting that Eutreptiella sp. preferentially invests in the synthesis of long-chain polyunsaturated fatty acids (LC-PFAs). Further transcriptomic analysis revealed that Eutreptiella sp. likely synthesizes LC-PFAs via the ∆8 pathway and uses type I and II fatty acid synthases. Using RT-qPCR, we found that some of the lipid synthesis genes, such as β-ketoacyl-ACP reductase, fatty acid desaturase, acetyl-CoA carboxylase, acyl carrier protein, ∆8 desaturase, and acyl-ACP thioesterase, were more actively expressed during the light period, and two carbon fixation genes were up-regulated in the high-lipid illuminated cultures, suggesting a linkage between photosynthesis and lipid production. The lipid profile renders Eutreptiella sp. a nutritious prey and a valuable source of nutraceuticals, and the biosynthesis pathway documented here will be useful for future research and applications.
Subject(s)
Euglenozoa; Transcriptome; Fatty Acids; Fatty Acids, Unsaturated; Tandem Mass Spectrometry
ABSTRACT
Within-population genetic diversity is greatest within Africa, while between-population genetic diversity is directly proportional to geographic distance. The most divergent contemporary human populations include the click-speaking forager peoples of southern Africa, broadly defined as Khoesan. Both intra- (Bantu expansion) and inter-continental migration (European-driven colonization) have resulted in complex patterns of admixture between ancient geographically isolated Khoesan and more recently diverged populations. Using gender-specific analysis and almost 1 million autosomal markers, we determine the significance of estimated ancestral contributions that have shaped five contemporary southern African populations in a cohort of 103 individuals. Limited by lack of available data for homogenous Khoesan representation, we identify the Ju/'hoan (n = 19) as a distinct early diverging human lineage with little to no significant non-Khoesan contribution. In contrast to the Ju/'hoan, we identify ancient signatures of Khoesan and Bantu unions resulting in significant Khoesan- and Bantu-derived contributions to the Southern Bantu amaXhosa (n = 15) and Khoesan !Xun (n = 14), respectively. Our data further suggests that contemporary !Xun represent distinct Khoesan prehistories. Khoesan assimilation with European settlement at the most southern tip of Africa resulted in significant ancestral Khoesan contributions to the Coloured (n = 25) and Baster (n = 30) populations. The latter populations were further impacted by 170 years of East Indian slave trade and intra-continental migrations resulting in a complex pattern of genetic variation (admixture). The populations of southern Africa provide a unique opportunity to investigate the genomic variability from some of the oldest human lineages to the implications of complex admixture patterns including ancient and recently diverged human lineages.
Subject(s)
Black People/genetics; Genetic Variation; Genetics, Population; Genome, Human; Africa, Southern; Asian People/genetics; DNA, Mitochondrial; Female; Genotype; Humans; Male; Phylogeography; White People/genetics
ABSTRACT
The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.
Subject(s)
Databases, Protein; Protein Interaction Mapping; Proteins/metabolism; Periodicals as Topic; Protein Binding; Proteins/chemistry; Quality Control
ABSTRACT
Babesia bovis is an apicomplexan tick-transmitted pathogen of cattle imposing a global risk and severe constraints to livestock health and economic development. The complete genome sequence was undertaken to facilitate vaccine antigen discovery, and to allow for comparative analysis with the related apicomplexan hemoprotozoa Theileria parva and Plasmodium falciparum. At 8.2 Mbp, the B. bovis genome is similar in size to that of Theileria spp. Structural features of the B. bovis and T. parva genomes are remarkably similar, and extensive synteny is present despite several chromosomal rearrangements. In contrast, B. bovis and P. falciparum, which have similar clinical and pathological features, have major differences in genome size, chromosome number, and gene complement. Chromosomal synteny with P. falciparum is limited to microregions. The B. bovis genome sequence has allowed wide scale analyses of the polymorphic variant erythrocyte surface antigen protein (ves1 gene) family that, similar to the P. falciparum var genes, is postulated to play a role in cytoadhesion, sequestration, and immune evasion. The approximately 150 ves1 genes are found in clusters that are distributed throughout each chromosome, with an increased concentration adjacent to a physical gap on chromosome 1 that contains multiple ves1-like sequences. ves1 clusters are frequently linked to a novel family of variant genes termed smorfs that may themselves contribute to immune evasion, may play a role in variant erythrocyte surface antigen protein biology, or both. Initial expression analysis of ves1 and smorf genes indicates coincident transcription of multiple variants. B. bovis displays a limited metabolic potential, with numerous missing pathways, including two pathways previously described for the P. falciparum apicoplast. This reduced metabolic potential is reflected in the B. bovis apicoplast, which appears to have fewer nuclear genes targeted to it than other apicoplast containing organisms. 
Finally, comparative analyses have identified several novel vaccine candidates including a positional homolog of p67 and SPAG-1, Theileria sporozoite antigens targeted for vaccine development. The genome sequence provides a greater understanding of B. bovis metabolism and potential avenues for drug therapies and vaccine development.
Subject(s)
Babesia bovis/genetics; DNA, Protozoan/analysis; Genes, Protozoan; Plasmodium falciparum/genetics; Theileria parva/genetics; Animals; Antigens, Protozoan/immunology; Babesia bovis/immunology; Babesia bovis/metabolism; Babesiosis/parasitology; Base Sequence; Carrier Proteins/genetics; Carrier Proteins/immunology; Carrier Proteins/metabolism; Chromosomes; DNA, Complementary/analysis; Evolution, Molecular; Genomic Library; Molecular Sequence Data; Plasmodium falciparum/immunology; Plasmodium falciparum/metabolism; Protozoan Proteins/genetics; Protozoan Proteins/immunology; Protozoan Proteins/metabolism; Sequence Analysis, DNA; Species Specificity; Synteny; Theileria parva/immunology; Theileria parva/metabolism
Subject(s)
Biological Science Disciplines/methods; Biological Science Disciplines/trends; Computational Biology/trends; Databases, Factual/trends; Information Storage and Retrieval/trends; Internet/trends; Animals; Career Choice; Computational Biology/education; Computational Biology/methods; Databases, Factual/statistics & numerical data; Education, Graduate; Humans; Information Storage and Retrieval/methods; Internet/statistics & numerical data; Publishing/trends
ABSTRACT
The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full-length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the approximately 27 000 annotated protein coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations.
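The abstract does not spell out the PASA algorithm itself, so the following is only a minimal sketch of the preliminary idea it mentions: grouping transcript alignments that overlap on the same strand into clusters. The `Alignment` type, the `cluster_overlapping` function, and the inclusive coordinate convention are illustrative assumptions; the real tool additionally checks splice-site compatibility before building maximal assemblies, which is not attempted here.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one transcript alignment (inclusive coordinates)."""
    name: str
    strand: str  # "+" or "-"
    start: int
    end: int


def cluster_overlapping(alignments):
    """Group alignments that overlap on the same strand.

    This is plain interval clustering, not the PASA assembly step:
    intron/exon compatibility between alignments is ignored.
    """
    clusters = []
    for strand in ("+", "-"):
        same_strand = sorted(
            (a for a in alignments if a.strand == strand),
            key=lambda a: a.start,
        )
        current, current_end = [], None
        for aln in same_strand:
            if current and aln.start <= current_end:
                # Overlaps the running cluster: extend it.
                current.append(aln)
                current_end = max(current_end, aln.end)
            else:
                # Gap found: close the previous cluster and start a new one.
                if current:
                    clusters.append(current)
                current, current_end = [aln], aln.end
        if current:
            clusters.append(current)
    return clusters


example = [
    Alignment("est1", "+", 1, 100),
    Alignment("cdna1", "+", 50, 150),
    Alignment("est2", "+", 200, 300),
    Alignment("est3", "-", 60, 120),
]
cluster_names = [[a.name for a in c] for c in cluster_overlapping(example)]
```

In this toy input, `est1` and `cdna1` overlap on the plus strand and fall into one cluster, while `est2` and the minus-strand `est3` each form their own.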
Subject(s)
Arabidopsis/genetics; Genome, Plant; RNA, Plant/analysis; Sequence Alignment/methods; Software; Algorithms; Alternative Splicing; Arabidopsis/metabolism; DNA, Complementary/analysis; Expressed Sequence Tags; Introns; Plant Proteins/genetics; RNA, Plant/chemistry; Transcription, Genetic; Untranslated Regions
ABSTRACT
We report here the sequence of chromosome II from Trypanosoma brucei, the causative agent of African sleeping sickness. The 1.2-Mb pairs encode about 470 predicted genes organised in 17 directional clusters on either strand, the largest cluster of which has 92 genes lined up over a 284-kb region. An analysis of the GC skew reveals strand compositional asymmetries that coincide with the distribution of protein-coding genes, suggesting these asymmetries may be the result of transcription-coupled repair on coding versus non-coding strand. A 5-cM genetic map of the chromosome reveals recombinational 'hot' and 'cold' regions, the latter of which is predicted to include the putative centromere. One end of the chromosome consists of a 250-kb region almost exclusively composed of RHS (pseudo)genes that belong to a newly characterised multigene family containing a hot spot of insertion for retroelements. Interspersed with the RHS genes are a few copies of truncated RNA polymerase pseudogenes as well as expression site associated (pseudo)genes (ESAGs) 3 and 4, and 76 bp repeats. These features are reminiscent of a vestigial variant surface glycoprotein (VSG) gene expression site. The other end of the chromosome contains a 30-kb array of VSG genes, the majority of which are pseudogenes, suggesting that this region may be a site for modular de novo construction of VSG gene diversity during transposition/gene conversion events.
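The GC skew analysis mentioned above is commonly computed as (G − C)/(G + C) over windows along one strand; the abstract does not state the window size or software used, so the sketch below is only an illustration under assumed parameters. The function name `gc_skew` and the default `window` and `step` values are hypothetical choices.

```python
def gc_skew(seq: str, window: int = 1000, step: int = 1000) -> list[float]:
    """Return (G - C) / (G + C) for successive windows along seq.

    Windows without any G or C are reported as 0.0. A trailing fragment
    shorter than `window` is ignored.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


# Toy sequence: a G-rich window followed by a C-rich window.
toy_skews = gc_skew("GGGGCCCC", window=4, step=4)
```

A sign change in the windowed skew along a chromosome is the kind of strand asymmetry that can then be compared with the orientation of gene clusters.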
Subject(s)
Chromosomes/genetics; DNA, Protozoan/genetics; Trypanosoma brucei brucei/genetics; Animals; Antigens, Protozoan/genetics; Chromosome Mapping; DNA, Protozoan/chemistry; Gene Duplication; Genes, Protozoan/genetics; Molecular Sequence Data; Pseudogenes/genetics; Recombination, Genetic; Sequence Analysis, DNA
ABSTRACT
In the lysosome, glycosidases degrade glycolipids, glycoproteins, and oligosaccharides. Mutations in glycosidases cause disorders characterized by the deposition of undegraded carbohydrates. Schindler and Fabry diseases are caused by the incomplete degradation of carbohydrates with terminal alpha-N-acetylgalactosamine and alpha-galactose, respectively. Here we present the X-ray structure of alpha-N-acetylgalactosaminidase (alpha-NAGAL), the glycosidase that removes alpha-N-acetylgalactosamine, and the structure with bound ligand. The active site residues of alpha-NAGAL are conserved in the closely related enzyme alpha-galactosidase A (alpha-GAL). The structure demonstrates the catalytic mechanisms of both enzymes and reveals the structural basis of mutations causing Schindler and Fabry diseases. As alpha-NAGAL and alpha-GAL produce type O "universal donor" blood from type A and type B blood, the alpha-NAGAL structure will aid in the engineering of improved enzymes for blood conversion.
Subject(s)
Hexosaminidases; Lysosomal Storage Diseases/enzymology; Protein Structure, Tertiary; Acetylgalactosamine/metabolism; Amino Acid Sequence; Animals; Binding Sites; Catalysis; Chickens; Crystallography, X-Ray; Dimerization; Hexosaminidases/chemistry; Hexosaminidases/deficiency; Hexosaminidases/genetics; Hexosaminidases/metabolism; Humans; Ligands; Lysosomal Storage Diseases/genetics; Lysosomes/enzymology; Models, Molecular; Molecular Sequence Data; Molecular Structure; Mutation; Protein Structure, Secondary; Sequence Alignment; alpha-N-Acetylgalactosaminidase
ABSTRACT
BACKGROUND: Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications. RESULTS: Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5). CONCLUSION: Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms.
Subject(s)
Arabidopsis/classification; Arabidopsis/genetics; Computational Biology/methods; Genome, Plant/genetics; Sequence Analysis, Protein/methods; Writing; Alternative Splicing/genetics; Computational Biology/standards; Models, Genetic; Plant Proteins/classification; Plant Proteins/genetics
ABSTRACT
Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retro-transposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ~57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick-host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host 'questing', prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.
Subject(s)
Anaplasma phagocytophilum; Arachnid Vectors/genetics; Genome/genetics; Ixodes/genetics; Ligand-Gated Ion Channels/genetics; Animals; Gene Expression Profiling; Genomics; Lyme Disease/transmission; Oocytes; Xenopus laevis
ABSTRACT
Eutreptiella are an evolutionarily unique and ecologically important genus of microalgae, but they are poorly understood with regard to their genomic make-up and expression profiles. Through the analysis of the full-length cDNAs from a Eutreptiella species, we found a conserved 28-nt spliced leader sequence (Eut-SL, ACACUUUCUGAGUGUCUAUUUUUUUUCG) was trans-spliced to the mRNAs of Eutreptiella sp. Using a primer derived from Eut-SL, we constructed four cDNA libraries under contrasting physiological conditions for 454 pyrosequencing. Clustering analysis of the ~1.9×10^6 original reads (average length 382 bp) yielded 36,643 unique transcripts. Although only 28% of the transcripts matched documented genes, this fraction represents a functionally very diverse gene set, suggesting that SL trans-splicing is likely ubiquitous in this alga's transcriptome. The mRNAs of Eutreptiella sp. seemed to have short 5'-untranslated regions, estimated to be 21 nucleotides on average. Among the diverse biochemical pathways represented in the transcriptome we obtained, carbonic anhydrase and genes known to function in the C4 pathway and heterotrophic carbon fixation were found, posing a question whether Eutreptiella sp. employs multifaceted strategies to acquire and fix carbon efficiently. This first large-scale transcriptomic dataset for a euglenoid uncovers many potential novel genes and overall offers a valuable genetic resource for research on euglenoid algae.
Subject(s)
5' Untranslated Regions; Carbon Cycle/genetics; Euglenida/genetics; Microalgae/genetics; RNA, Spliced Leader/genetics; Trans-Splicing; Transcriptome; Base Sequence; Euglenida/classification; Euglenida/metabolism; Gene Expression Profiling; Gene Library; High-Throughput Nucleotide Sequencing; Microalgae/classification; Microalgae/metabolism; Molecular Sequence Data; Phylogeny; RNA, Spliced Leader/metabolism
ABSTRACT
BACKGROUND: Ichthyophthirius multifiliis, commonly known as Ich, is a highly pathogenic ciliate responsible for 'white spot', a disease causing significant economic losses to the global aquaculture industry. Options for disease control are extremely limited, and Ich's obligate parasitic lifestyle makes experimental studies challenging. Unlike most well-studied protozoan parasites, Ich belongs to a phylum composed primarily of free-living members. Indeed, it is closely related to the model organism Tetrahymena thermophila. Genomic studies represent a promising strategy to reduce the impact of this disease and to understand the evolutionary transition to parasitism. RESULTS: We report the sequencing, assembly and annotation of the Ich macronuclear genome. Compared with its free-living relative T. thermophila, the Ich genome is reduced approximately two-fold in length and gene density and three-fold in gene content. We analyzed in detail several gene classes with diverse functions in behavior, cellular function and host immunogenicity, including protein kinases, membrane transporters, proteases, surface antigens and cytoskeletal components and regulators. We also mapped by orthology Ich's metabolic pathways in comparison with other ciliates and a potential host organism, the zebrafish Danio rerio. CONCLUSIONS: Knowledge of the complete protein-coding and metabolic potential of Ich opens avenues for rational testing of therapeutic drugs that target functions essential to this parasite but not to its fish hosts. Also, a catalog of surface protein-encoding genes will facilitate development of more effective vaccines. The potential to use T. thermophila as a surrogate model offers promise toward controlling 'white spot' disease and understanding the adaptation to a parasitic lifestyle.
Subject(s)
Ciliophora Infections/prevention & control; Genomics/methods; Hymenostomatida/genetics; Life Cycle Stages; Zebrafish/parasitology; Animals; Antigens, Protozoan/genetics; Base Composition; Chromosome Mapping; DNA, Mitochondrial/genetics; DNA, Protozoan/genetics; Databases, Genetic; Genes, Protozoan; Genome Size; Host-Parasite Interactions; Hymenostomatida/classification; Hymenostomatida/growth & development; Hymenostomatida/pathogenicity; Ictaluridae/parasitology; Macronucleus/genetics; Membrane Transport Proteins/genetics; Metabolic Networks and Pathways; Mitochondria/enzymology; Mitochondria/genetics; Mitochondrial Proton-Translocating ATPases/genetics; Molecular Sequence Annotation; Phylogeny; Protein Kinases/classification; Protein Kinases/genetics; Protozoan Proteins/genetics; RNA, Protozoan/genetics; Zebrafish/genetics
ABSTRACT
Efforts to annotate the genomes of a wide variety of model organisms are currently carried out by sequencing centers, model organism databases and academic/institutional laboratories around the world. Different annotation methods and tools have been developed over time to meet the needs of biologists faced with the task of annotating biological data. While standardized methods are essential for consistent curation within each annotation group, methods and tools can differ between groups, especially when the groups are curating different organisms. Biocurators from several institutes met at the Third International Biocuration Conference in Berlin, Germany, April 2009 and hosted the 'Best Practices in Genome Annotation: Inference from Evidence' workshop to share their strategies, pipelines, standards and tools. This article documents the material presented in the workshop.
ABSTRACT
We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at approximately 1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of approximately 4 to 6 increase in average gene length and in sizes of intergenic regions relative to An. gambiae and Drosophila melanogaster. Nonetheless, chromosomal synteny is generally maintained among all three insects, although conservation of orthologous gene order is higher (by a factor of approximately 2) between the mosquito species than between either of them and the fruit fly. An increase in genes encoding odorant binding, cytochrome P450, and cuticle domains relative to An. gambiae suggests that members of these protein families underpin some of the biological differences between the two mosquito species.
Subject(s)
Aedes/genetics; Genome, Insect; Insect Vectors/genetics; Aedes/metabolism; Animals; Anopheles/genetics; Anopheles/metabolism; Arboviruses; Base Sequence; DNA Transposable Elements; Dengue/prevention & control; Dengue/transmission; Drosophila melanogaster/genetics; Female; Genes, Insect; Humans; Insect Proteins/genetics; Insect Vectors/metabolism; Male; Membrane Transport Proteins/genetics; Molecular Sequence Data; Multigene Family; Protein Structure, Tertiary/genetics; Sequence Analysis, DNA; Sex Characteristics; Sex Determination Processes; Species Specificity; Synteny; Transcription, Genetic; Yellow Fever/prevention & control; Yellow Fever/transmission
ABSTRACT
African trypanosomes cause human sleeping sickness and livestock trypanosomiasis in sub-Saharan Africa. We present the sequence and analysis of the 11 megabase-sized chromosomes of Trypanosoma brucei. The 26-megabase genome contains 9068 predicted genes, including approximately 900 pseudogenes and approximately 1700 T. brucei-specific genes. Large subtelomeric arrays contain an archive of 806 variant surface glycoprotein (VSG) genes used by the parasite to evade the mammalian immune system. Most VSG genes are pseudogenes, which may be used to generate expressed mosaic genes by ectopic recombination. Comparisons of the cytoskeleton and endocytic trafficking systems with those of humans and other eukaryotic organisms reveal major differences. A comparison of metabolic pathways encoded by the genomes of T. brucei, T. cruzi, and Leishmania major reveals the least overall metabolic capability in T. brucei and the greatest in L. major. Horizontal transfer of genes of bacterial origin has contributed to some of the metabolic differences in these parasites, and a number of novel potential drug targets have been identified.