Search | VHL Regional Portal

1.

Vertebrate-class-specific binding modes of the alphavirus receptor MXRA8.

Zimmerman, Ofer; Zimmerman, Maxwell I; Raju, Saravanan; Nelson, Christopher A; Errico, John M; Madden, Emily A; Holmes, Autumn C; Hassan, Ahmed O; VanBlargan, Laura A; Kim, Arthur S; Adams, Lucas J; Basore, Katherine; Whitener, Bradley M; Palakurty, Sathvik; Davis-Adams, Hannah G; Sun, Chengqun; Gilliland, Theron; Earnest, James T; Ma, Hongming; Ebel, Gregory D; Zmasek, Christian; Scheuermann, Richard H; Klimstra, William B; Fremont, Daved H; Diamond, Michael S.

Cell ; 186(22): 4818-4833.e25, 2023 10 26.

Article in English | MEDLINE | ID: mdl-37804831

ABSTRACT

MXRA8 is a receptor for chikungunya (CHIKV) and other arthritogenic alphaviruses with mammalian hosts. However, mammalian MXRA8 does not bind to alphaviruses that infect humans and have avian reservoirs. Here, we show that avian, but not mammalian, MXRA8 can act as a receptor for Sindbis, western equine encephalitis (WEEV), and related alphaviruses with avian reservoirs. Structural analysis of duck MXRA8 complexed with WEEV reveals an inverted binding mode compared with mammalian MXRA8 bound to CHIKV. Whereas both domains of mammalian MXRA8 bind CHIKV E1 and E2, only domain 1 of avian MXRA8 engages WEEV E1, and no appreciable contacts are made with WEEV E2. Using these results, we generated a chimeric avian-mammalian MXRA8 decoy-receptor that neutralizes infection of multiple alphaviruses from distinct antigenic groups in vitro and in vivo. Thus, different alphaviruses can bind MXRA8 encoded by different vertebrate classes with distinct engagement modes, which enables development of broad-spectrum inhibitors.

Subject(s)

Alphavirus , Animals , Humans , Chikungunya Fever , Chikungunya virus/chemistry , Mammals , Receptors, Virus/metabolism

2.

Introducing the Bacterial and Viral Bioinformatics Resource Center (BV-BRC): a resource combining PATRIC, IRD and ViPR.

Olson, Robert D; Assaf, Rida; Brettin, Thomas; Conrad, Neal; Cucinell, Clark; Davis, James J; Dempsey, Donald M; Dickerman, Allan; Dietrich, Emily M; Kenyon, Ronald W; Kuscuoglu, Mehmet; Lefkowitz, Elliot J; Lu, Jian; Machi, Dustin; Macken, Catherine; Mao, Chunhong; Niewiadomska, Anna; Nguyen, Marcus; Olsen, Gary J; Overbeek, Jamie C; Parrello, Bruce; Parrello, Victoria; Porter, Jacob S; Pusch, Gordon D; Shukla, Maulik; Singh, Indresh; Stewart, Lucy; Tan, Gene; Thomas, Chris; VanOeffelen, Margo; Vonstein, Veronika; Wallace, Zachary S; Warren, Andrew S; Wattam, Alice R; Xia, Fangfang; Yoo, Hyunseung; Zhang, Yun; Zmasek, Christian M; Scheuermann, Richard H; Stevens, Rick L.

Nucleic Acids Res ; 51(D1): D678-D689, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36350631

ABSTRACT

The National Institute of Allergy and Infectious Diseases (NIAID) established the Bioinformatics Resource Center (BRC) program to assist researchers with analyzing the growing body of genome sequence and other omics-related data. In this report, we describe the merger of the PAThosystems Resource Integration Center (PATRIC), the Influenza Research Database (IRD) and the Virus Pathogen Database and Analysis Resource (ViPR) BRCs to form the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) https://www.bv-brc.org/. The combined BV-BRC leverages the functionality of the bacterial and viral resources to provide a unified data model, enhanced web-based visualization and analysis tools, bioinformatics services, and a powerful suite of command line tools that benefit the bacterial and viral research communities.

Subject(s)

Genomics , Software , Viruses , Humans , Bacteria/genetics , Computational Biology , Databases, Genetic , Influenza, Human , Viruses/genetics

3.

Early detection of emerging SARS-CoV-2 variants of interest for experimental evaluation.

Wallace, Zachary S; Davis, James; Niewiadomska, Anna Maria; Olson, Robert D; Shukla, Maulik; Stevens, Rick; Zhang, Yun; Zmasek, Christian M; Scheuermann, Richard H.

Front Bioinform ; 2: 1020189, 2022.

Article in English | MEDLINE | ID: mdl-36353215

ABSTRACT

Since the beginning of the COVID-19 pandemic, SARS-CoV-2 has demonstrated its ability to rapidly and continuously evolve, leading to the emergence of thousands of different sequence variants, many with distinctive phenotypic properties. Fortunately, the broad application of next generation sequencing (NGS) across the globe has produced a wealth of SARS-CoV-2 genome sequences, offering a comprehensive picture of how this virus is evolving so that accurate diagnostics, reliable therapeutics, and prophylactic vaccines against COVID-19 can be developed and maintained. The millions of SARS-CoV-2 sequences deposited into genomic sequencing databases, including GenBank, BV-BRC, and GISAID, are annotated with the dates and geographic locations of sample collection, and can be aligned to and compared with the Wuhan-Hu-1 reference genome to extract their constellation of nucleotide and amino acid substitutions. By aggregating these data into concise datasets, the spread of variants through space and time can be assessed. Variant tracking efforts have initially focused on the Spike protein due to its critical role in viral tropism and antibody neutralization. To identify emerging variants of concern as early as possible, we developed a computational pipeline to process the genomic data and assign risk scores based on both epidemiological and functional parameters. Epidemiological dynamics are used to identify variants exhibiting substantial growth over time and spread across geographical regions. Experimental data that quantify Spike protein regions targeted by adaptive immunity and critical for other virus characteristics are used to predict variants with consequential immunogenic and pathogenic impacts. The growth assessment and functional impact scores are combined to produce a Composite Score for any set of Spike substitutions detected. With this systematic method to routinely score and rank emerging variants, we have established an approach to identify threatening variants early and prioritize them for experimental evaluation.
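
As a rough illustration of the scoring idea in this abstract, the sketch below combines a growth score and a functional impact score into a composite score and ranks variants by it. The class and function names, the choice of a plain sum, and the example numbers are all assumptions made for illustration; the abstract does not state how the two scores are computed or combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantScore:
    """Scores for one set of Spike substitutions (all fields hypothetical)."""
    substitutions: tuple   # e.g. ("N501Y", "E484K")
    growth: float          # epidemiological growth assessment (assumed scale)
    functional: float      # functional impact score (assumed scale)

    @property
    def composite(self) -> float:
        # Assumption: the composite is a plain sum of the two scores.
        # The published pipeline may weight or transform them differently.
        return self.growth + self.functional


def rank_variants(variants):
    """Return the variants ordered from highest to lowest composite score."""
    return sorted(variants, key=lambda v: v.composite, reverse=True)


# Made-up example values, not data from the paper.
example = [
    VariantScore(("D614G",), growth=1.0, functional=0.5),
    VariantScore(("N501Y", "E484K"), growth=2.0, functional=3.0),
    VariantScore(("A222V",), growth=1.5, functional=0.0),
]
ranked = rank_variants(example)
```

Ranking by a single number keeps prioritization simple; any real use would need the scoring definitions from the paper itself.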

4.

Genomic evolution of the Coronaviridae family.

Zmasek, Christian M; Lefkowitz, Elliot J; Niewiadomska, Anna; Scheuermann, Richard H.

Virology ; 570: 123-133, 2022 05.

Article in English | MEDLINE | ID: mdl-35398776

ABSTRACT

The current outbreak of coronavirus disease-2019 (COVID-19) caused by SARS-CoV-2 poses unparalleled challenges to global public health. SARS-CoV-2 is a Betacoronavirus, one of four genera belonging to the Coronaviridae subfamily Orthocoronavirinae. Coronaviridae, in turn, are members of the order Nidovirales, a group of enveloped, positive-stranded RNA viruses. Here we present a systematic phylogenetic and evolutionary study based on protein domain architecture, encompassing the entire proteomes of all Orthocoronavirinae, as well as other Nidovirales. This analysis has revealed that the genomic evolution of Nidovirales is associated with extensive gains and losses of protein domains. In Orthocoronavirinae, the sections of the genomes that show the largest divergence in protein domains are found in the proteins encoded in the amino-terminal end of the polyprotein (PP1ab), the spike protein (S), and many of the accessory proteins. The diversity among the accessory proteins is particularly striking, as each subgenus possesses a set of accessory proteins that is almost entirely specific to that subgenus. The only notable exception to this is ORF3b, which is present and orthologous over all Alphacoronaviruses. In contrast, the membrane protein (M), envelope small membrane protein (E), nucleoprotein (N), as well as proteins encoded in the central and carboxy-terminal end of PP1ab (such as the 3C-like protease, RNA-dependent RNA polymerase, and Helicase) show stable domain architectures across all Orthocoronavirinae. This comprehensive analysis of the Coronaviridae domain architecture has important implications for efforts to develop broadly cross-protective coronavirus vaccines.

Subject(s)

COVID-19 , Coronaviridae , Nidovirales , Coronaviridae/genetics , Evolution, Molecular , Humans , Membrane Proteins/genetics , Nidovirales/genetics , Phylogeny , SARS-CoV-2/genetics

5.

Classification of human Herpesviridae proteins using Domain-architecture Aware Inference of Orthologs (DAIO).

Zmasek, Christian M; Knipe, David M; Pellett, Philip E; Scheuermann, Richard H.

Virology ; 529: 29-42, 2019 03.

Article in English | MEDLINE | ID: mdl-30660046

ABSTRACT

We developed a computational approach called Domain-architecture Aware Inference of Orthologs (DAIO) for the analysis of protein orthology by combining phylogenetic and protein domain-architecture information. Using DAIO, we performed a systematic study of the proteomes of all human Herpesviridae species to define Strict Ortholog Groups (SOGs). In addition to assessing the taxonomic distribution for each protein based on sequence similarity, we performed a protein domain-architecture analysis for every protein family and computationally inferred gene duplication events. While many herpesvirus proteins have evolved without any detectable gene duplications or domain rearrangements, numerous herpesvirus protein families do exhibit complex evolutionary histories. Some proteins acquired additional domains (e.g., DNA polymerase), whereas others show a combination of domain acquisition and gene duplication (e.g., betaherpesvirus US22 family), with possible functional implications. This novel classification system of SOGs for human Herpesviridae proteins is available through the Virus Pathogen Resource (ViPR, www.viprbrc.org).

Subject(s)

Herpesviridae/genetics , Herpesviridae/metabolism , Phylogeny , Viral Proteins/chemistry , Viral Proteins/genetics , Capsid Proteins/genetics , Capsid Proteins/metabolism , Gene Duplication , Gene Expression Regulation, Viral , Peptide Hydrolases , Protein Domains , Uracil-DNA Glycosidase/chemistry , Uracil-DNA Glycosidase/genetics , Uracil-DNA Glycosidase/metabolism

6.

Hepatitis C Virus Database and Bioinformatics Analysis Tools in the Virus Pathogen Resource (ViPR).

Zhang, Yun; Zmasek, Christian; Sun, Guangyu; Larsen, Christopher N; Scheuermann, Richard H.

Methods Mol Biol ; 1911: 47-69, 2019.

Article in English | MEDLINE | ID: mdl-30593617

ABSTRACT

The Virus Pathogen Resource (ViPR; www.viprbrc.org) is a US National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center providing bioinformatics support for major human viral pathogens. The hepatitis C virus (HCV) portal of ViPR facilitates basic research and development of diagnostics and therapeutics for HCV, by providing a comprehensive collection of HCV-related data integrated from various sources, a growing suite of analysis and visualization tools for data mining and hypothesis generation, and personal Workbench spaces for data storage and sharing. This chapter introduces the data and functionality provided by the ViPR HCV portal. It describes example workflows for (1) searching HCV genome and protein sequences, (2) conducting phylogenetic analysis, and (3) analyzing sequence variations using pattern search for amino acid substitutions in proteins, single nucleotide variation calculation, metadata-driven comparison, and sequence feature variant type analysis. All data and tools are freely available via the ViPR HCV portal at https://www.viprbrc.org/brc/home.spg?decorator=flavi_hcv.

Subject(s)

Genomics/methods , Hepacivirus/genetics , Software , Amino Acid Sequence , Amino Acid Substitution , Antiviral Agents/pharmacology , Data Mining , Databases, Genetic , Drug Resistance, Viral , Genome, Viral , Genotype , Hepacivirus/chemistry , Hepacivirus/drug effects , Hepatitis C/drug therapy , Hepatitis C/virology , Humans , Phylogeny , Polymorphism, Single Nucleotide , Viral Proteins/chemistry , Viral Proteins/genetics , Workflow

7.

Influenza Research Database: An integrated bioinformatics resource for influenza virus research.

Zhang, Yun; Aevermann, Brian D; Anderson, Tavis K; Burke, David F; Dauphin, Gwenaelle; Gu, Zhiping; He, Sherry; Kumar, Sanjeev; Larsen, Christopher N; Lee, Alexandra J; Li, Xiaomei; Macken, Catherine; Mahaffey, Colin; Pickett, Brett E; Reardon, Brian; Smith, Thomas; Stewart, Lucy; Suloway, Christian; Sun, Guangyu; Tong, Lei; Vincent, Amy L; Walters, Bryan; Zaremba, Sam; Zhao, Hongtao; Zhou, Liwei; Zmasek, Christian; Klem, Edward B; Scheuermann, Richard H.

Nucleic Acids Res ; 45(D1): D466-D474, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27679478

ABSTRACT

The Influenza Research Database (IRD) is a U.S. National Institute of Allergy and Infectious Diseases (NIAID)-sponsored Bioinformatics Resource Center dedicated to providing bioinformatics support for influenza virus research. IRD facilitates the research and development of vaccines, diagnostics and therapeutics against influenza virus by providing a comprehensive collection of influenza-related data integrated from various sources, a growing suite of analysis and visualization tools for data mining and hypothesis generation, personal workbench spaces for data storage and sharing, and active user community support. Here, we describe the recent improvements in IRD including the use of cloud and high performance computing resources, analysis and visualization of user-provided sequence data with associated metadata, predictions of novel variant proteins, annotations of phenotype-associated sequence markers and their predicted phenotypic effects, hemagglutinin (HA) clade classifications, an automated tool for HA subtype numbering conversion, linkouts to disease event data and the addition of host factor and antiviral drug components. All data and tools are freely available without restriction from the IRD website at https://www.fludb.org.

Subject(s)

Computational Biology/methods , Databases, Factual , Influenza A virus , Research , Software , Influenza A virus/classification , Influenza A virus/physiology , Molecular Typing/methods , Phenotype , Phylogeny , Viral Proteins/genetics , Virulence

8.

Phylogenomic analysis of glycogen branching and debranching enzymatic duo.

Zmasek, Christian M; Godzik, Adam.

BMC Evol Biol ; 14: 183, 2014 Aug 23.

Article in English | MEDLINE | ID: mdl-25148856

ABSTRACT

BACKGROUND: Branched polymers of glucose are universally used for energy storage in cells, taking the form of glycogen in animals, fungi, Bacteria, and Archaea, and of amylopectin in plants. Some enzymes involved in glycogen and amylopectin metabolism are similarly conserved in all forms of life, but some, interestingly, are not. In this paper we focus on the phylogeny of glycogen branching and debranching enzymes, respectively involved in introducing and removing the α(1-6) bonds in glucose polymers, bonds that provide the unique branching structure to glucose polymers. RESULTS: We performed a large-scale phylogenomic analysis of branching and debranching enzymes in over 400 completely sequenced genomes, including more than 200 from eukaryotes. We show that branching and debranching enzymes can be found in all kingdoms of life, including all major groups of eukaryotes, and thus were likely to have been present in the last universal common ancestor (LUCA) but have been lost in seemingly random fashion in numerous single-celled eukaryotes. We also show how animal branching and debranching enzymes evolved from their LUCA ancestors by acquiring additional domains. Furthermore, we show that enzymes commonly perceived as orthologous, such as human branching enzyme GBE1 and E. coli branching enzyme GlgB, are in fact related by a gene duplication and consequently paralogous. CONCLUSIONS: Despite being usually associated with animal liver glycogen and plant starch, energy storage in the form of branched glucose polymers is clearly an ancient process and has probably been present in the last universal common ancestor of all present life. The evolution of the enzymes enabling this form of energy storage is more complex than previously thought and illustrates the need for explicit phylogenomic analysis in the study of even seemingly "simple" metabolic enzymes. Patterns of conservation in the evolution of the glycogen/starch branching and debranching enzymes hint at some as yet unknown mechanisms, as mutations disrupting these patterns lead to a variety of genetic diseases in humans and other mammals.

Subject(s)

1,4-alpha-Glucan Branching Enzyme/genetics , Evolution, Molecular , Phylogeny , Animals , Bacteria/genetics , Eukaryota/classification , Eukaryota/enzymology , Eukaryota/metabolism , Glycogen/metabolism , Glycogen Storage Disease/enzymology , Humans , Plants/enzymology , Plants/genetics , Starch/metabolism

9.

Structural genomics analysis of uncharacterized protein families overrepresented in human gut bacteria identifies a novel glycoside hydrolase.

Sheydina, Anna; Eberhardt, Ruth Y; Rigden, Daniel J; Chang, Yuanyuan; Li, Zhanwen; Zmasek, Christian C; Axelrod, Herbert L; Godzik, Adam.

BMC Bioinformatics ; 15: 112, 2014 Apr 17.

Article in English | MEDLINE | ID: mdl-24742328

ABSTRACT

BACKGROUND: Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism. RESULTS: BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one at the N-terminus and one at the C-terminus. A PSI-BLAST search found over 150 full length and over 90 half size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture, and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases. The C-terminal domain has a beta-sandwich fold typically found in C-terminal domains of other glycosyl hydrolases; however, these domains are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationship and their possible functional implications. CONCLUSIONS: Structural and sequence analyses of the BT_1012 protein identify it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. Based on this, we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain, respectively.

Subject(s)

Bacterial Proteins/chemistry , Glycoside Hydrolases/chemistry , Amino Acid Sequence , Bacterial Proteins/genetics , Bacteroides/enzymology , Computational Biology , Gastrointestinal Tract/microbiology , Genomics , Glycoside Hydrolases/genetics , Humans , Protein Structure, Tertiary

10.

Divergent evolution of protein conformational dynamics in dihydrofolate reductase.

Bhabha, Gira; Ekiert, Damian C; Jennewein, Madeleine; Zmasek, Christian M; Tuttle, Lisa M; Kroon, Gerard; Dyson, H Jane; Godzik, Adam; Wilson, Ian A; Wright, Peter E.

Nat Struct Mol Biol ; 20(11): 1243-9, 2013 Nov.

Article in English | MEDLINE | ID: mdl-24077226

ABSTRACT

Molecular evolution is driven by mutations, which may affect the fitness of an organism and are then subject to natural selection or genetic drift. Analysis of primary protein sequences and tertiary structures has yielded valuable insights into the evolution of protein function, but little is known about the evolution of functional mechanisms, protein dynamics and conformational plasticity essential for activity. We characterized the atomic-level motions across divergent members of the dihydrofolate reductase (DHFR) family. Despite structural similarity, Escherichia coli and human DHFRs use different dynamic mechanisms to perform the same function, and human DHFR cannot complement DHFR-deficient E. coli cells. Identification of the primary-sequence determinants of flexibility in DHFRs from several species allowed us to propose a likely scenario for the evolution of functionally important DHFR dynamics following a pattern of divergent evolution that is tuned by cellular environment.

Subject(s)

Evolution, Molecular , Tetrahydrofolate Dehydrogenase/chemistry , Tetrahydrofolate Dehydrogenase/genetics , Amino Acid Sequence , Crystallography, X-Ray , Escherichia coli/enzymology , Genetic Complementation Test , Genetic Drift , Humans , Models, Molecular , Molecular Dynamics Simulation , Molecular Sequence Data , Mutation , Protein Conformation , Selection, Genetic , Sequence Alignment

11.

aLeaves facilitates on-demand exploration of metazoan gene family trees on MAFFT sequence alignment server with enhanced interactivity.

Kuraku, Shigehiro; Zmasek, Christian M; Nishimura, Osamu; Katoh, Kazutaka.

Nucleic Acids Res ; 41(Web Server issue): W22-8, 2013 Jul.

Article in English | MEDLINE | ID: mdl-23677614

ABSTRACT

We report a new web server, aLeaves (http://aleaves.cdb.riken.jp/), for homologue collection from diverse animal genomes. In molecular comparative studies involving multiple species, orthology identification is the basis on which most subsequent biological analyses rely. It can be achieved most accurately by explicit phylogenetic inference. More and more species are subjected to large-scale sequencing, but the resultant resources are scattered in independent project-based, and multi-species, but separate, web sites. This complicates data access and is becoming a serious barrier to the comprehensiveness of molecular phylogenetic analysis. aLeaves, launched to overcome this difficulty, collects sequences similar to an input query sequence from various data sources. The collected sequences can be passed on to the MAFFT sequence alignment server (http://mafft.cbrc.jp/alignment/server/), which has been significantly improved in interactivity. This update enables switching between (i) sequence selection using the Archaeopteryx tree viewer, (ii) multiple sequence alignment and (iii) tree inference. This can be performed as a loop until one reaches a sensible data set, which minimizes redundancy for better visibility and handling in phylogenetic inference while covering relevant taxa. The workflow achieved by the seamless link between aLeaves and MAFFT provides a convenient online platform to address various questions in zoology and evolutionary biology.

Subject(s)

Phylogeny , Sequence Alignment/methods , Software , Animals , CCCTC-Binding Factor , Genes , Genes, Homeobox , Genome , Humans , Internet , Proteins/genetics , Repressor Proteins/chemistry , Sequence Analysis, Protein , Vertebrates/genetics , Wnt Proteins/chemistry

12.

Phylotastic! Making tree-of-life knowledge accessible, reusable and convenient.

Stoltzfus, Arlin; Lapp, Hilmar; Matasci, Naim; Deus, Helena; Sidlauskas, Brian; Zmasek, Christian M; Vaidya, Gaurav; Pontelli, Enrico; Cranston, Karen; Vos, Rutger; Webb, Campbell O; Harmon, Luke J; Pirrung, Megan; O'Meara, Brian; Pennell, Matthew W; Mirarab, Siavash; Rosenberg, Michael S; Balhoff, James P; Bik, Holly M; Heath, Tracy A; Midford, Peter E; Brown, Joseph W; McTavish, Emily Jane; Sukumaran, Jeet; Westneat, Mark; Alfaro, Michael E; Steele, Aaron; Jordan, Greg.

BMC Bioinformatics ; 14: 158, 2013 May 13.

Article in English | MEDLINE | ID: mdl-23668630

ABSTRACT

BACKGROUND: Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces. RESULTS: With the aim of building such a "phylotastic" system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website (http://www.phylotastic.org), and a server image. CONCLUSIONS: Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.

Subject(s)

Phylogeny , Software , Internet

13.

Evolution of the animal apoptosis network.

Zmasek, Christian M; Godzik, Adam.

Cold Spring Harb Perspect Biol ; 5(3): a008649, 2013 Mar 01.

Article in English | MEDLINE | ID: mdl-23457257

ABSTRACT

The number of available eukaryotic genomes has expanded to the point where we can evaluate the complete evolutionary history of many cellular processes. Such analyses for the apoptosis regulatory networks suggest that this network already existed in the ancestor of the entire animal kingdom (Metazoa) in a form more complex than in some popular animal model organisms. This supports the growing realization that regulatory networks do not necessarily evolve from simple to complex and that the relative simplicity of these networks in nematodes and insects does not represent an ancestral state, but is the result of secondary simplifications. Network evolution is not a process of monotonous increase in complexity, but a dynamic process that includes lineage-specific gene losses and expansions, protein domain reshuffling, and emergence/reemergence of similar protein architectures by parallel evolution. Studying the evolution of such networks is a challenging yet interesting subject for research and investigation, and such studies on the apoptosis networks provide us with interesting hints of how these networks, critical in so many human diseases, have developed.

Subject(s)

Apoptosis/physiology , Biological Evolution , Immunity, Innate/physiology , Phylogeny , Protein Structure, Tertiary/physiology , Animals , Apoptosis/genetics , Apoptotic Protease-Activating Factor 1/genetics , Apoptotic Protease-Activating Factor 1/physiology , Caspases/genetics , Caspases/physiology , Cyclin D1/genetics , Cyclin D1/physiology , Humans , Protein Structure, Tertiary/genetics , Species Specificity , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/physiology

14.

This Déjà vu feeling--analysis of multidomain protein evolution in eukaryotic genomes.

Zmasek, Christian M; Godzik, Adam.

PLoS Comput Biol ; 8(11): e1002701, 2012.

Article in English | MEDLINE | ID: mdl-23166479

ABSTRACT

Evolutionary innovation in eukaryotes and especially animals is at least partially driven by genome rearrangements and the resulting emergence of proteins with new domain combinations, and thus potentially novel functionality. Given the random nature of such rearrangements, one could expect that proteins with particularly useful multidomain combinations may have been rediscovered multiple times by parallel evolution. However, existing reports suggest a minimal role of this phenomenon in the overall evolution of eukaryotic proteomes. We assembled a collection of 172 complete eukaryotic genomes that is not only the largest, but also the most phylogenetically complete set of genomes analyzed so far. By employing a maximum parsimony approach to compare repertoires of Pfam domains and their combinations, we show that independent evolution of domain combinations is significantly more prevalent than previously thought. Our results indicate that about 25% of all currently observed domain combinations have evolved multiple times. Interestingly, this percentage is even higher for sets of domain combinations in individual species, with, for instance, 70% of the domain combinations found in the human genome having evolved independently at least once in other species. We also show that previous, much lower estimates of this rate are most likely due to the small number and biased phylogenetic distribution of the genomes analyzed. The process of independent emergence of identical domain combinations is widespread and not limited to domains with specific functional categories. Besides data from large-scale analyses, we also present individual examples of independent domain combination evolution. The surprisingly large contribution of parallel evolution to the development of the domain combination repertoire in extant genomes has profound consequences for our understanding of the evolution of pathways and cellular processes in eukaryotes and for comparative functional genomics.

Subject(s)

Eukaryota/genetics , Evolution, Molecular , Protein Structure, Tertiary/genetics , Proteins/genetics , Animals , Genome , Models, Biological , Phylogeny , Proteins/chemistry
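
The abstract above mentions a maximum parsimony approach for comparing domain combinations across genomes without giving details. As a minimal sketch of the general idea only, the code below applies Fitch's small-parsimony algorithm to the presence (1) or absence (0) of one domain combination on a rooted binary species tree and returns the minimum number of state changes. The tree encoding, function name, and toy species set are assumptions for illustration and are not taken from the paper, whose actual method may differ (for example, in how gains and losses are weighted).

```python
def fitch_min_changes(tree, leaf_states):
    """Minimum number of presence/absence changes on a rooted binary tree.

    tree: nested 2-tuples of leaf names, e.g. (("A", "B"), "C")
    leaf_states: dict mapping each leaf name to 0 (absent) or 1 (present)
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            # A leaf contributes its observed state.
            return {leaf_states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            # Children agree on at least one state: no change needed here.
            return common
        # Children disagree: one change is required on a branch below.
        changes += 1
        return left | right

    visit(tree)
    return changes


# Toy example (hypothetical data): the combination is present in two
# distantly related leaves and absent everywhere else.
toy_tree = (("human", "mouse"), ("fly", ("yeast", "plant")))
toy_states = {"human": 1, "mouse": 0, "fly": 0, "yeast": 1, "plant": 0}
n_changes = fitch_min_changes(toy_tree, toy_states)
```

A count above one for a combination seen in non-sister lineages is compatible with independent emergence, although Fitch parsimony alone does not distinguish gains from losses.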

15.

A mighty small heart: the cardiac proteome of adult Drosophila melanogaster.

Cammarato, Anthony; Ahrens, Christian H; Alayari, Nakissa N; Qeli, Ermir; Rucker, Jasma; Reedy, Mary C; Zmasek, Christian M; Gucek, Marjan; Cole, Robert N; Van Eyk, Jennifer E; Bodmer, Rolf; O'Rourke, Brian; Bernstein, Sanford I; Foster, D Brian.

PLoS One ; 6(4): e18497, 2011 Apr 25.

Article in English | MEDLINE | ID: mdl-21541028

ABSTRACT

Drosophila melanogaster is emerging as a powerful model system for the study of cardiac disease. Establishing peptide and protein maps of the Drosophila heart is central to implementation of protein network studies that will allow us to assess the hallmarks of Drosophila heart pathogenesis and gauge the degree of conservation with human disease mechanisms on a systems level. Using a gel-LC-MS/MS approach, we identified 1228 protein clusters from 145 dissected adult fly hearts. Contractile, cytostructural and mitochondrial proteins were most abundant, consistent with electron micrographs of the Drosophila cardiac tube. Functional/Ontological enrichment analysis further showed that proteins involved in glycolysis, Ca²⁺-binding, redox, and G-protein signaling, among other processes, are also over-represented. Comparison with a mouse heart proteome revealed conservation at the level of molecular function, biological processes and cellular components. The subsisting peptidome encompassed 5169 distinct heart-associated peptides, of which 1293 (25%) had not been identified in a recent Drosophila peptide compendium. PeptideClassifier analysis was further used to map peptides to specific gene-models. 1872 peptides provide valuable information about protein isoform groups whereas a further 3112 uniquely identify specific protein isoforms and may be used as a heart-associated peptide resource for quantitative proteomic approaches based on multiple-reaction monitoring. In summary, identification of excitation-contraction protein landmarks, orthologues of proteins associated with cardiovascular defects, and conservation of protein ontologies, provides testimony to the heart-like character of the Drosophila cardiac tube and to the utility of proteomics as a complement to the power of genetics in this growing model of human heart disease.

Subject(s)

Aging/metabolism , Drosophila melanogaster/metabolism , Myocardium/metabolism , Proteome/metabolism , Animals , Drosophila Proteins/chemistry , Drosophila Proteins/classification , Drosophila Proteins/metabolism , Drosophila melanogaster/cytology , Drosophila melanogaster/ultrastructure , Humans , Mass Spectrometry , Mice , Molecular Sequence Annotation , Myocardium/cytology , Myocardium/ultrastructure , Peptides/metabolism , Species Specificity

16.

Strong functional patterns in the evolution of eukaryotic genomes revealed by the reconstruction of ancestral protein domain repertoires.

Zmasek, Christian M; Godzik, Adam.

Genome Biol ; 12(1): R4, 2011.

Article in English | MEDLINE | ID: mdl-21241503

ABSTRACT

BACKGROUND: Genome size and complexity, as measured by the number of genes or protein domains, is remarkably similar in most extant eukaryotes and generally exhibits no correlation with their morphological complexity. Underlying trends in the evolution of the functional content and capabilities of different eukaryotic genomes might be hidden by simultaneous gains and losses of genes. RESULTS: We reconstructed the domain repertoires of putative ancestral species at major divergence points, including the last eukaryotic common ancestor (LECA). We show that, surprisingly, during eukaryotic evolution domain losses in general outnumber domain gains. Only at the base of the animal and the vertebrate sub-trees do domain gains outnumber domain losses. The observed gain/loss balance has a distinct functional bias, most strikingly seen during animal evolution, where most of the gains represent domains involved in regulation and most of the losses represent domains with metabolic functions. This trend is so consistent that clustering of genomes according to their functional profiles results in an organization similar to the tree of life. Furthermore, our results indicate that metabolic functions lost during animal evolution are likely being replaced by the metabolic capabilities of symbiotic organisms such as gut microbes. CONCLUSIONS: While protein domain gains and losses are common throughout eukaryote evolution, losses oftentimes outweigh gains and lead to significant differences in functional profiles. Results presented here provide additional arguments for a complex last eukaryotic common ancestor, but also show a general trend of losses in metabolic capabilities and gain in regulatory complexity during the rise of animals.

Subject(s)

Eukaryota/genetics , Evolution, Molecular , Genome , Protein Structure, Tertiary , Eukaryota/classification , Eukaryota/metabolism , Humans , Intestinal Mucosa/metabolism , Intestines/microbiology , Metagenome , Models, Genetic

17.

TOPSAN: a dynamic web database for structural genomics.

Ellrott, Kyle; Zmasek, Christian M; Weekes, Dana; Sri Krishna, S; Bakolitsa, Constantina; Godzik, Adam; Wooley, John.

Nucleic Acids Res ; 39(Database issue): D494-6, 2011 Jan.

Article in English | MEDLINE | ID: mdl-20961957

ABSTRACT

The Open Protein Structure Annotation Network (TOPSAN) is a web-based collaboration platform for exploring and annotating structures determined by structural genomics efforts. Characterization of those structures presents a challenge since the majority of the proteins themselves have not yet been characterized. Responding to this challenge, the TOPSAN platform facilitates collaborative annotation and investigation via a user-friendly web-based interface pre-populated with automatically generated information. Semantic web technologies expand and enrich TOPSAN's content through links to larger sets of related databases and thus enable data integration from disparate sources and data mining via conventional query languages. TOPSAN can be found at http://www.topsan.org.

Subject(s)

Databases, Protein , Protein Conformation , Genomics , Proteins/chemistry , Proteins/genetics , User-Computer Interface

18.

GreenPhylDB v2.0: comparative and functional genomics in plants.

Rouard, Mathieu; Guignon, Valentin; Aluome, Christelle; Laporte, Marie-Angélique; Droc, Gaëtan; Walde, Christian; Zmasek, Christian M; Périn, Christophe; Conte, Matthieu G.

Nucleic Acids Res ; 39(Database issue): D1095-102, 2011 Jan.

Article in English | MEDLINE | ID: mdl-20864446

ABSTRACT

GreenPhylDB is a database designed for comparative and functional genomics based on complete genomes. Version 2 now contains sixteen full genomes of members of the Plantae kingdom, ranging from algae to angiosperms, automatically clustered into gene families. Gene families are manually annotated and then analyzed phylogenetically in order to elucidate orthologous and paralogous relationships. The database offers various lists of gene families, including plant-, phylum- and species-specific gene families. For each gene cluster or gene family, easy access to gene composition, protein domains, publications, external links and orthologous gene predictions is provided. Web interfaces have been further developed to improve the navigation through information related to gene families. New analysis tools are also available, such as a gene family ontology browser that facilitates exploration. GreenPhylDB is a component of the South Green Bioinformatics Platform (http://southgreen.cirad.fr/) and is accessible at http://greenphyl.cirad.fr. It enables comparative genomics in a broad taxonomy context to enhance the understanding of evolutionary processes and thus tends to speed up gene discovery.

Subject(s)

Databases, Genetic , Genome, Plant , Genes, Plant , Genomics , Molecular Sequence Annotation , Phylogeny , Plant Proteins/chemistry , Plant Proteins/genetics , Plants/classification , Plants/genetics , Software

19.

TIR domain-containing adaptor SARM is a late addition to the ongoing microbe-host dialog.

Zhang, Qing; Zmasek, Christian M; Cai, Xiaohui; Godzik, Adam.

Dev Comp Immunol ; 35(4): 461-8, 2011 Apr.

Article in English | MEDLINE | ID: mdl-21110998

ABSTRACT

Toll/interleukin-1 receptor (TIR) domain-containing proteins play important roles in defense against pathogens in both animals and plants, connecting the immunity signaling pathways via a chain of specific protein-protein interactions. Among them is SARM, the only TIR domain-containing adaptor that can negatively regulate TLR signaling. By extensive phylogenetic analysis, we show here that SARM is closely related to bacterial proteins with TIR domains, suggesting that this family has a different evolutionary history from other animal TIR-containing adaptors, possibly emerging via a lateral gene transfer from bacteria to animals. We also show evidence of several similar, independent transfer events, none of which, however, survived in vertebrates. An evolutionary relationship between the animal SARM adaptor and bacterial proteins with TIR domains illustrates the possible role that bacterial TIR-containing proteins play in regulating eukaryotic immune responses and how this mechanism was possibly adapted by the eukaryotes themselves.

Subject(s)

Cytoskeletal Proteins/genetics , Cytoskeletal Proteins/immunology , Evolution, Molecular , Animals , Cytoskeletal Proteins/chemistry , Gene Transfer, Horizontal , Host-Pathogen Interactions , Humans , Metagenomics , Models, Molecular , Phylogeny , Protein Structure, Tertiary , Receptors, Interleukin-1/chemistry , Receptors, Interleukin-1/genetics , Receptors, Interleukin-1/immunology , Signal Transduction

20.

Activation and specificity of human caspase-10.

Wachmann, Katherine; Pop, Cristina; van Raam, Bram J; Drag, Marcin; Mace, Peter D; Snipas, Scott J; Zmasek, Christian; Schwarzenbacher, Robert; Salvesen, Guy S; Riedl, Stefan J.

Biochemistry ; 49(38): 8307-15, 2010 Sep 28.

Article in English | MEDLINE | ID: mdl-20795673

ABSTRACT

Two apical caspases, caspase-8 and -10, are involved in the extrinsic death receptor pathway in humans, but it is mainly caspase-8 in its apoptotic and nonapoptotic functions that has been an intense research focus. In this study we concentrate on caspase-10, its mechanism of activation, and the role of the intersubunit cleavage. Our data obtained through in vitro dimerization assays strongly suggest that caspase-10 follows the proximity-induced dimerization model for apical caspases. Furthermore, we compare the specificity and activity of the wild-type protease with a mutant incapable of autoprocessing by using positional scanning substrate analysis and cleavage of natural protein substrates. These experiments reveal a striking difference between the wild type and the mutant, leading us to hypothesize that the single chain enzyme has restricted activity on most proteins but high activity on the proapoptotic protein Bid, potentially supporting a prodeath role for both cleaved and uncleaved caspase-10.

Subject(s)

Caspase 10/biosynthesis , BH3 Interacting Domain Death Agonist Protein/metabolism , Dimerization , Endopeptidases/metabolism , Enzyme Activation , Humans , Substrate Specificity

