ABSTRACT
The apicomplexan intracellular parasite Toxoplasma gondii is a major foodborne pathogen that is highly prevalent in the global population. The majority of the T. gondii proteome remains uncharacterized and the organization of proteins into complexes is unclear. To overcome this knowledge gap, we used a biochemical fractionation strategy to predict interactions by correlation profiling. To overcome the deficit of high-quality training data in non-model organisms, we complemented a supervised machine learning strategy with an unsupervised approach based on similarity network fusion. The resulting combined high-confidence network, ToxoNet, comprises 2,063 interactions connecting 652 proteins. Clustering identifies 93 protein complexes. We identified clusters enriched in mitochondrial machinery that include previously uncharacterized proteins that likely represent novel adaptations to oxidative phosphorylation. Furthermore, complexes enriched in proteins localized to secretory organelles and the inner membrane complex predict additional components that represent novel targets for detailed functional characterization. We present ToxoNet as a publicly available resource with the expectation that it will help drive future hypotheses within the research community.
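As a rough illustration of the correlation profiling idea mentioned in this abstract, the sketch below scores protein pairs by the Pearson correlation of their elution profiles across fractions. The protein names, the counts, and the choice of Pearson correlation as the only score are assumptions made for illustration; the abstract does not specify which similarity measures or features were actually used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length elution profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-elution signal
    return cov / sqrt(var_x * var_y)

# Hypothetical spectral counts across six fractions (illustrative values only).
profiles = {
    "protA": [0, 2, 10, 3, 0, 0],
    "protB": [0, 3, 12, 4, 0, 0],
    "protC": [5, 1, 0, 0, 2, 8],
}

def coelution_scores(profiles):
    """Score every unordered protein pair by profile correlation."""
    names = sorted(profiles)
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

scores = coelution_scores(profiles)
```

Pairs that peak in the same fractions (here protA and protB) receive scores near 1, whereas proteins eluting in different fractions score near or below zero. In practice such scores would be one of several inputs to a classifier or a network fusion step rather than a final prediction.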
Subject(s)
Protein Interaction Maps , Protozoan Proteins , Toxoplasma , Toxoplasma/metabolism , Protozoan Proteins/metabolism , Protozoan Proteins/chemistry , Protein Interaction Maps/physiology , Computational Biology , Protein Interaction Mapping/methods , Proteome/metabolism , Databases, Protein , Machine Learning , Cluster Analysis
ABSTRACT
Macromolecular complexes are essential to conserved biological processes, but their prevalence across animals is unclear. By combining extensive biochemical fractionation with quantitative mass spectrometry, here we directly examined the composition of soluble multiprotein complexes among diverse metazoan models. Using an integrative approach, we generated a draft conservation map consisting of more than one million putative high-confidence co-complex interactions for species with fully sequenced genomes that encompasses functional modules present broadly across all extant animals. Clustering reveals a spectrum of conservation, ranging from ancient eukaryotic assemblies that have probably served cellular housekeeping roles for at least one billion years, ancestral complexes that have accrued contemporary components, and rarer metazoan innovations linked to multicellularity. We validated these projections by independent co-fractionation experiments in evolutionarily distant species, affinity purification and functional analyses. The comprehensiveness, centrality and modularity of these reconstructed interactomes reflect their fundamental mechanistic importance and adaptive value to animal cell systems.
Subject(s)
Evolution, Molecular , Multiprotein Complexes/chemistry , Multiprotein Complexes/metabolism , Protein Interaction Maps , Animals , Datasets as Topic , Humans , Protein Interaction Mapping , Reproducibility of Results , Systems Biology , Tandem Mass Spectrometry
ABSTRACT
Long-term memories are thought to depend upon the coordinated activation of a broad network of cortical and subcortical brain regions. However, the distributed nature of this representation has made it challenging to define the neural elements of the memory trace, and lesion and electrophysiological approaches provide only a narrow window into what is appreciated to be a much more global network. Here we used a global mapping approach to identify networks of brain regions activated following recall of long-term fear memories in mice. Analysis of Fos expression across 84 brain regions allowed us to identify regions that were co-active following memory recall. These analyses revealed that the functional organization of long-term fear memories depends on memory age and is altered in mutant mice that exhibit premature forgetting. Most importantly, these analyses indicate that long-term memory recall engages a network that has a distinct thalamic-hippocampal-cortical signature. This network is concurrently integrated and segregated and therefore has small-world properties, and contains hub-like regions in the prefrontal cortex and thalamus that may play privileged roles in memory expression.
Subject(s)
Brain/physiology , Fear , Memory , Nerve Net , Animals , Immunohistochemistry , Mice , Mice, Mutant Strains
ABSTRACT
As the interface between a microbe and its environment, the bacterial cell envelope has broad biological and clinical significance. While numerous biosynthesis genes and pathways have been identified and studied in isolation, how these intersect functionally to ensure envelope integrity during adaptive responses to environmental challenge remains unclear. To this end, we performed high-density synthetic genetic screens to generate quantitative functional association maps encompassing virtually the entire cell envelope biosynthetic machinery of Escherichia coli under both auxotrophic (rich medium) and prototrophic (minimal medium) culture conditions. The differential patterns of genetic interactions detected among > 235,000 digenic mutant combinations tested reveal unexpected condition-specific functional crosstalk and genetic backup mechanisms that ensure stress-resistant envelope assembly and maintenance. These networks also provide insights into the global systems connectivity and dynamic functional reorganization of a universal bacterial structure that is both broadly conserved among eubacteria (including pathogens) and an important target.
Subject(s)
Cell Membrane/genetics , Epistasis, Genetic/genetics , Escherichia coli/genetics , Escherichia coli/metabolism , Membrane Proteins/genetics , Microtubule-Associated Proteins/genetics , Culture Media , Drug Resistance/genetics , Escherichia coli/growth & development , Gene Expression Regulation, Bacterial , Gene-Environment Interaction , Membrane Proteins/metabolism , Metabolic Networks and Pathways/genetics , Microscopy, Electron , Microtubule-Associated Proteins/metabolism , Molecular Sequence Annotation , Oligonucleotide Array Sequence Analysis
ABSTRACT
Chromatin modification (CM) is a set of epigenetic processes that govern many aspects of DNA replication, transcription and repair. CM is carried out by groups of physically interacting proteins, and their disruption has been linked to a number of complex human diseases. CM remains largely unexplored, however, especially in higher eukaryotes such as human. Here we present the DAnCER resource, which integrates information on genes with CM function from five model organisms, including human. Currently integrated are gene functional annotations, Pfam domain architecture, protein interaction networks and associated human diseases. Additional supporting evidence includes orthology relationships across organisms, membership in protein complexes, and information on protein 3D structure. These data are available for 962 experimentally confirmed and manually curated CM genes and for over 5000 genes with predicted CM function on the basis of orthology and domain composition. DAnCER allows visual explorations of the integrated data and flexible query capabilities using a variety of data filters. In particular, disease information and functional annotations are mapped onto the protein interaction networks, enabling the user to formulate new hypotheses on the function and disease associations of a given gene based on those of its interaction partners. DAnCER is freely available at http://wodaklab.org/dancer/.
Subject(s)
Chromatin/metabolism , Databases, Genetic , Disease/genetics , Epigenomics , Animals , Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Humans , Mice , Molecular Sequence Annotation , Protein Conformation , Protein Interaction Mapping , Saccharomyces cerevisiae/genetics
ABSTRACT
BACKGROUND: Whole microbiome RNASeq (metatranscriptomics) has emerged as a powerful technology to functionally interrogate microbial communities. A key challenge is how best to process, analyze, and interpret these complex datasets. In a typical application, a single metatranscriptomic dataset may comprise from tens to hundreds of millions of sequence reads. These reads must first be processed and filtered for low quality and potential contaminants, before being annotated with taxonomic and functional labels and subsequently collated to generate global bacterial gene expression profiles. RESULTS: Here, we present MetaPro, a flexible, massively scalable metatranscriptomic data analysis pipeline that is cross-platform compatible through its implementation within a Docker framework. MetaPro starts with raw sequence read input (single-end or paired-end reads) and processes them through a tiered series of filtering, assembly, and annotation steps. In addition to yielding a final list of bacterial genes and their relative expression, MetaPro delivers a taxonomic breakdown based on the consensus of complementary prediction algorithms, together with a focused breakdown of enzymes, readily visualized through the Cytoscape network visualization tool. We benchmark the performance of MetaPro against two current state-of-the-art pipelines and demonstrate improved performance and functionality. CONCLUSIONS: MetaPro represents an effective integrated solution for the processing and analysis of metatranscriptomic datasets. Its modular architecture allows new algorithms to be deployed as they are developed, ensuring its longevity. To aid user uptake of the pipeline, MetaPro, together with an established tutorial that has been developed for educational purposes, is made freely available at https://github.com/ParkinsonLab/MetaPro . The software is freely available under the GNU general public license v3. Video Abstract.
Subject(s)
Microbiota , Microbiota/genetics , Software , Algorithms , Bacteria/genetics , Genes, Bacterial
ABSTRACT
Advances in high-throughput 'omic technologies are starting to provide unprecedented insights into how components of biological systems are organized and interact. Key to exploiting these datasets is the definition of the components that comprise the system of interest. Although a variety of knowledge bases exist that capture such information, a major challenge is determining how these resources may be best utilized. Here we present a systematic curation strategy to define a systems-level view of the human extracellular matrix (ECM)--a three-dimensional meshwork of proteins and polysaccharides that impart structure and mechanical stability to tissues. Employing our curation strategy we define a set of 357 proteins that represent core components of the ECM, together with an additional 524 genes that mediate related functional roles, and construct a map of their physical interactions. Topological properties help identify modules of functionally related proteins, including those involved in cell adhesion, bone formation and blood clotting. Because of its major role in cell adhesion, proliferation and morphogenesis, defects in the ECM have been implicated in cancer, atherosclerosis, asthma, fibrosis, and arthritis. We use MeSH annotations to identify modules enriched for specific disease terms, which help to strengthen existing gene-disease associations as well as predict novel ones. Mapping expression and conservation data onto the network reveals modules evolved in parallel to convey tissue-specific functionality on otherwise broadly expressed units. In addition to demonstrating an effective workflow for defining biological systems, this study crystallizes our current knowledge surrounding the organization of the ECM.
Subject(s)
Extracellular Matrix Proteins/chemistry , Extracellular Matrix Proteins/metabolism , Protein Interaction Mapping/methods , Protein Interaction Maps , Systems Biology/methods , Cluster Analysis , Gene Expression Profiling , Humans
ABSTRACT
SUMMARY: With increasing numbers of eukaryotic genome sequences, phylogenetic profiles of eukaryotic genes are becoming increasingly informative. Here, we introduce a new web tool, PhyloPro (http://compsysbio.org/phylopro/), which uses the 120 available eukaryotic genome sequences to visualize the evolutionary trajectories of user-defined subsets of model organism genes. Applied to pathways or complexes, PhyloPro allows the user to rapidly identify core conserved elements of biological processes together with those that may represent lineage-specific innovations. PhyloPro thus provides a valuable resource for the evolutionary and comparative studies of biological systems.
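To make the notion of a phylogenetic profile concrete, the following minimal sketch represents each gene as a presence/absence vector across species and separates core conserved genes from those restricted to one lineage. The species list, gene names, and profile values are hypothetical, and the two helper functions are illustrative only; they are not part of the PhyloPro tool.

```python
# Hypothetical presence (1) / absence (0) of orthologs across five species.
species = ["yeast", "worm", "fly", "mouse", "human"]
profiles = {
    "geneA": [1, 1, 1, 1, 1],
    "geneB": [0, 0, 0, 1, 1],
    "geneC": [1, 1, 1, 1, 1],
    "geneD": [0, 1, 1, 1, 1],
}

def core_genes(profiles):
    """Genes with an ortholog in every species (core conserved elements)."""
    return sorted(g for g, p in profiles.items() if all(p))

def restricted_to(profiles, species, clade):
    """Genes present in every clade member and absent everywhere else."""
    members = [s in clade for s in species]
    return sorted(
        g for g, p in profiles.items()
        if all(bool(present) == member for present, member in zip(p, members))
    )

core = core_genes(profiles)
mammal_specific = restricted_to(profiles, species, {"mouse", "human"})
```

Here geneA and geneC form the conserved core, while geneB appears only in the two mammals and is therefore a candidate lineage-specific innovation.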
Subject(s)
Computational Biology/methods , Genomics/methods , Internet , Phylogeny , Biological Evolution , Cluster Analysis , Eukaryota/classification , Eukaryota/genetics , Programming Languages , User-Computer Interface
ABSTRACT
During chronic infection, the single-celled parasite Toxoplasma gondii can migrate to the brain, where it has been associated with altered dopamine function and the capacity to modulate host behavior, increasing risk of neurocognitive disorders. Here we explore alterations in dopamine-related behavior in a new mouse model based on stimulant (cocaine)-induced hyperactivity. In combination with cocaine, infection resulted in heightened sensorimotor deficits and impairment in prepulse inhibition response, which are commonly disrupted in neuropsychiatric conditions. To identify molecular pathways in the brain affected by chronic T. gondii infection, we investigated patterns of gene expression. As expected, infection was associated with an enrichment of genes associated with general immune response pathways, which otherwise limits statistical power to identify more informative pathways. To overcome this limitation and focus on pathways of neurological relevance, we developed a novel context enrichment approach that relies on a customized ontology. Applying this approach, we identified genes that exhibited unexpected patterns of expression arising from the combination of cocaine exposure and infection. These include sets of genes which exhibited dampened response to cocaine in infected mice, suggesting a possible mechanism for some observed behaviors and a neuroprotective effect that may be advantageous to parasite persistence. This model offers a powerful new approach to dissect the molecular pathways by which T. gondii infection contributes to neurocognitive disorders.
Subject(s)
Cocaine , Toxoplasma , Animals , Brain/parasitology , Cocaine/metabolism , Dopamine , Gene Expression , Male , Mice
ABSTRACT
BACKGROUND: The emergence of antimicrobial resistance is a major threat to global health and has placed pressure on the livestock industry to eliminate the use of antibiotic growth promotants (AGPs) as feed additives. To mitigate their removal, efficacious alternatives are required. AGPs are thought to operate through modulating the gut microbiome to limit opportunities for colonization by pathogens, increase nutrient utilization, and reduce inflammation. However, little is known concerning the underlying mechanisms. Previous studies investigating the effects of AGPs on the poultry gut microbiome have largely focused on 16S rDNA surveys based on a single gastrointestinal (GI) site, diet, and/or timepoint, resulting in an inconsistent view of their impact on community composition. METHODS: In this study, we perform a systematic investigation of both the composition and function of the chicken gut microbiome, in response to AGPs. Birds were raised under two different diets and AGP treatments, and 16S rDNA surveys applied to six GI sites sampled at three key timepoints of the poultry life cycle. Functional investigations were performed through metatranscriptomics analyses and metabolomics. RESULTS: Our study reveals a more nuanced view of the impact of AGPs, dependent on age of bird, diet, and intestinal site sampled. Although AGPs have a limited impact on taxonomic abundances, they do appear to redefine influential taxa that may promote the exclusion of other taxa. Microbiome expression profiles further reveal a complex landscape in both the expression and taxonomic representation of multiple pathways including cell wall biogenesis, antimicrobial resistance, and several involved in energy, amino acid, and nucleotide metabolism. Many AGP-induced changes in metabolic enzyme expression likely serve to redirect metabolic flux with the potential to regulate bacterial growth or produce metabolites that impact the host. 
CONCLUSIONS: As alternative feed additives are developed to mimic the action of AGPs, our study highlights the need to ensure such alternatives result in functional changes that are consistent with site-, age-, and diet-associated taxa. The genes and pathways identified in this study are therefore expected to drive future studies, applying tools such as community-based metabolic modeling, focusing on the mechanistic impact of different dietary regimes on the microbiome. Consequently, the data generated in this study will be crucial for the development of next-generation feed additives targeting gut health and poultry production. Video Abstract.
Subject(s)
Gastrointestinal Microbiome , Animals , Anti-Bacterial Agents/pharmacology , Chickens , DNA, Ribosomal , Dietary Supplements , Gastrointestinal Microbiome/genetics
ABSTRACT
Model organisms such as yeast, fly, and worm have played a defining role in the study of many biological systems. A significant challenge remains in translating this information to humans. Of critical importance is the ability to differentiate those components where knowledge of function and interactions may be reliably inferred from those that represent lineage-specific innovations. To address this challenge, we use chromatin modification (CM) as a model system for exploring the evolutionary properties of its components in the context of their known functions and interactions. Collating previously identified components of CM from yeast, worm, fly, and human, we identified a "core" set of 50 CM genes displaying consistent orthologous relationships that likely retain their interactions and functions across taxa. In addition, we catalog many components that demonstrate lineage-specific expansions and losses, highlighting much duplication within vertebrates that may reflect an expanded repertoire of regulatory mechanisms. Placed in the context of a high-quality protein-protein interaction network, we find, contrary to existing views of evolutionary modularity, that CM complex components display a mosaic of evolutionary histories: a core set of highly conserved genes, together with sets displaying lineage-specific innovations. Although focused on CM, this study provides a template for differentiating those genes which are likely to retain their functions and interactions across species. As such, in addition to informing on the evolution of CM as a system, this study provides a set of comparative genomic approaches that can be generally applied to any biological system.
Subject(s)
Chromatin Assembly and Disassembly/genetics , Chromatin/genetics , Computational Biology/methods , Models, Genetic , Protein Interaction Mapping/methods , Animals , Caenorhabditis elegans , Cluster Analysis , Drosophila melanogaster , Eukaryota , Evolution, Molecular , Gene Regulatory Networks , Humans , Phylogeny , Saccharomyces cerevisiae
ABSTRACT
Target recognition by the ubiquitin system is mediated by E3 ubiquitin ligases. Nedd4 family members are E3 ligases comprised of a C2 domain, 2-4 WW domains that bind PY motifs (L/PPxY) and a ubiquitin ligase HECT domain. The nine Nedd4 family proteins in mammals include two close relatives: Nedd4 (Nedd4-1) and Nedd4L (Nedd4-2), but their global substrate recognition or differences in substrate specificity are unknown. We performed in vitro ubiquitylation and binding assays of human Nedd4-1 and Nedd4-2, and rat-Nedd4-1, using protein microarrays spotted with approximately 8200 human proteins. Top hits (substrates) for the ubiquitylation and binding assays mostly contain PY motifs. Although several substrates were recognized by both Nedd4-1 and Nedd4-2, others were specific to only one, with several Tyr kinases preferred by Nedd4-1 and some ion channels by Nedd4-2; this was subsequently validated in vivo. Accordingly, Nedd4-1 knockdown or knockout in cells led to sustained signalling via some of its substrate Tyr kinases (e.g. FGFR), suggesting Nedd4-1 suppresses their signalling. These results demonstrate the feasibility of identifying substrates and deciphering substrate specificity of mammalian E3 ligases.
Subject(s)
Endosomal Sorting Complexes Required for Transport/metabolism , Substrate Specificity , Ubiquitin-Protein Ligases/metabolism , Humans , Nedd4 Ubiquitin Protein Ligases , Protein Array Analysis , Protein Binding , Proteome
ABSTRACT
Escherichia coli serves as an excellent model for the study of fundamental cellular processes such as metabolism, signalling and gene expression. Understanding the function and organization of proteins within these processes is an important step towards a 'systems' view of E. coli. Integrating experimental and computational interaction data, we present a reliable network of 3,989 functional interactions between 1,941 E. coli proteins (approximately 45% of its proteome). These were combined with a recently generated set of 3,888 high-quality physical interactions between 918 proteins and clustered to reveal 316 discrete modules. In addition to known protein complexes (e.g., RNA and DNA polymerases), we identified modules that represent biochemical pathways (e.g., nitrate regulation and cell wall biosynthesis) as well as batteries of functionally and evolutionarily related processes. To aid the interpretation of modular relationships, several case examples are presented, including both well characterized and novel biochemical systems. Together these data provide a global view of the modular organization of the E. coli proteome and yield unique insights into structural and evolutionary relationships in bacterial networks.
Subject(s)
Escherichia coli Proteins/metabolism , Escherichia coli/metabolism , Models, Biological , Multienzyme Complexes/metabolism , Protein Interaction Mapping/methods , Signal Transduction/physiology , Computer Simulation
ABSTRACT
Understanding the metabolic activity of a microbial community, at both the level of the individual microbe and the whole microbiome, provides fundamental biological, biochemical, and clinical insights into the nature of the microbial community and interactions with their hosts in health and disease. Here, we discuss a method to examine the expression of metabolic pathways in microbial communities using metatranscriptomic next-generation sequencing data. The methodology described here encompasses enzyme function annotation, differential enzyme expression and pathway enrichment analyses, and visualization of metabolic networks with differential enzyme expression levels.
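Pathway enrichment analysis of the kind listed in this abstract is commonly carried out with a one-sided hypergeometric test. The sketch below is a minimal version under that assumption; the abstract does not state which test the method uses, and the function name and parameters are hypothetical.

```python
from math import comb

def hypergeom_pvalue(total, in_pathway, selected, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    total      -- number of annotated enzymes in the background
    in_pathway -- background enzymes that belong to the pathway
    selected   -- differentially expressed enzymes
    overlap    -- differentially expressed enzymes in the pathway
    """
    upper = min(in_pathway, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(in_pathway, k) * comb(total - in_pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total, selected)

# Hypothetical example: 10 enzymes, 5 in the pathway, 5 differentially
# expressed, all 5 of which fall in the pathway.
p_value = hypergeom_pvalue(total=10, in_pathway=5, selected=5, overlap=5)
```

A small p-value indicates that the pathway contains more differentially expressed enzymes than expected by chance. When many pathways are tested, the resulting p-values would normally be corrected for multiple testing.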
Subject(s)
High-Throughput Nucleotide Sequencing/methods , Metabolic Networks and Pathways , Microbiota , Transcriptome , Humans , Metagenomics
ABSTRACT
BACKGROUND: Intestinal microbiota are critical determinants of obesity and metabolic disease risk. In previous work, we showed that deletion of the cytoplasmic lipid droplet (CLD) protein perilipin-2 (Plin2) modulates gut microbial community structure and abrogates long-term deleterious effects of a high-fat (HF) diet in mice. However, the impact of Plin2 on microbiome function is unknown. RESULTS: Here, we used metatranscriptomics to identify differences in microbiome transcript expression in WT and Plin2-null mice following acute exposure to high-fat/low-carbohydrate (HF) or low-fat/high-carbohydrate (LF) diets. Consistent with previous studies, dietary changes resulted in significant taxonomic shifts. Unexpectedly, when fed a HF diet, the microbiota of Plin2-null and WT mice exhibited dramatic shifts in transcript expression despite no discernible shift in community structure. For Plin2-null mice, these changes included the coordinated upregulation of metabolic enzymes directing flux towards the production of growth metabolites such as fatty acids, nucleotides, and amino acids. In contrast, the LF diet did not appear to induce the same dramatic changes in transcript or pathway expression between the two genotypes. CONCLUSIONS: Our data show that a host genotype can modulate microbiome function without impacting community structure and identify Plin2 as a specific host determinant of diet effects on microbial function. Along with uncovering potential mechanisms for integrating how diet modulates host and microbial metabolism, our findings demonstrate the limits of 16S rRNA surveys to inform on community functional activities and the need to prioritize metatranscriptomic studies to gain more meaningful insights into microbiome function.
Subject(s)
Diet, High-Fat , Dietary Fats/metabolism , Gastrointestinal Microbiome , Microbiota/genetics , Perilipin-2/metabolism , Transcriptome , Animals , Fatty Acids/metabolism , Feces/microbiology , Metabolic Networks and Pathways/genetics , Mice , Mice, Inbred C57BL , Mice, Knockout , Perilipin-2/deficiency , Perilipin-2/genetics , RNA, Ribosomal, 16S/genetics
ABSTRACT
Mitochondrial protein (MP) dysfunction has been linked to neurodegenerative disorders (NDs); however, the discovery of the molecular mechanisms underlying NDs has been impeded by the limited characterization of interactions governing MP function. Here, using mass spectrometry (MS)-based analysis of 210 affinity-purified mitochondrial (mt) fractions isolated from 27 epitope-tagged human ND-linked MPs in HEK293 cells, we report a high-confidence MP network including 1,964 interactions among 772 proteins (>90% previously unreported). Nearly three-fourths of these interactions were confirmed in mouse brain and multiple human differentiated neuronal cell lines by primary antibody immunoprecipitation and MS, with many linked to NDs and autism. We show that the SOD1-PRDX5 interaction, critical for mt redox homeostasis, can be perturbed by amyotrophic lateral sclerosis-linked SOD1 allelic variants and establish a functional role for ND-linked factors coupled with IκBε in NF-κB activation. Our results identify mechanisms for ND-linked MPs and expand the human mt interaction landscape.
Subject(s)
Autistic Disorder/metabolism , Brain/physiology , NF-kappa B/metabolism , Neurodegenerative Diseases/metabolism , Neurons/physiology , Animals , HEK293 Cells , Humans , Mass Spectrometry , Mice , Mitochondria/metabolism , Mitochondrial Proteins/metabolism , Oxidation-Reduction , Protein Interaction Maps
ABSTRACT
BACKGROUND: Metatranscriptomics is emerging as a powerful technology for the functional characterization of complex microbial communities (microbiomes). Use of unbiased RNA-sequencing can reveal both the taxonomic composition and active biochemical functions of a complex microbial community. However, the lack of established reference genomes, computational tools and pipelines make analysis and interpretation of these datasets challenging. Systematic studies that compare data across microbiomes are needed to demonstrate the ability of such pipelines to deliver biologically meaningful insights on microbiome function. RESULTS: Here, we apply a standardized analytical pipeline to perform a comparative analysis of metatranscriptomic data from diverse microbial communities derived from mouse large intestine, cow rumen, kimchi culture, deep-sea thermal vent and permafrost. Sequence similarity searches allowed annotation of 19 to 76% of putative messenger RNA (mRNA) reads, with the highest frequency in the kimchi dataset due to its relatively low complexity and availability of closely related reference genomes. Metatranscriptomic datasets exhibited distinct taxonomic and functional signatures. From a metabolic perspective, we identified a common core of enzymes involved in amino acid, energy and nucleotide metabolism and also identified microbiome-specific pathways such as phosphonate metabolism (deep sea) and glycan degradation pathways (cow rumen). Integrating taxonomic and functional annotations within a novel visualization framework revealed the contribution of different taxa to metabolic pathways, allowing the identification of taxa that contribute unique functions. CONCLUSIONS: The application of a single, standard pipeline confirms that the rich taxonomic and functional diversity observed across microbiomes is not simply an artefact of different analysis pipelines but instead reflects distinct environmental influences. 
At the same time, our findings show how microbiome complexity and availability of reference genomes can impact comprehensive annotation of metatranscriptomes. Consequently, beyond the application of standardized pipelines, additional caution must be taken when interpreting their output and performing downstream, microbiome-specific, analyses. The pipeline used in these analyses along with a tutorial has been made freely available for download from our project website: http://www.compsysbio.org/microbiome .
Subject(s)
Metabolic Networks and Pathways/genetics , Metagenome , Microbiota/genetics , Phylogeny , RNA, Bacterial/genetics , RNA, Messenger/genetics , Animals , Brassica/microbiology , Cattle , Fermentation , Gene Ontology , Hydrothermal Vents/microbiology , Intestine, Large/microbiology , Mice , Molecular Sequence Annotation , Permafrost/microbiology , Raphanus/microbiology , Rumen/microbiology , Sequence Analysis, RNA
ABSTRACT
PhyloPro is a database and accompanying web-based application for the construction and exploration of phylogenetic profiles across the Eukarya. In this update article, we present six major new developments in PhyloPro: (i) integration of Pfam-A domain predictions for all proteins; (ii) new summary heatmaps and detailed level views of domain conservation; (iii) an interactive, network-based visualization tool for exploration of domain architectures and their conservation; (iv) ability to browse based on protein functional categories (GOSlim); (v) improvements to the web interface to enhance drill down capability from the heatmap view; and (vi) improved coverage including 164 eukaryotes and 12 reference species. In addition, we provide improved support for downloading data and images in a variety of formats. Among the existing tools available for phylogenetic profiles, PhyloPro provides several innovative domain-based features including a novel domain adjacency visualization tool. These are designed to allow the user to identify and compare proteins with similar domain architectures across species and thus develop hypotheses about the evolution of lineage-specific trajectories. Database URL: http://www.compsysbio.org/phylopro/.
Subject(s)
Conserved Sequence , Databases, Protein , Eukaryota/metabolism , Phylogeny , Protein Structure, Tertiary , Search Engine , Species Specificity
ABSTRACT
Our analysis examines the conservation of multiprotein complexes among metazoa through use of high-resolution biochemical fractionation and precision mass spectrometry applied to soluble cell extracts from 5 representative model organisms: Caenorhabditis elegans, Drosophila melanogaster, Mus musculus, Strongylocentrotus purpuratus, and Homo sapiens. The interaction network obtained from the data was validated globally in 4 distant species (Xenopus laevis, Nematostella vectensis, Dictyostelium discoideum, Saccharomyces cerevisiae) and locally by targeted affinity-purification experiments. Here we provide details of our massive set of supporting biochemical fractionation data, available via ProteomeXchange (PXD002319-PXD002328); PPIs, available via BioGRID (185267); and interaction network projections, available via http://metazoa.med.utoronto.ca, all made fully accessible to allow further exploration. The datasets here are related to the research article on metazoan macromolecular complexes in Nature [1].
ABSTRACT
Adherent-invasive Escherichia coli (AIEC) strains are detected more frequently within mucosal lesions of patients with Crohn's disease (CD). The AIEC phenotype consists of adherence and invasion of intestinal epithelial cells and survival within macrophages of these bacteria in vitro. Our aim was to identify candidate transcripts that distinguish AIEC from non-invasive E. coli (NIEC) strains and might be useful for rapid and accurate identification of AIEC by culture-independent technology. We performed comparative RNA-Sequence (RNASeq) analysis using AIEC strain LF82 and NIEC strain HS during exponential and stationary growth. Differential expression analysis of coding sequences (CDS) homologous to both strains demonstrated 224 and 241 genes with increased and decreased expression, respectively, in LF82 relative to HS. Transition metal transport and siderophore metabolism related pathway genes were up-regulated, while glycogen metabolic and oxidation-reduction related pathway genes were down-regulated, in LF82. Chemotaxis related transcripts were up-regulated in LF82 during the exponential phase, but flagellum-dependent motility pathway genes were down-regulated in LF82 during the stationary phase. CDS that mapped only to the LF82 genome accounted for 747 genes. We applied an in silico subtractive genomics approach to identify CDS specific to AIEC by incorporating the genomes of 10 other previously phenotyped NIEC. From this analysis, 166 CDS mapped to the LF82 genome and lacked homology to any of the 11 human NIEC strains. We compared these CDS across 13 AIEC, but none were homologous in each. Four LF82 gene loci belonging to clustered regularly interspaced short palindromic repeats region (CRISPR)--CRISPR-associated (Cas) genes were identified in 4 to 6 AIEC and absent from all non-pathogenic bacteria. As previously reported, AIEC strains were enriched for pdu operon genes. One CDS, encoding an excisionase, was shared by 9 AIEC strains. 
Reverse transcription quantitative polymerase chain reaction assays for 6 genes were conducted on fecal and ileal RNA samples from 22 patients with inflammatory bowel disease (IBD) and 32 patients without IBD (non-IBD). The expression of Cas loci was detected in a higher proportion of CD than non-IBD fecal and ileal RNA samples (p < 0.05). These results support a comparative genomic/transcriptomic approach towards identifying candidate AIEC signature transcripts.
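The in silico subtractive step described in this abstract can be pictured as a set difference: keep the coding sequences of the target strain that have no counterpart in any reference strain. The sketch below assumes homology has already been reduced to shared ortholog-group identifiers, which is a simplification; the identifiers, strain labels, and function name are hypothetical, and a real analysis would derive matches from sequence similarity searches.

```python
def subtract_genomes(target_cds, reference_genomes):
    """Return target CDS identifiers with no match in any reference genome."""
    # Pool every identifier observed in at least one reference strain.
    seen_elsewhere = set().union(*reference_genomes.values())
    return sorted(set(target_cds) - seen_elsewhere)

# Hypothetical ortholog-group identifiers for one target and two references.
target = ["og1", "og2", "og3", "og4"]
references = {
    "NIEC_1": {"og1", "og5"},
    "NIEC_2": {"og2"},
}

target_specific = subtract_genomes(target, references)
```

The remaining identifiers (og3 and og4 in this toy example) would then be checked across further strains of the phenotype of interest, as the abstract describes for the 13 AIEC genomes.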