1.

Sex-specific developmental gene expression atlas unveils dimorphic gene networks in C. elegans.

Haque, Rizwanul; Kurien, Sonu Peedikayil; Setty, Hagar; Salzberg, Yehuda; Stelzer, Gil; Litvak, Einav; Gingold, Hila; Rechavi, Oded; Oren-Suissa, Meital.

Nat Commun ; 15(1): 4273, 2024 May 20.

Article En | MEDLINE | ID: mdl-38769103

Sex-specific traits and behaviors emerge during development by the acquisition of unique properties in the nervous system of each sex. However, the genetic events responsible for introducing these sex-specific features remain poorly understood. In this study, we create a comprehensive gene expression atlas of pure populations of hermaphrodites and males of the nematode Caenorhabditis elegans across development. We discover numerous differentially expressed genes, including neuronal gene families like transcription factors, neuropeptides, and G protein-coupled receptors. We identify INS-39, an insulin-like peptide, as a prominent male-biased gene expressed specifically in ciliated sensory neurons. We show that INS-39 serves as an early-stage male marker, facilitating the effective isolation of males in high-throughput experiments. Through complex and sex-specific regulation, ins-39 plays pleiotropic sexually dimorphic roles in various behaviors, while also playing a shared, dimorphic role in early life stress. This study offers a comparative sexual and developmental gene expression database for C. elegans. Furthermore, it highlights conserved genes that may underlie the sexually dimorphic manifestation of different human diseases.

Caenorhabditis elegans Proteins , Caenorhabditis elegans , Gene Expression Regulation, Developmental , Gene Regulatory Networks , Sex Characteristics , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/growth & development , Caenorhabditis elegans/metabolism , Male , Female , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Neuropeptides/genetics , Neuropeptides/metabolism , Sensory Receptor Cells/metabolism , Transcription Factors/metabolism , Transcription Factors/genetics , Gene Expression Profiling

2.

Sex shapes cell-type-specific transcriptional signatures of stress exposure in the mouse hypothalamus.

Brivio, Elena; Kos, Aron; Ulivi, Alessandro Francesco; Karamihalev, Stoyo; Ressle, Andrea; Stoffel, Rainer; Hirsch, Dana; Stelzer, Gil; Schmidt, Mathias V; Lopez, Juan Pablo; Chen, Alon.

Cell Rep ; 42(8): 112874, 2023 08 29.

Article En | MEDLINE | ID: mdl-37516966

Stress-related psychiatric disorders and the stress system show prominent differences between males and females, as well as strongly divergent transcriptional changes. Despite several proposed mechanisms, we still lack the understanding of the molecular processes at play. Here, we explore the contribution of cell types to transcriptional sex dimorphism using single-cell RNA sequencing. We identify cell-type-specific signatures of acute restraint stress in the paraventricular nucleus of the hypothalamus, a central hub of the stress response, in male and female mice. Further, we show that a history of chronic mild stress alters these signatures in a sex-specific way, and we identify oligodendrocytes as a major target for these sex-specific effects. This dataset, which we provide as an online interactive app, offers the transcriptomes of thousands of individual cells as a molecular resource for an in-depth dissection of the interplay between cell types and sex on the mechanisms of the stress response.

Sex Characteristics , Stress, Psychological , Mice , Male , Female , Animals , Stress, Psychological/metabolism , Hypothalamus

3.

Inhibitors of eIF4G1-eIF1 uncover its regulatory role of ER/UPR stress-response genes independent of eIF2α-phosphorylation.

Sehrawat, Urmila; Haimov, Ora; Weiss, Benjamin; Tamarkin-Ben Harush, Ana; Ashkenazi, Shaked; Plotnikov, Alexander; Noiman, Tzahi; Leshkowitz, Dena; Stelzer, Gil; Dikstein, Rivka.

Proc Natl Acad Sci U S A ; 119(30): e2120339119, 2022 07 26.

Article En | MEDLINE | ID: mdl-35857873

During translation initiation, eIF4G1 dynamically interacts with eIF4E and eIF1. While the role of eIF4E-eIF4G1 is well established, the regulatory functions of eIF4G1-eIF1 are poorly understood. Here, we report the identification of the eIF4G1-eIF1 inhibitors i14G1-10 and i14G1-12. i14G1s directly bind eIF4G1 and inhibit translation in vitro and in the cell, and their effects on translation are dependent on eIF4G1 levels. Translatome analyses revealed that i14G1s mimic eIF1 and eIF4G1 perturbations on the stringency of start codon selection and the opposing roles of eIF1-eIF4G1 in scanning-dependent and scanning-independent short 5' untranslated region (UTR) translation. Remarkably, i14G1s activate ER/unfolded protein response (UPR) stress-response genes via enhanced ribosome loading, elevated 5'UTR translation at near-cognate AUGs, and unexpected concomitant up-regulation of coding-region translation. These effects are, at least in part, independent of eIF2α-phosphorylation. Interestingly, eIF4G1-eIF1 interaction itself is negatively regulated by ER stress and mTOR inhibition. Thus, i14G1s uncover an unknown mechanism of ER/UPR translational stress response and are valuable research tools and potential drugs against diseases exhibiting dysregulated translation.

Endoplasmic Reticulum Stress , Eukaryotic Initiation Factor-2 , Eukaryotic Initiation Factor-4G , Eukaryotic Initiation Factors , Neoplasm Proteins , Nerve Tissue Proteins , Unfolded Protein Response , Animals , Codon, Initiator , Endoplasmic Reticulum Stress/genetics , Eukaryotic Initiation Factor-2/metabolism , Eukaryotic Initiation Factor-4G/antagonists & inhibitors , Eukaryotic Initiation Factor-4G/metabolism , Eukaryotic Initiation Factors/antagonists & inhibitors , Eukaryotic Initiation Factors/metabolism , Humans , Mice , Neoplasm Proteins/antagonists & inhibitors , Neoplasm Proteins/metabolism , Nerve Tissue Proteins/antagonists & inhibitors , Nerve Tissue Proteins/metabolism , Phosphorylation , Protein Biosynthesis , Unfolded Protein Response/genetics

4.

Defining murine monocyte differentiation into colonic and ileal macrophages.

Gross-Vered, Mor; Trzebanski, Sébastien; Shemer, Anat; Bernshtein, Biana; Curato, Caterina; Stelzer, Gil; Salame, Tomer-Meir; David, Eyal; Boura-Halfon, Sigalit; Chappell-Maor, Louise; Leshkowitz, Dena; Jung, Steffen.

Elife ; 9, 2020 01 08.

Article En | MEDLINE | ID: mdl-31916932

Monocytes are circulating short-lived macrophage precursors that are recruited on demand from the blood to sites of inflammation and challenge. In steady state, classical monocytes give rise to vasculature-resident cells that patrol the luminal side of the endothelium. In addition, classical monocytes feed macrophage compartments of selected organs, including barrier tissues, such as the skin and intestine, as well as the heart. Monocyte differentiation under conditions of inflammation has been studied in considerable detail. In contrast, monocyte differentiation under non-inflammatory conditions remains less well understood. Here we took advantage of a combination of cell ablation and precursor engraftment to investigate the generation of gut macrophages from monocytes. Collectively, we identify factors associated with the gradual adaptation of monocytes to tissue residency. Moreover, comparison of monocyte differentiation into the colon and ileum-resident macrophages revealed the graduated acquisition of gut segment-specific gene expression signatures.

Cell Differentiation , Colon/physiology , Ileum/physiology , Macrophages/metabolism , Monocytes/cytology , Animals , Mice , Specific Pathogen-Free Organisms

5.

UTAP: User-friendly Transcriptome Analysis Pipeline.

Kohen, Refael; Barlev, Jonathan; Hornung, Gil; Stelzer, Gil; Feldmesser, Ester; Kogan, Kiril; Safran, Marilyn; Leshkowitz, Dena.

BMC Bioinformatics ; 20(1): 154, 2019 Mar 25.

Article En | MEDLINE | ID: mdl-30909881

BACKGROUND: RNA-Seq technology is routinely used to characterize the transcriptome, and to detect gene expression differences among cell types, genotypes and conditions. Advances in short-read sequencing instruments such as Illumina Next-Seq have yielded easy-to-operate machines, with high throughput, at a lower price per base. However, processing this data requires bioinformatics expertise to tailor and execute specific solutions for each type of library preparation. RESULTS: In order to enable fast and user-friendly data analysis, we developed an intuitive and scalable transcriptome pipeline that executes the full process, starting from cDNA sequences derived by RNA-Seq [Nat Rev Genet 10:57-63, 2009] and bulk MARS-Seq [Science 343:776-779, 2014] and ending with sets of differentially expressed genes. Output files are placed in structured folders, and results summaries are provided in rich and comprehensive reports, containing dozens of plots, tables and links. CONCLUSION: Our User-friendly Transcriptome Analysis Pipeline (UTAP) is an open source, web-based intuitive platform available to the biomedical research community, enabling researchers to efficiently and accurately analyse transcriptome sequence data.

Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Software

6.

Erratum for Sehrawat et al., "Cancer-Associated Eukaryotic Translation Initiation Factor 1A Mutants Impair Rps3 and Rps10 Binding and Enhance Scanning of Cell Cycle Genes".

Sehrawat, Urmila; Koning, Femke; Ashkenazi, Shaked; Stelzer, Gil; Leshkowitz, Dena; Dikstein, Rivka.

Mol Cell Biol ; 39(6), 2019 Mar 15.

Article En | MEDLINE | ID: mdl-30824664

7.

Cancer-Associated Eukaryotic Translation Initiation Factor 1A Mutants Impair Rps3 and Rps10 Binding and Enhance Scanning of Cell Cycle Genes.

Sehrawat, Urmila; Koning, Femke; Ashkenazi, Shaked; Stelzer, Gil; Leshkowitz, Dena; Dikstein, Rivka.

Mol Cell Biol ; 39(3), 2019 02 01.

Article En | MEDLINE | ID: mdl-30420357

Protein synthesis is linked to cell proliferation, and its deregulation contributes to cancer. Eukaryotic translation initiation factor 1A (eIF1A) plays a key role in scanning and AUG selection and differentially affects the translation of distinct mRNAs. Its unstructured N-terminal tail (NTT) is frequently mutated in several malignancies. Here we report that eIF1A is essential for cell proliferation and cell cycle progression. Ribosome profiling of eIF1A knockdown cells revealed a substantial enrichment of cell cycle mRNAs among the downregulated genes, which are predominantly characterized by a lengthy 5' untranslated region (UTR). Conversely, eIF1A depletion caused a broad stimulation of 5' UTR initiation at a near cognate AUG, unveiling a prominent role of eIF1A in suppressing 5' UTR translation. In addition, the AUG context-dependent autoregulation of eIF1 was disrupted by eIF1A depletion, suggesting their cooperation in AUG context discrimination and scanning. Importantly, cancer-associated eIF1A NTT mutants augmented the eIF1A positive effect on a long 5' UTR, while they hardly affected AUG selection. Mechanistically, these mutations diminished the eIF1A interaction with Rps3 and Rps10 implicated in scanning arrest. Our findings suggest that the reduced binding of eIF1A NTT mutants to the ribosome retains its open state and facilitates scanning of long 5' UTR-containing cell cycle genes.

Eukaryotic Initiation Factor-1/genetics , Eukaryotic Initiation Factor-1/metabolism , Ribosomal Proteins/metabolism , 5' Untranslated Regions , Animals , Cell Cycle Checkpoints/physiology , Cell Proliferation/physiology , Codon, Initiator , Fibroblasts , HEK293 Cells , Humans , Mice , Mouse Embryonic Stem Cells , Mutation , Neoplasms/genetics , Protein Binding , Protein Biosynthesis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomal Proteins/genetics , Ribosomes/metabolism

8.

Induction of CD4 T cell memory by local cellular collectivity.

Polonsky, Michal; Rimer, Jacob; Kern-Perets, Amos; Zaretsky, Irina; Miller, Stav; Bornstein, Chamutal; David, Eyal; Kopelman, Naama Meira; Stelzer, Gil; Porat, Ziv; Chain, Benjamin; Friedman, Nir.

Science ; 360(6394), 2018 06 15.

Article En | MEDLINE | ID: mdl-29903938

Cell differentiation is directed by signals driving progenitors into specialized cell types. This process can involve collective decision-making, when differentiating cells determine their lineage choice by interacting with each other. We used live-cell imaging in microwell arrays to study collective processes affecting differentiation of naïve CD4+ T cells into memory precursors. We found that differentiation of precursor memory T cells sharply increases above a threshold number of locally interacting cells. These homotypic interactions involve the cytokines interleukin-2 (IL-2) and IL-6, which affect memory differentiation orthogonal to their effect on proliferation and survival. Mathematical modeling suggests that the differentiation rate is continuously modulated by the instantaneous number of locally interacting cells. This cellular collectivity can prioritize allocation of immune memory to stronger responses.

CD4-Positive T-Lymphocytes/cytology , CD4-Positive T-Lymphocytes/immunology , Cell Differentiation/immunology , Immunologic Memory , Quorum Sensing/immunology , Animals , CD4 Lymphocyte Count , Cell Differentiation/genetics , Computer Simulation , Gene Expression , Interleukin-2/genetics , Interleukin-2/immunology , Interleukin-6/genetics , Interleukin-6/immunology , L-Selectin/genetics , L-Selectin/immunology , Mice , Mice, Inbred C57BL , Mice, Transgenic , Models, Immunological , Sequence Analysis, RNA , Signaling Lymphocytic Activation Molecule Family/immunology

9.

Elucidating tissue specific genes using the Benford distribution.

Karthik, Deepak; Stelzer, Gil; Gershanov, Sivan; Baranes, Danny; Salmon-Divon, Mali.

BMC Genomics ; 17: 595, 2016 08 09.

Article En | MEDLINE | ID: mdl-27506195

BACKGROUND: The RNA-seq technique is applied for the investigation of transcriptional behaviour. The reduction in sequencing costs has led to an unprecedented trove of gene expression data from diverse biological systems. Subsequently, principles from other disciplines such as the Benford law, which can be properly judged only in data-rich systems, can now be examined on this high-throughput transcriptomic information. The Benford law states that in many count-rich datasets the distribution of the first significant digit is not uniform but rather logarithmic. RESULTS: All tested digital gene expression datasets showed a Benford-like distribution when observing an entire gene set. This phenomenon was conserved in development and does not demonstrate tissue specificity. However, when obedience to the Benford law is calculated for individual expressed genes across thousands of cells, genes that best and least adhere to the Benford law are enriched with tissue specific or cell maintenance descriptors, respectively. Surprisingly, a positive correlation was found between the obedience a gene exhibits to the Benford law and its expression level, despite the former being calculated solely according to first digit frequency while totally ignoring the expression value itself. Nevertheless, genes with low expression that exhibit Benford behavior demonstrate tissue specific associations. These observations were extended to predict the likelihood of tissue specificity based on Benford behaviour in a supervised learning approach. CONCLUSIONS: These results demonstrate the applicability and potential predictability of the Benford law for gleaning biological insight from simple count data.

Gene Expression Profiling , Models, Statistical , Transcriptome , Computer Simulation , Databases, Genetic , Genes, Essential , High-Throughput Nucleotide Sequencing , Humans , Organ Specificity/genetics , Single-Cell Analysis
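
The first-digit law mentioned in the abstract above can be illustrated with a short sketch. The abstract does not state how "obedience" to the Benford law was scored, so the mean absolute deviation used here is only an assumed stand-in, and the function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected Benford frequency of each leading digit: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a positive integer count."""
    return int(str(n)[0])


def first_digit_freqs(counts):
    """Observed leading-digit frequencies; zero counts are skipped."""
    digits = [first_digit(c) for c in counts if c > 0]
    if not digits:
        raise ValueError("need at least one positive count")
    tally = Counter(digits)
    return {d: tally.get(d, 0) / len(digits) for d in range(1, 10)}


def mean_abs_deviation(observed, expected):
    """Assumed adherence score (not the paper's): lower means closer to Benford."""
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10)) / 9


if __name__ == "__main__":
    # Powers of two are a textbook Benford-like sequence, used here as toy data.
    toy_counts = [2 ** k for k in range(1000)]
    score = mean_abs_deviation(first_digit_freqs(toy_counts), benford_expected())
    print(f"mean absolute deviation from Benford: {score:.4f}")
```

With real data, `toy_counts` would be replaced by the read counts of one gene across cells, or of all genes in one sample.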

10.

VarElect: the phenotype-based variation prioritizer of the GeneCards Suite.

Stelzer, Gil; Plaschkes, Inbar; Oz-Levi, Danit; Alkelai, Anna; Olender, Tsviya; Zimmerman, Shahar; Twik, Michal; Belinky, Frida; Fishilevich, Simon; Nudel, Ron; Guan-Golan, Yaron; Warshawsky, David; Dahary, Dvir; Kohn, Asher; Mazor, Yaron; Kaplan, Sergey; Iny Stein, Tsippi; Baris, Hagit N; Rappaport, Noa; Safran, Marilyn; Lancet, Doron.

BMC Genomics ; 17 Suppl 2: 444, 2016 06 23.

Article En | MEDLINE | ID: mdl-27357693

BACKGROUND: Next generation sequencing (NGS) provides a key technology for deciphering the genetic underpinnings of human diseases. Typical NGS analyses of a patient depict tens of thousands of non-reference coding variants, but only one or very few are expected to be significant for the relevant disorder. In a filtering stage, one employs family segregation, rarity in the population, predicted protein impact and evolutionary conservation as a means for shortening the variation list. However, narrowing down further towards culprit disease genes usually entails laborious seeking of gene-phenotype relationships, consulting numerous separate databases. Thus, a major challenge is to transition from the few hundred shortlisted genes to the most viable disease-causing candidates. RESULTS: We describe a novel tool, VarElect ( http://ve.genecards.org ), a comprehensive phenotype-dependent variant/gene prioritizer, based on the widely-used GeneCards, which helps rapidly identify causal mutations with extensive evidence. The GeneCards suite offers an effective and speedy alternative, whereby >120 gene-centric automatically-mined data sources are jointly available for the task. VarElect cashes in on this wealth of information, as well as on GeneCards' powerful free-text Boolean search and scoring capabilities, proficiently matching variant-containing genes to submitted disease/symptom keywords. The tool also leverages the rich disease and pathway information of MalaCards, the human disease database, and PathCards, the unified pathway (SuperPaths) database, both within the GeneCards Suite. The VarElect algorithm infers direct as well as indirect links between genes and phenotypes, the latter benefitting from GeneCards' diverse gene-to-gene data links in GenesLikeMe. Finally, our tool offers an extensive gene-phenotype evidence portrayal ("MiniCards") and hyperlinks to the parent databases.
CONCLUSIONS: We demonstrate that VarElect compares favorably with several often-used NGS phenotyping tools, thus providing a robust facility for ranking genes, pointing out their likelihood to be related to a patient's disease. VarElect's capacity to automatically process numerous NGS cases, either in stand-alone format or in VCF-analyzer mode (TGex and VarAnnot), is indispensable for emerging clinical projects that involve thousands of whole exome/genome NGS analyses.

Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Algorithms , Data Mining , Databases, Genetic , Genome, Human , Humans , Phenotype

11.

The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses.

Stelzer, Gil; Rosen, Naomi; Plaschkes, Inbar; Zimmerman, Shahar; Twik, Michal; Fishilevich, Simon; Stein, Tsippi Iny; Nudel, Ron; Lieder, Iris; Mazor, Yaron; Kaplan, Sergey; Dahary, Dvir; Warshawsky, David; Guan-Golan, Yaron; Kohn, Asher; Rappaport, Noa; Safran, Marilyn; Lancet, Doron.

Curr Protoc Bioinformatics ; 54: 1.30.1-1.30.33, 2016 06 20.

Article En | MEDLINE | ID: mdl-27322403

GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc.

Data Mining/methods , Databases, Genetic , Genomics/methods , Sequence Analysis/methods , High-Throughput Nucleotide Sequencing , Humans , Phenotype , Proteome , Software/standards

12.

GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit.

OMICS ; 20(3): 139-51, 2016 Mar.

Article En | MEDLINE | ID: mdl-26983021

Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ ( geneanalytics.genecards.org ), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Sciences GeneCards suite, including GeneCards®, the human gene database; MalaCards, the human diseases database; and PathCards, the biological pathways database. Expression-based analysis in GeneAnalytics relies on LifeMap Discovery®, the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling an advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items.
Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics, and others yet to emerge on the postgenomics horizon.

Computational Biology/methods , Gene Regulatory Networks , Genome, Human , High-Throughput Nucleotide Sequencing/statistics & numerical data , Microarray Analysis/statistics & numerical data , Software , Algorithms , Data Mining , Databases, Factual , Databases, Genetic , Humans , Metabolic Networks and Pathways/genetics

13.

Molecular disease presentation in diabetic nephropathy.

Heinzel, Andreas; Mühlberger, Irmgard; Stelzer, Gil; Lancet, Doron; Oberbauer, Rainer; Martin, Maria; Perco, Paul.

Nephrol Dial Transplant ; 30 Suppl 4: iv17-25, 2015 Aug.

Article En | MEDLINE | ID: mdl-26209734

Diabetic nephropathy, as the most prevalent chronic disease of the kidney, has also become the primary cause of end-stage renal disease with the incidence of kidney disease in type 2 diabetics continuously rising. As with most chronic diseases, the pathophysiology is multifactorial with a number of deregulated molecular processes contributing to disease manifestation and progression. Current therapy mainly involves interfering in the renin-angiotensin-aldosterone system using angiotensin-converting enzyme inhibitors or angiotensin-receptor blockers. A better understanding of molecular processes deregulated in the early stages and progression of disease holds the key for development of novel therapeutics addressing this complex disease. With the advent of high-throughput omics technologies, researchers set out to systematically study the disease on a molecular level. Results of the first omics studies were mainly focused on reporting the highest deregulated molecules between diseased and healthy subjects with recent attempts to integrate findings of multiple studies on the level of molecular pathways and processes. In this review, we will outline key omics studies on the genome, transcriptome, proteome and metabolome level in the context of DN. We will also provide concepts on how to integrate findings of these individual studies (i) on the level of functional processes using the gene-ontology vocabulary, (ii) on the level of molecular pathways and (iii) on the level of phenotype molecular models constructed based on protein-protein interaction data.

Biomarkers/analysis , Diabetic Nephropathies/diagnosis , Chronic Disease , Diabetic Nephropathies/metabolism , Disease Progression , Humans

14.

PathCards: multi-source consolidation of human biological pathways.

Belinky, Frida; Nativ, Noam; Stelzer, Gil; Zimmerman, Shahar; Iny Stein, Tsippi; Safran, Marilyn; Lancet, Doron.

Database (Oxford) ; 2015, 2015.

Article En | MEDLINE | ID: mdl-25725062

The study of biological pathways is key to a large number of systems analyses. However, many relevant tools consider a limited number of pathway sources, missing out on many genes and gene-to-gene connections. Simply pooling several pathway sources would result in redundancy and the lack of systematic pathway interrelations. To address this, we exercised a combination of hierarchical clustering and nearest neighbor graph representation, with judiciously selected cutoff values, thereby consolidating 3215 human pathways from 12 sources into a set of 1073 SuperPaths. Our unification algorithm finds a balance between reducing redundancy and optimizing the level of pathway-related informativeness for individual genes. We show a substantial enhancement of the SuperPaths' capacity to infer gene-to-gene relationships when compared with individual pathway sources, separately or taken together. Further, we demonstrate that the chosen 12 sources entail nearly exhaustive gene coverage. The computed SuperPaths are presented in a new online database, PathCards, showing each SuperPath, its constituent network of pathways, and its contained genes. This provides researchers with a rich, searchable systems analysis resource. Database URL: http://pathcards.genecards.org/

Biosynthetic Pathways/physiology , Databases, Genetic , Epistasis, Genetic/physiology , Gene Regulatory Networks/physiology , Humans

15.

VennBLAST - whole transcriptome comparison and visualization tool.

Zahavi, Tamar; Stelzer, Gil; Strauss, Lior; Salmon, Asher Y; Salmon-Divon, Mali.

Genomics ; 105(3): 131-6, 2015 Mar.

Article En | MEDLINE | ID: mdl-25535680

RNA-seq is the method of choice for getting a primary list of genes for non-model organisms. Once this is achieved, one would proceed to annotate the newly discovered genes and consequently strive to position the organism in an evolutionary context. These kinds of studies involving high-throughput sequencing generate large amounts of data, whose analysis might be time consuming for the non-specialist user and merit computational skills. Here we describe VennBLAST, a set of high-performance utilities that combines fast parallelized BLAST filtering with a visualization tool for whole-transcriptomic alignment comparison using Venn diagrams. The software accurately illustrates simple set relationships between numbers of matching sequences and identifies transcriptome conservation among different organisms. The intuitive Venn diagram visualization allows researchers to easily select a desired subset of genes for further inspection, using the DAVID functional annotation tools, for instance, which enables investigators to understand biological meaning behind large lists of genes.

Gene Expression Profiling/methods , Sequence Alignment , Sequence Analysis, RNA , Software , Animals , Anthozoa/genetics , Genomics/methods , Transcriptome

16.

MalaCards: A Comprehensive Automatically-Mined Database of Human Diseases.

Rappaport, Noa; Twik, Michal; Nativ, Noam; Stelzer, Gil; Bahir, Iris; Stein, Tsippi Iny; Safran, Marilyn; Lancet, Doron.

Curr Protoc Bioinformatics ; 47: 1.24.1-19, 2014 Sep 08.

Article En | MEDLINE | ID: mdl-25199789

Systems medicine provides insights into mechanisms of human diseases, and expedites the development of better diagnostics and drugs. To facilitate such strategies, we initiated MalaCards, a compendium of human diseases and their annotations, integrating and often remodeling information from 64 data sources. MalaCards employs, among others, the proven automatic data-mining strategies established in the construction of GeneCards, our widely used compendium of human genes. The development of MalaCards poses many algorithmic challenges, such as disease name unification, integrated classification, gene-disease association, and disease-targeted expression analysis. MalaCards displays a Web card for each of >19,000 human diseases, with 17 sections, including textual summaries, related diseases, related genes, genetic variations and tests, and relevant publications. Also included are a powerful search engine and a variety of categorized disease lists. This unit describes two basic protocols to search and browse MalaCards effectively.

Automation , Data Mining , Database Management Systems , Disease , Humans , User-Computer Interface

17.

MalaCards: an integrated compendium for diseases and their annotation.

Rappaport, Noa; Nativ, Noam; Stelzer, Gil; Twik, Michal; Guan-Golan, Yaron; Stein, Tsippi Iny; Bahir, Iris; Belinky, Frida; Morrey, C Paul; Safran, Marilyn; Lancet, Doron.

Database (Oxford) ; 2013: bat018, 2013.

Article En | MEDLINE | ID: mdl-23584832

Comprehensive disease classification, integration and annotation are crucial for biomedical discovery. At present, disease compilation is incomplete, heterogeneous and often lacking systematic inquiry mechanisms. We introduce MalaCards, an integrated database of human maladies and their annotations, modeled on the architecture and strategy of the GeneCards database of human genes. MalaCards mines and merges 44 data sources to generate a computerized card for each of 16 919 human diseases. Each MalaCard contains disease-specific prioritized annotations, as well as inter-disease connections, empowered by the GeneCards relational database, its searches and GeneDecks set analyses. First, we generate a disease list from 15 ranked sources, using disease-name unification heuristics. Next, we use four schemes to populate MalaCards sections: (i) directly interrogating disease resources, to establish integrated disease names, synonyms, summaries, drugs/therapeutics, clinical features, genetic tests and anatomical context; (ii) searching GeneCards for related publications, and for associated genes with corresponding relevance scores; (iii) analyzing disease-associated gene sets in GeneDecks to yield affiliated pathways, phenotypes, compounds and GO terms, sorted by a composite relevance score and presented with GeneCards links; and (iv) searching within MalaCards itself, e.g. for additional related diseases and anatomical context. The latter forms the basis for the construction of a disease network, based on shared MalaCards annotations, embodying associations based on etiology, clinical features and clinical conditions. This broadly disposed network has a power-law degree distribution, suggesting that this might be an inherent property of such networks. Work in progress includes hierarchical malady classification, ontological mapping and disease set analyses, striving to make MalaCards an even more effective tool for biomedical research. 
Database URL: http://www.malacards.org/

Databases, Genetic , Disease/genetics , Molecular Sequence Annotation , Data Mining , Humans , Internet

18.

Non-redundant compendium of human ncRNA genes in GeneCards.

Belinky, Frida; Bahir, Iris; Stelzer, Gil; Zimmerman, Shahar; Rosen, Naomi; Nativ, Noam; Dalah, Irina; Iny Stein, Tsippi; Rappaport, Noa; Mituyama, Toutai; Safran, Marilyn; Lancet, Doron.

Bioinformatics ; 29(2): 255-61, 2013 Jan 15.

Article En | MEDLINE | ID: mdl-23172862

MOTIVATION: Non-coding RNA (ncRNA) genes are increasingly acknowledged for their importance in the human genome. However, there is no comprehensive non-redundant database for all such human genes. RESULTS: We leveraged the effective platform of GeneCards, the human gene compendium, together with the power of fRNAdb and additional primary sources, to judiciously unify all ncRNA gene entries obtainable from 15 different primary sources. Overlapping entries were clustered to unified locations based on an algorithm employing genomic coordinates. This allowed GeneCards' gamut of relevant entries to rise ~5-fold, resulting in ~80,000 human non-redundant ncRNAs, belonging to 14 classes. Such 'grand unification' within a regularly updated data structure will assist future ncRNA research. AVAILABILITY AND IMPLEMENTATION: All of these non-coding RNAs are included among the ~122,500 entries in GeneCards V3.09, along with pertinent annotation, automatically mined by its built-in pipeline from 100 data sources. This information is available at www.genecards.org. CONTACT: Frida.Belinky@weizmann.ac.il SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Databases, Genetic , RNA, Untranslated/genetics , Algorithms , Cluster Analysis , Genes , Genome, Human , Genomics , Humans , Internet , Molecular Sequence Annotation

19.

In-silico human genomics with GeneCards.

Stelzer, Gil; Dalah, Irina; Stein, Tsippi Iny; Satanower, Yigeal; Rosen, Naomi; Nativ, Noam; Oz-Levi, Danit; Olender, Tsviya; Belinky, Frida; Bahir, Iris; Krug, Hagit; Perco, Paul; Mayer, Bernd; Kolker, Eugene; Safran, Marilyn; Lancet, Doron.

Hum Genomics ; 5(6): 709-17, 2011 Oct.

Article En | MEDLINE | ID: mdl-22155609

Since 1998, the bioinformatics, systems biology, genomics and medical communities have enjoyed a synergistic relationship with the GeneCards database of human genes (http://www.genecards.org). This human gene compendium was created to help to introduce order into the increasing chaos of information flow. As a consequence of viewing details and deep links related to specific genes, users have often requested enhanced capabilities, such that, over time, GeneCards has blossomed into a suite of tools (including GeneDecks, GeneALaCart, GeneLoc, GeneNote and GeneAnnot) for a variety of analyses of both single human genes and sets thereof. In this paper, we focus on in-house and external research activities which have been enabled, enhanced, complemented and, in some cases, motivated by GeneCards. In turn, such interactions have often inspired and propelled improvements in GeneCards. We describe here the evolution and architecture of this project, including examples of synergistic applications in diverse areas such as synthetic lethality in cancer, the annotation of genetic variations in disease, omics integration in a systems biology approach to kidney disease, and bioinformatics tools.

Databases, Genetic , Genes/genetics , Genome, Human , Genomics , Computational Biology , Humans

20.

Mapping of molecular pathways, biomarkers and drug targets for diabetic nephropathy.

Fechete, Raul; Heinzel, Andreas; Perco, Paul; Mönks, Konrad; Söllner, Johannes; Stelzer, Gil; Eder, Susanne; Lancet, Doron; Oberbauer, Rainer; Mayer, Gert; Mayer, Bernd.

Proteomics Clin Appl ; 5(5-6): 354-66, 2011 Jun.

Article En | MEDLINE | ID: mdl-21491608

PURPOSE: For diseases with complex phenotype such as diabetic nephropathy (DN), integration of multiple Omics sources promises an improved description of the disease pathophysiology, being the basis for novel diagnostics and therapy, but equally important personalization aspects. EXPERIMENTAL DESIGN: Molecular features on DN were retrieved from public domain Omics studies and by mining scientific literature, patent text and clinical trial specifications. Molecular feature sets were consolidated on a human protein interaction network and interpreted on the level of molecular pathways in the light of the pathophysiology of the disease and its clinical context defined as associated biomarkers and drug targets. RESULTS: About 1000 gene symbols each could be assigned to the pathophysiological description of DN and to the clinical context. Direct feature comparison showed minor overlap, whereas on the level of molecular pathways, the complement and coagulation cascade, PPAR signaling, and the renin-angiotensin system linked the disease descriptor space with biomarkers and targets. CONCLUSION AND CLINICAL RELEVANCE: Only the combined molecular feature landscapes closely reflect the clinical implications of DN in the context of hypertension and diabetes. Omics data integration on the level of interaction networks furthermore provides a platform for identification of pathway-specific biomarkers and therapy options.

Computational Biology/methods , Diabetic Nephropathies/drug therapy , Diabetic Nephropathies/metabolism , Drug Delivery Systems , Biomarkers/metabolism , Case-Control Studies , Data Mining , Diabetic Nephropathies/diagnosis , Humans , Prognosis , Protein Interaction Mapping , Proteomics