Search | Secretaria de Estado da Saúde

VarElect: the phenotype-based variation prioritizer of the GeneCards Suite.

Stelzer, Gil; Plaschkes, Inbar; Oz-Levi, Danit; Alkelai, Anna; Olender, Tsviya; Zimmerman, Shahar; Twik, Michal; Belinky, Frida; Fishilevich, Simon; Nudel, Ron; Guan-Golan, Yaron; Warshawsky, David; Dahary, Dvir; Kohn, Asher; Mazor, Yaron; Kaplan, Sergey; Iny Stein, Tsippi; Baris, Hagit N; Rappaport, Noa; Safran, Marilyn; Lancet, Doron.

BMC Genomics ; 17 Suppl 2: 444, 2016 06 23.

Article in English | MEDLINE | ID: mdl-27357693

ABSTRACT

BACKGROUND: Next generation sequencing (NGS) provides a key technology for deciphering the genetic underpinnings of human diseases. Typical NGS analyses of a patient depict tens of thousands of non-reference coding variants, but only one or very few are expected to be significant for the relevant disorder. In a filtering stage, one employs family segregation, rarity in the population, predicted protein impact and evolutionary conservation as a means for shortening the variation list. However, narrowing down further towards culprit disease genes usually entails laborious seeking of gene-phenotype relationships, consulting numerous separate databases. Thus, a major challenge is to transition from the few hundred shortlisted genes to the most viable disease-causing candidates.
RESULTS: We describe a novel tool, VarElect ( http://ve.genecards.org ), a comprehensive phenotype-dependent variant/gene prioritizer, based on the widely-used GeneCards, which helps rapidly identify causal mutations with extensive evidence. The GeneCards suite offers an effective and speedy alternative, whereby >120 gene-centric automatically-mined data sources are jointly available for the task. VarElect cashes in on this wealth of information, as well as on GeneCards' powerful free-text Boolean search and scoring capabilities, proficiently matching variant-containing genes to submitted disease/symptom keywords. The tool also leverages the rich disease and pathway information of MalaCards, the human disease database, and PathCards, the unified pathway (SuperPaths) database, both within the GeneCards Suite. The VarElect algorithm infers direct as well as indirect links between genes and phenotypes, the latter benefitting from GeneCards' diverse gene-to-gene data links in GenesLikeMe. Finally, our tool offers an extensive gene-phenotype evidence portrayal ("MiniCards") and hyperlinks to the parent databases.
CONCLUSIONS: We demonstrate that VarElect compares favorably with several often-used NGS phenotyping tools, thus providing a robust facility for ranking genes, pointing out their likelihood to be related to a patient's disease. VarElect's capacity to automatically process numerous NGS cases, either in stand-alone format or in VCF-analyzer mode (TGex and VarAnnot), is indispensable for emerging clinical projects that involve thousands of whole exome/genome NGS analyses.

Subjects

Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Algorithms , Data Mining , Databases, Genetic , Genome, Human , Humans , Phenotype

Expanding and Enriching the LncRNA Gene-Disease Landscape Using the GeneCaRNA Database.

Aggarwal, Shalini; Rosenblum, Chana; Gould, Marshall; Ziman, Shahar; Barshir, Ruth; Zelig, Ofer; Guan-Golan, Yaron; Iny-Stein, Tsippi; Safran, Marilyn; Pietrokovski, Shmuel; Lancet, Doron.

Biomedicines ; 12(6), 2024 Jun 12.

Article in English | MEDLINE | ID: mdl-38927512

ABSTRACT

The GeneCaRNA human gene database is a member of the GeneCards Suite. It presents ~280,000 human non-coding RNA genes, identified algorithmically from ~690,000 RNAcentral transcripts. This expands by ~tenfold the ncRNA gene count relative to other sources. GeneCaRNA thus contains ~120,000 long non-coding RNAs (LncRNAs, >200 bases long), including ~100,000 novel genes. The latter have sparse functional information, a vast terra incognita for future research. LncRNA genes are uniformly represented on all nuclear chromosomes, with 10 genes on mitochondrial DNA. Data obtained from MalaCards, another GeneCards Suite member, finds 1547 genes associated with 1 to 50 diseases. About 15% of the associations portray experimental evidence, with cancers tending to be multigenic. Preliminary text mining within GeneCaRNA discovers interactions of lncRNA transcripts with target gene products, with 25% being ncRNAs and 75% proteins. GeneCaRNA has a biological pathways section, which at present shows 131 pathways for 38 lncRNA genes, a basis for future expansion. Finally, our GeneHancer database provides regulatory elements for ~110,000 lncRNA genes, offering pointers for co-regulated genes and genetic linkages from enhancers to diseases. We anticipate that the broad vista provided by GeneCaRNA will serve as an essential guide for further lncRNA research in disease decipherment.

GeneCaRNA: A Comprehensive Gene-centric Database of Human Non-coding RNAs in the GeneCards Suite.

Barshir, Ruth; Fishilevich, Simon; Iny-Stein, Tsippi; Zelig, Ofer; Mazor, Yaron; Guan-Golan, Yaron; Safran, Marilyn; Lancet, Doron.

J Mol Biol ; 433(11): 166913, 2021 05 28.

Article in English | MEDLINE | ID: mdl-33676929

ABSTRACT

Non-coding RNA (ncRNA) genes assume increasing biological importance, with growing associations with diseases. Many ncRNA sources are transcript-centric, but for non-coding variant analysis and disease decipherment it is essential to transform this information into a comprehensive set of genome-mapped ncRNA genes. We present GeneCaRNA, a new all-inclusive gene-centric ncRNA database within the GeneCards Suite. GeneCaRNA information is integrated from four community-backed data structures: the major transcript database RNAcentral with its 20 encompassed databases, and the ncRNA entries of three major gene resources HGNC, Ensembl and NCBI Gene. GeneCaRNA presents 219,587 ncRNA gene pages, a 7-fold increase from those available in our three gene mining sources. Each ncRNA gene has wide-ranging annotation, mined from >100 worldwide sources, providing a powerful GeneCards-leveraged search. The latter empowers VarElect, our disease-gene interpretation tool, allowing one to systematically decipher ncRNA variants. The combined power of GeneCaRNA with GeneHancer, our regulatory elements database, facilitates wide-ranging scrutiny of the non-coding terra incognita of gene networks and whole genome analyses.

Subjects

Databases, Nucleic Acid , Genes , RNA, Untranslated/genetics , Software , Gene Regulatory Networks , Genetic Predisposition to Disease , Humans

The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses.

Stelzer, Gil; Rosen, Naomi; Plaschkes, Inbar; Zimmerman, Shahar; Twik, Michal; Fishilevich, Simon; Stein, Tsippi Iny; Nudel, Ron; Lieder, Iris; Mazor, Yaron; Kaplan, Sergey; Dahary, Dvir; Warshawsky, David; Guan-Golan, Yaron; Kohn, Asher; Rappaport, Noa; Safran, Marilyn; Lancet, Doron.

Curr Protoc Bioinformatics ; 54: 1.30.1-1.30.33, 2016 06 20.

Article in English | MEDLINE | ID: mdl-27322403

ABSTRACT

GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc.

Subjects

Data Mining/methods , Databases, Genetic , Genomics/methods , Sequence Analysis/methods , High-Throughput Nucleotide Sequencing , Humans , Phenotype , Proteome , Software/standards

GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit.

OMICS ; 20(3): 139-51, 2016 Mar.

Article in English | MEDLINE | ID: mdl-26983021

ABSTRACT

Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ ( geneanalytics.genecards.org ), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including GeneCards®, the human gene database; MalaCards, the human diseases database; and PathCards, the biological pathways database. Expression-based analysis in GeneAnalytics relies on LifeMap Discovery®, the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling an advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics, and others yet to emerge on the postgenomics horizon.

Subjects

Computational Biology/methods , Gene Regulatory Networks , Genome, Human , High-Throughput Nucleotide Sequencing/statistics & numerical data , Microarray Analysis/statistics & numerical data , Software , Algorithms , Data Mining , Databases, Factual , Databases, Genetic , Humans , Metabolic Networks and Pathways/genetics

MalaCards: an integrated compendium for diseases and their annotation.

Rappaport, Noa; Nativ, Noam; Stelzer, Gil; Twik, Michal; Guan-Golan, Yaron; Stein, Tsippi Iny; Bahir, Iris; Belinky, Frida; Morrey, C Paul; Safran, Marilyn; Lancet, Doron.

Database (Oxford) ; 2013: bat018, 2013.

Article in English | MEDLINE | ID: mdl-23584832

ABSTRACT

Comprehensive disease classification, integration and annotation are crucial for biomedical discovery. At present, disease compilation is incomplete, heterogeneous and often lacking systematic inquiry mechanisms. We introduce MalaCards, an integrated database of human maladies and their annotations, modeled on the architecture and strategy of the GeneCards database of human genes. MalaCards mines and merges 44 data sources to generate a computerized card for each of 16 919 human diseases. Each MalaCard contains disease-specific prioritized annotations, as well as inter-disease connections, empowered by the GeneCards relational database, its searches and GeneDecks set analyses. First, we generate a disease list from 15 ranked sources, using disease-name unification heuristics. Next, we use four schemes to populate MalaCards sections: (i) directly interrogating disease resources, to establish integrated disease names, synonyms, summaries, drugs/therapeutics, clinical features, genetic tests and anatomical context; (ii) searching GeneCards for related publications, and for associated genes with corresponding relevance scores; (iii) analyzing disease-associated gene sets in GeneDecks to yield affiliated pathways, phenotypes, compounds and GO terms, sorted by a composite relevance score and presented with GeneCards links; and (iv) searching within MalaCards itself, e.g. for additional related diseases and anatomical context. The latter forms the basis for the construction of a disease network, based on shared MalaCards annotations, embodying associations based on etiology, clinical features and clinical conditions. This broadly disposed network has a power-law degree distribution, suggesting that this might be an inherent property of such networks. Work in progress includes hierarchical malady classification, ontological mapping and disease set analyses, striving to make MalaCards an even more effective tool for biomedical research. 
Database URL: http://www.malacards.org/

Subjects

Databases, Genetic , Disease/genetics , Molecular Sequence Annotation , Data Mining , Humans , Internet
