1.
RNA Biol ; 17(7): 943-955, 2020 07.
Article in English | MEDLINE | ID: mdl-32122231

ABSTRACT

Noncoding RNAs (ncRNAs) play critical roles in many biological processes and have become a novel class of potential targets and biomarkers for disease diagnosis, therapy, and prognosis. Annotating and analysing ncRNA-disease association data are essential but challenging. Current computational resources lack comprehensive database platforms to consistently interpret and prioritize ncRNA-disease association data for biomedical investigation and application. Here, we present the ncRPheno database platform (http://lilab2.sysu.edu.cn/ncrpheno), which comprehensively integrates and annotates ncRNA-disease association data and provides novel searches, visualizations, and utilities for association identification and validation. ncRPheno contains 482,751 non-redundant associations between 14,494 ncRNAs and 3,210 disease phenotypes across 11 species with supporting evidence in the literature. A scoring model was refined to prioritize the associations based on evidential metrics. Moreover, ncRPheno provides user-friendly web interfaces, novel visualizations, and programmatic access to enable easy exploration, analysis, and utilization of the association data. A case study using ncRPheno demonstrated a comprehensive landscape of ncRNA dysregulation associated with 22 cancers and uncovered 821 common cancer-associated ncRNAs. As a unique database platform, ncRPheno outperforms existing similar databases in terms of data coverage and utilities, and it will assist studies in decoding ncRNAs associated with phenotypes ranging from genetic disorders to complex diseases.
ABBREVIATIONS: APIs: application programming interfaces; circRNA: circular RNA; ECO: Evidence & Conclusion Ontology; EFO: Experimental Factor Ontology; FDR: false discovery rate; GO: Gene Ontology; GWAS: genome-wide association studies; HPO: Human Phenotype Ontology; ICGC: International Cancer Genome Consortium; lncRNA: long noncoding RNA; miRNA: microRNA; ncRNA: noncoding RNA; NGS: next-generation sequencing; OMIM: Online Mendelian Inheritance in Man; piRNA: piwi-interacting RNA; snoRNA: small nucleolar RNA; TCGA: The Cancer Genome Atlas.


Subject(s)
Databases, Genetic , Genetic Predisposition to Disease , RNA, Untranslated/genetics , Algorithms , Gene Ontology , Genome-Wide Association Study , Humans , MicroRNAs , Models, Theoretical , Phenotype , RNA, Circular , RNA, Long Noncoding , User-Computer Interface , Web Browser
2.
J Mol Biol ; 433(11): 166727, 2021 05 28.
Article in English | MEDLINE | ID: mdl-33275967

ABSTRACT

While variants of noncoding RNAs (ncRNAs) have been experimentally validated as a new class of biomarkers and drug targets, the discovery and interpretation of relationships between ncRNA variants and human diseases have become both important and challenging. Here we present ncRNAVar (http://www.liwzlab.cn/ncrnavar/), the first database that provides association data between validated ncRNA variants and human diseases through manual curation of 2650 publications and computational annotation. ncRNAVar contains 4565 associations between 711 human disease phenotypes and 3112 variants from 2597 ncRNAs. Each association was reviewed by professional curators, supplemented with annotation and cross-references, and assigned an association score by our refined score model. ncRNAVar offers web applications including association prioritization, network visualization, and relationship mapping. By presenting a landscape of ncRNA variants in human diseases and serving as a resource for subsequent software development, ncRNAVar will improve our insight into the relationships between ncRNA variants and human health.


Subject(s)
Databases, Nucleic Acid , Disease/genetics , Genetic Predisposition to Disease , Genetic Variation , RNA, Untranslated/genetics , Humans , Internet , Phenotype
3.
Genomics Proteomics Bioinformatics ; 18(6): 760-772, 2020 12.
Article in English | MEDLINE | ID: mdl-33418085

ABSTRACT

Microbes play important roles in human health and disease. The interaction between microbes and hosts is a reciprocal relationship, which remains largely under-explored. Current computational resources lack manually and consistently curated data to connect metagenomic data to pathogenic microbes, microbial core genes, and disease phenotypes. We developed the MicroPhenoDB database by manually curating and consistently integrating microbe-disease association data. MicroPhenoDB provides 5677 non-redundant associations between 1781 microbes and 542 human disease phenotypes across more than 22 human body sites. MicroPhenoDB also provides 696,934 relationships between 27,277 unique clade-specific core genes and 685 microbes. Disease phenotypes are classified and described using the Experimental Factor Ontology (EFO). A refined score model was developed to prioritize the associations based on evidential metrics. The sequence search option in MicroPhenoDB enables rapid identification of existing pathogenic microbes in samples without running the usual metagenomic data processing and assembly. MicroPhenoDB offers data browsing, searching, and visualization through user-friendly web interfaces and web service application programming interfaces. MicroPhenoDB is the first database platform to detail the relationships between pathogenic microbes, core genes, and disease phenotypes. It will accelerate metagenomic data analysis and assist studies in decoding microbes related to human diseases. MicroPhenoDB is available through http://www.liwzlab.cn/microphenodb and http://lilab2.sysu.edu.cn/microphenodb.


Subject(s)
Metagenome , Metagenomics , Genes, Microbial , Humans , Phenotype , Software
4.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-31317184

ABSTRACT

Iterative homology search has been widely used to identify remotely related proteins. Our previous study found that query-seeded iterative sequence search can reduce homologous over-extension errors and greatly improve selectivity. However, iterative homology search remains challenging in protein functional prediction. More sensitive scoring models are needed to improve the predictive performance of alignment methods, and alignment annotation with better visualization has become imperative for interpreting results. Here we report an open-source application, PSISearch2D, that runs query-seeded iterative sequence search to detect remotely related proteins. PSISearch2D retrieves domain annotation from Pfam, UniProtKB, CDD and PROSITE for the resulting hits and displays combined domain and sequence alignments in novel visualizations. A newly defined scoring model, called C-value, re-orders hits by considering the combination of sequence and domain alignments. Benchmarking with the C-value indicates that PSISearch2D outperforms the original PSISearch2 tool in terms of both accuracy and specificity. PSISearch2D improves the characterization of unknown proteins in remote protein detection. Our evaluation tests show that PSISearch2D provided annotation for 77,695 of 139,503 unknown bacterial proteins and 140,751 of 352,757 unknown viral proteins in UniProtKB, about 2.3-fold and 1.8-fold more characterization than the original PSISearch2, respectively. Together with an auto-iteration mode for handling large-scale data and optional programs for global and local sequence alignments, PSISearch2D enhances remotely related protein search.
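
The annotation counts quoted in the abstract can be turned into coverage percentages with a short calculation. The sketch below is only an illustration of that arithmetic: the dictionary and function names are hypothetical, and the 2.3-fold and 1.8-fold comparisons cannot be reproduced here because the abstract does not give the PSISearch2 baseline counts.

```python
# Counts quoted in the abstract: (annotated, total) unknown UniProtKB proteins.
# The names below are illustrative and not part of PSISearch2D.
reported = {
    "bacteria": (77_695, 139_503),
    "virus": (140_751, 352_757),
}


def coverage(annotated: int, total: int) -> float:
    """Return the annotated share of proteins as a percentage."""
    return 100.0 * annotated / total


# Percentage of unknown proteins annotated, rounded to one decimal place.
coverage_pct = {
    group: round(coverage(annotated, total), 1)
    for group, (annotated, total) in reported.items()
}

for group, pct in coverage_pct.items():
    print(f"{group}: {pct}% of unknown proteins annotated")
```

Under these figures, a little over half of the unknown bacterial proteins and roughly two fifths of the unknown viral proteins received annotation.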


Subject(s)
Algorithms , Databases, Protein , Proteins , Sequence Alignment , Sequence Analysis, Protein , Models, Molecular , Proteins/chemistry , Proteins/genetics