1.
BMC Med Inform Decis Mak ; 19(1): 32, 2019 02 14.
Article in English | MEDLINE | ID: mdl-30764825

ABSTRACT

BACKGROUND: Existing resources that assist the diagnosis of rare diseases are usually curated from the literature, which can be limited for clinical use. Substantial effort is often needed before a rare disease is even suspected and those resources are consulted. The primary goal of this study was to apply a data-driven approach to enrich existing rare disease resources by mining phenotype-disease associations from electronic medical records (EMRs). METHODS: We first applied association rule mining algorithms to EMRs to extract significant phenotype-disease associations and used them to enrich existing rare disease resources (Human Phenotype Ontology and Orphanet (HPO-Orphanet)). We generated phenotype-disease bipartite graphs for HPO-Orphanet, EMR, and the enriched knowledge base HPO-Orphanet+, and conducted a case study on Hodgkin lymphoma to compare differential diagnosis performance among these three graphs. RESULTS: Using disease-disease similarity generated by eRAM, an existing rare disease encyclopedia, as the gold standard, the three graphs achieved sensitivities of 0.17, 0.36, and 0.46 and specificities of 0.52, 0.47, and 0.51, respectively. We also compared the top 15 diseases generated by the HPO-Orphanet+ graph with those from eRAM and another clinical diagnostic tool, the Phenomizer. CONCLUSIONS: Per our evaluation results, our approach was able to enrich existing rare disease knowledge resources with phenotype-disease associations from EMRs and thus support rare disease differential diagnosis.
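
The abstract does not say which association rule mining algorithm or thresholds were used. As a rough illustration only, the sketch below computes support, confidence, and lift for single phenotype-to-disease rules. The toy records, the function name `pair_rules`, and the threshold values are all assumptions, not taken from the study.

```python
from collections import Counter
from itertools import product

# Hypothetical toy records: (phenotypes, diagnoses) per patient. Not real data.
records = [
    ({"fever", "lymphadenopathy"}, {"hodgkin_lymphoma"}),
    ({"fever", "night_sweats"}, {"hodgkin_lymphoma"}),
    ({"fever"}, {"influenza"}),
    ({"cough"}, {"influenza"}),
]


def pair_rules(records, min_support=0.25, min_confidence=0.5):
    """Return {(phenotype, disease): (support, confidence, lift)} for rules
    phenotype -> disease that pass both (assumed) thresholds."""
    n = len(records)
    pheno_count, disease_count, pair_count = Counter(), Counter(), Counter()
    for phenotypes, diseases in records:
        pheno_count.update(phenotypes)
        disease_count.update(diseases)
        pair_count.update(product(phenotypes, diseases))

    rules = {}
    for (p, d), c in pair_count.items():
        support = c / n                      # P(phenotype and disease)
        confidence = c / pheno_count[p]      # P(disease | phenotype)
        lift = confidence / (disease_count[d] / n)
        if support >= min_support and confidence >= min_confidence:
            rules[(p, d)] = (support, confidence, lift)
    return rules


rules = pair_rules(records)
for (p, d), (s, c, l) in sorted(rules.items()):
    print(f"{p} -> {d}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

Each surviving rule could then become an edge in a phenotype-disease bipartite graph, which is one plausible way to read the enrichment step described above.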


Subject(s)
Algorithms , Data Mining , Electronic Health Records , Knowledge Bases , Rare Diseases , Humans , Phenotype , Rare Diseases/diagnosis
2.
AMIA Annu Symp Proc ; 2018: 574-583, 2018.
Article in English | MEDLINE | ID: mdl-30815098

ABSTRACT

Manually annotated clinical corpora are commonly used as the gold standards for the training and evaluation of clinical natural language processing (NLP) tools. The creation of these manually annotated corpora, however, is both costly and time-consuming. There is an emerging need in the clinical NLP community to reuse existing annotation corpora across different clinical NLP tasks. The objective of this study is to design, develop, and evaluate a framework and accompanying tools to support the standardization and integration of annotation corpora using the HL7 Fast Healthcare Interoperability Resources (FHIR) specification. The framework contains two main modules: 1) an automatic schema transformation module, in which the annotation schema of each corpus is automatically transformed into the FHIR-based schema; 2) an expert-based verification and annotation module, in which existing annotations can be verified and new annotations can be added for new elements defined in FHIR. We evaluated the framework using various annotation corpora created as part of different clinical NLP projects at the Mayo Clinic. We demonstrated that it is feasible to leverage FHIR as a standard data model for standardizing heterogeneous annotation corpora for their reuse and integration in advanced clinical NLP research and practice.
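
The abstract gives no detail on the transformation rules, so the following is only a minimal sketch of what a schema transformation step might look like. The input annotation layout (`text`, `system`, `code` keys), the function name `to_fhir_condition`, and the example code system URL are hypothetical. Only the output field names (`resourceType`, `subject.reference`, `code.text`, `code.coding`) follow the FHIR Condition resource.

```python
def to_fhir_condition(annotation, patient_id):
    """Map one corpus annotation (hypothetical layout) to a minimal
    FHIR-style Condition represented as a plain dict."""
    coding = {"display": annotation["text"]}
    # Attach a coded concept only when the source corpus supplies one.
    if annotation.get("system") and annotation.get("code"):
        coding["system"] = annotation["system"]
        coding["code"] = annotation["code"]
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {"text": annotation["text"], "coding": [coding]},
    }


# Hypothetical annotation; the system URL and code are placeholders.
example = {
    "text": "Hodgkin lymphoma",
    "system": "http://example.org/codes",
    "code": "X-0001",
}
condition = to_fhir_condition(example, patient_id="123")
print(condition)
```

Elements that FHIR defines but the source corpus lacks would be left empty here, which is where an expert verification and annotation step like the one described above would come in.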


Subject(s)
Electronic Health Records/standards , Health Information Interoperability/standards , Health Level Seven , Natural Language Processing , Feasibility Studies , Humans
3.
PLoS One ; 13(1): e0191568, 2018.
Article in English | MEDLINE | ID: mdl-29373609

ABSTRACT

Recent scientific advances have accumulated a tremendous amount of biomedical knowledge, providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, because the data volume is large and the associations are complicated and noisy, interpreting such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework that aims to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their capabilities of detecting latent associations and reducing noise in large-volume data, respectively. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research.
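
To make the scale-free claim concrete, the sketch below computes the degree distribution of a small association network and fits a straight line to it on log-log axes. A clearly negative slope is consistent with a heavy-tailed, power-law-like distribution. This is a crude check under stated assumptions, not the authors' method: the edge list, node names, and function names are invented for illustration, and a least-squares fit on log-log data is only a rough heuristic.

```python
import math
from collections import Counter

# Hypothetical disease-gene edges with one hub gene. Not real associations.
edges = [
    ("disease_1", "gene_A"),
    ("disease_2", "gene_A"),
    ("disease_3", "gene_A"),
    ("disease_4", "gene_A"),
    ("disease_4", "gene_B"),
]


def degree_distribution(edges):
    """Return {degree k: fraction of nodes with degree k}, sorted by k."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    n = len(degree)
    return {k: c / n for k, c in sorted(counts.items())}


def loglog_slope(dist):
    """Least-squares slope of log P(k) against log k (needs >= 2 degrees)."""
    xs = [math.log(k) for k in dist]
    ys = [math.log(p) for p in dist.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


dist = degree_distribution(edges)
slope = loglog_slope(dist)
print(dist, slope)
```

On a real network with many nodes, a maximum-likelihood power-law fit would be a more defensible test than this regression.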


Subject(s)
Genetic Predisposition to Disease , PubMed , Data Mining , Humans