1.
J Am Med Inform Assoc ; 16(3): 305-15, 2009.
Article in English | MEDLINE | ID: mdl-19261933

ABSTRACT

Many biomedical terminologies, classifications, and ontological resources such as the NCI Thesaurus (NCIT), International Classification of Diseases (ICD), Systematized Nomenclature of Medicine (SNOMED), Current Procedural Terminology (CPT), and Gene Ontology (GO) have been developed and used to build a variety of IT applications in biology, biomedicine, and health care settings. However, virtually all these resources involve incompatible formats, are based on different modeling languages, and lack appropriate tooling and application programming interfaces (APIs), which hinders their wide-scale adoption and usage in a variety of application contexts. The Lexical Grid (LexGrid) project introduced in this paper is an ongoing community-driven initiative, coordinated by the Mayo Clinic Division of Biomedical Statistics and Informatics, designed to bridge this gap using a common terminology model called the LexGrid model. The key aspect of the model is that it accommodates multiple vocabulary and ontology distribution formats and supports multiple data stores for federated vocabulary distribution. The model provides a foundation for building consistent and standardized APIs to access multiple vocabularies that support lexical search queries, hierarchy navigation, and a rich set of features such as recursive subsumption (e.g., get all the children of the concept penicillin). Existing LexGrid implementations include the LexBIG API as well as a reference implementation of the HL7 Common Terminology Services (CTS) specification providing programmatic access via Java, Web, and Grid services.


Subject(s)
Information Storage and Retrieval/methods , Information Systems/standards , Software , Vocabulary, Controlled , Information Storage and Retrieval/standards , Models, Theoretical , Systems Integration
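The recursive subsumption feature mentioned in the abstract above (collecting everything beneath a concept) can be illustrated with a plain graph traversal. This is a minimal sketch, not the LexGrid or LexBIG API: the function name `descendants`, the `CHILDREN` mapping, and the toy hierarchy are all hypothetical and chosen only for illustration.

```python
from collections import deque


def descendants(children, concept):
    """Return every concept reachable from `concept` by following child links.

    `children` maps a concept name to a list of its direct children
    (a hypothetical in-memory stand-in for a terminology store).
    """
    seen = set()
    queue = deque([concept])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:  # guards against revisiting shared children
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical toy hierarchy, invented for this example only.
CHILDREN = {
    "penicillin": ["penicillin G", "penicillin V"],
    "penicillin G": ["benzathine penicillin G"],
}

if __name__ == "__main__":
    print(sorted(descendants(CHILDREN, "penicillin")))
```

A breadth-first walk is used here only for simplicity; a real terminology service would more likely precompute or index the transitive closure.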
2.
J Am Med Inform Assoc ; 15(1): 25-8, 2008.
Article in English | MEDLINE | ID: mdl-17947622

ABSTRACT

This article describes our system entry for the 2006 I2B2 contest "Challenges in Natural Language Processing for Clinical Data" for the task of identifying the smoking status of patients. Our system makes the simplifying assumption that patient-level smoking status determination can be achieved by accurately classifying individual sentences from a patient's record. We created our system with reusable text analysis components built on the Unstructured Information Management Architecture and Weka. This reuse of code minimized the development effort related specifically to our smoking status classifier. We report precision, recall, F-score, and 95% exact confidence intervals for each metric. Recasting the classification task for the sentence level and reusing code from other text analysis projects allowed us to quickly build a classification system that performs with a system F-score of 92.64 based on held-out data tests and of 85.57 on the formal evaluation data. Our general medical natural language engine is easily adaptable to a real-world medical informatics application. Some of the limitations as applied to the use-case are negation detection and temporal resolution.


Subject(s)
Classification/methods , Medical Records Systems, Computerized , Natural Language Processing , Smoking , Databases, Factual , Humans
3.
J Am Med Inform Assoc ; 13(5): 516-25, 2006.
Article in English | MEDLINE | ID: mdl-16799125

ABSTRACT

OBJECTIVE: Human classification of diagnoses is a labor-intensive process that consumes significant resources. Most medical practices use specially trained medical coders to categorize diagnoses for billing and research purposes. METHODS: We have developed an automated coding system designed to assign codes to clinical diagnoses. The system uses the notion of certainty to recommend subsequent processing. Codes with the highest certainty are generated by matching the diagnostic text to frequent examples in a database of 22 million manually coded entries. These code assignments are not subject to subsequent manual review. Codes at a lower certainty level are assigned by matching to previously infrequently coded examples. The least certain codes are generated by a naïve Bayes classifier. The latter two types of codes are subsequently manually reviewed. MEASUREMENTS: Standard information retrieval accuracy measurements of precision, recall and f-measure were used. Micro- and macro-averaged results were computed. RESULTS: At least 48% of all EMR problem list entries at the Mayo Clinic can be automatically classified with macro-averaged 98.0% precision, 98.3% recall and an f-score of 98.2%. An additional 34% of the entries are classified with macro-averaged 90.1% precision, 95.6% recall and 93.1% f-score. The remaining 18% of the entries are classified with macro-averaged 58.5%. CONCLUSION: Over two thirds of all diagnoses are coded automatically with high accuracy. The system has been successfully implemented at the Mayo Clinic, which resulted in a reduction of staff engaged in manual coding from thirty-four coders to seven verifiers.


Subject(s)
Abstracting and Indexing/methods , Artificial Intelligence , Disease/classification , Forms and Records Control/methods , Natural Language Processing , Humans , International Classification of Diseases , Medical Records Systems, Computerized , Pilot Projects , User-Computer Interface
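The three certainty tiers described in the abstract above (match to frequent examples, match to infrequent examples, fall back to a classifier) can be sketched as a simple routing function. This is an illustration under stated assumptions, not the authors' system: the cutoff `FREQUENT_THRESHOLD`, the whitespace and case normalization, the tier labels, and the `fallback_classifier` callable (standing in for the naïve Bayes model) are all hypothetical.

```python
from collections import Counter

# Hypothetical cutoff separating "frequent" from "infrequent" examples;
# the abstract does not state the value actually used.
FREQUENT_THRESHOLD = 5


def normalize(text):
    """Lowercase and collapse whitespace (an assumed, minimal normalization)."""
    return " ".join(text.lower().split())


def build_index(coded_examples):
    """Map normalized diagnostic text to a Counter of manually assigned codes."""
    index = {}
    for text, code in coded_examples:
        index.setdefault(normalize(text), Counter())[code] += 1
    return index


def assign_code(text, index, fallback_classifier):
    """Return (code, certainty_tier, needs_manual_review)."""
    counts = index.get(normalize(text))
    if counts:
        code, n = counts.most_common(1)[0]
        if n >= FREQUENT_THRESHOLD:
            return code, "high", False   # frequent match: no manual review
        return code, "medium", True      # infrequent match: reviewed
    # No stored example: defer to the classifier and flag for review.
    return fallback_classifier(text), "low", True
```

For example, calling `assign_code("unseen text", index, model)` with any callable `model` returns that callable's prediction tagged as `"low"` certainty and flagged for review.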
4.
Biomed Inform Insights ; 8(Suppl 1): 13-22, 2016.
Article in English | MEDLINE | ID: mdl-27385912

ABSTRACT

The concept of optimizing health care by understanding and generating knowledge from previous evidence, i.e., the Learning Health-care System (LHS), has gained momentum and now has national prominence. Meanwhile, the rapid adoption of electronic health records (EHRs) enables the data collection required to form the basis for facilitating LHS. A prerequisite for using EHR data within the LHS is an infrastructure that enables access to EHR data longitudinally for health-care analytics and in real time for knowledge delivery. Additionally, significant clinical information is embedded in the free text, making natural language processing (NLP) an essential component in implementing an LHS. Herein, we share our institutional implementation of a big data-empowered clinical NLP infrastructure, which not only enables health-care analytics but also has real-time NLP processing capability. The infrastructure has been utilized for multiple institutional projects including the MayoExpertAdvisor, an individualized care recommendation solution for clinical care. We compared the advantages of big data over two other environments. Big data infrastructure significantly outperformed other infrastructure in terms of computing speed, demonstrating its value in making the LHS a possibility in the near future.

5.
Stud Health Technol Inform ; 107(Pt 1): 411-5, 2004.
Article in English | MEDLINE | ID: mdl-15360845

ABSTRACT

Classification of diagnoses (a.k.a. coding) is the central part of current concept-based medical IR systems. Some classification systems contain over 30,000 distinct codes, which makes classifying clinical documents a time-consuming, labor-intensive, and error-prone process. This paper presents a simple methodology for cleaning up and reusing existing manually coded diagnostic statements, mainly extracted from clinical notes, to build predictive models using a sparse-feature implementation of a Naïve Bayes classifier. One of the problems addressed is that diagnostic statements often contain several diagnoses and are assigned several codes, resulting in a multi-class classification problem. We investigate one possible way of addressing this problem by introducing compound (multiple code) categories. We present experimental results of classifying >16,000 randomly selected diagnostic strings into 19 top level categories. A small improvement (3%) with using compound categories over simple categories indicates that using multiple code categories is a promising solution, although clearly in need of further research and refinement.


Subject(s)
Abstracting and Indexing , Clinical Medicine/classification , Diagnosis , Forms and Records Control , Algorithms , Bayes Theorem , Humans
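The compound (multiple code) categories mentioned in the abstract above can be pictured as a label-construction step applied before training a classifier. The sketch below is one possible reading, not the paper's method: the `+` separator, the function names, and the way simple categories are produced (one training pair per code) are assumptions made for illustration.

```python
def compound_label(codes):
    """Join a statement's codes into one order-independent compound category."""
    return "+".join(sorted(set(codes)))


def to_training_pairs(statements, compound=True):
    """Turn (text, codes) records into (text, label) pairs for a classifier.

    With compound=True each statement yields a single pair whose label names
    all of its codes; with compound=False (the assumed "simple" scheme) the
    statement is repeated once per code.
    """
    pairs = []
    for text, codes in statements:
        if compound:
            pairs.append((text, compound_label(codes)))
        else:
            pairs.extend((text, code) for code in sorted(set(codes)))
    return pairs
```

Sorting the codes before joining means that the same set of codes always maps to the same compound category, regardless of the order in which the coder entered them.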
6.
J Ambul Care Manage ; 37(3): 206-10, 2014.
Article in English | MEDLINE | ID: mdl-24887521

ABSTRACT

The electronic medical record has evolved from a digital representation of individual patient results and documents to information of large scale and complexity. Big Data refers to new technologies providing management and processing capabilities, targeting massive and disparate data sets. For an individual patient, techniques such as Natural Language Processing allow the integration and analysis of textual reports with structured results. For groups of patients, Big Data offers the promise of large-scale analysis of outcomes, patterns, temporal trends, and correlations. The evolution of Big Data analytics moves us from description and reporting to forecasting, predictive modeling, and decision optimization.


Subject(s)
Decision Support Systems, Clinical/statistics & numerical data , Electronic Health Records/statistics & numerical data , Evidence-Based Medicine/statistics & numerical data , Natural Language Processing , Data Interpretation, Statistical , Decision Support Systems, Clinical/organization & administration , Electronic Health Records/organization & administration , Electronic Health Records/standards , Evidence-Based Medicine/methods , Forecasting/methods , Humans , Information Dissemination/methods
7.
AMIA Annu Symp Proc ; : 556-60, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18998955

ABSTRACT

The ability to model, share and re-use value sets across medical information systems is an important requirement. However, generating value sets semi-automatically from a terminology service is an unresolved issue, in part due to the lack of linkage to clinical context patterns that provide the constraints in defining a concept domain and invocation of value sets extraction. Towards this goal, we develop and evaluate an approach for context-driven automatic value sets extraction based on a formal terminology model. The crux of the technique is to identify and define the context patterns from various domains of discourse and leverage them for value set extraction using two complementary ideas based on (i) local terms provided by the Subject Matter Experts (extensional) and (ii) Semantic definition of the concepts in coding schemes (intensional). A prototype was implemented based on SNOMED CT rendered in the LexGrid terminology model and a preliminary evaluation is presented.


Subject(s)
Artificial Intelligence , Information Storage and Retrieval/methods , Medical Records Systems, Computerized , Natural Language Processing , Pattern Recognition, Automated/methods , Subject Headings , Algorithms , United States
8.
Proc IEEE Int Conf Semant Comput ; 2008: 460-467, 2008 Aug 12.
Article in English | MEDLINE | ID: mdl-21625412

ABSTRACT

The ability to model, share and re-use value sets across multiple medical information systems is an important requirement. However, generating value sets semi-automatically from a terminology service is still an unresolved issue, in part due to the lack of linkage to clinical context patterns that provide the constraints in defining a concept domain and invocation of value sets extraction. Towards this goal, we develop and evaluate an approach for context-driven automatic value sets extraction based on a formal terminology model. The crux of the technique is to identify and define the context patterns from various domains of discourse and leverage them for value set extraction using two complementary ideas based on (i) local terms provided by the subject matter experts (extensional) and (ii) semantic definition of the concepts in coding schemes (intensional). We develop algorithms based on well-studied graph traversal and ontology segmentation techniques for both approaches and implement a prototype demonstrating their applicability on use cases from SNOMED CT rendered in the LexGrid terminology model. We also present a preliminary evaluation of our approach and report the results of an investigation by subject matter experts at the Mayo Clinic.
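The contrast between extensional and intensional value sets drawn in the abstract above can be sketched as two small functions: one resolves an explicit list of local terms, the other expands a root concept by traversing its hierarchy. This is a loose illustration, not the published algorithms: the function names, the lookup table, and the concept identifiers (`C1` to `C4`) are hypothetical and are not real SNOMED CT codes.

```python
def extensional_value_set(local_terms, term_to_concept):
    """Resolve expert-supplied local terms to concept IDs by direct lookup.

    Terms with no entry in `term_to_concept` are silently skipped.
    """
    resolved = set()
    for term in local_terms:
        concept = term_to_concept.get(term.lower())
        if concept is not None:
            resolved.add(concept)
    return resolved


def intensional_value_set(root, children):
    """Expand a root concept to itself plus all of its descendants.

    `children` maps a concept ID to its direct children; a depth-first
    traversal stands in for the definition-based selection.
    """
    result = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in result:
            continue
        result.add(node)
        stack.extend(children.get(node, ()))
    return result
```

The two results can then be compared or merged, which is one way to read the abstract's description of the approaches as complementary.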

9.
AMIA Annu Symp Proc ; : 1050, 2006.
Article in English | MEDLINE | ID: mdl-17238668

ABSTRACT

We report efforts from a pilot annotation project in which we annotated sixty clinical notes with disorder mentions. Four retrieval experts annotated the same clinical notes so that four-way inter-annotator agreement could be calculated. We find the inter-annotator agreement results encouraging to scale up the project using the same guidelines and annotation schema.


Subject(s)
Natural Language Processing , Humans , Medical Records , Pilot Projects , Unified Medical Language System
10.
Proc AMIA Symp ; : 81-5, 2002.
Article in English | MEDLINE | ID: mdl-12463791

ABSTRACT

To understand if unmediated services could serve the data retrieval needs for the Mayo research investigator, a study was conducted to determine researcher interest, ability, and outcome of using a clinical data retrieval system. The results indicate about 25% of the research investigators would use a self-service retrieval tool. However, there is clear evidence a majority of the research investigators are satisfied with and prefer the mediated service because of convenience, retrieval specialist knowledge, and lack of time to perform the search themselves. Approximately 61% of the non-participants indicated they would be willing to pay a fee for continued use of the mediated service. This study confirms the interest in self-service retrieval tools, but the actual interest is lower than anticipated. The recommendation is to continue the use of mediated services and to offer self-service methods as needed, allowing the most options to the research investigator.


Subject(s)
Consumer Behavior/statistics & numerical data , Information Storage and Retrieval/methods , Medical Records Systems, Computerized , Biomedical Research , Clinical Medicine , Data Collection , Humans , Librarians